91 datasets found

Z
Comprehensive Network Logs Dataset for Multi-Device Analysis
data.niaid.nih.gov
zenodo.org
Updated Jan 11, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Hasan, Raza (2024). Comprehensive Network Logs Dataset for Multi-Device Analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10492769
Explore at:
Dataset updated
Jan 11, 2024
Dataset provided by
Hasan, Raza
Salman, Mahmood
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset comprises diverse logs from various sources, including cloud services, routers, switches, virtualization, network security appliances, authentication systems, DNS, operating systems, packet captures, proxy servers, servers, syslog data, and network data. The logs encompass a wide range of information such as traffic details, user activities, authentication events, DNS queries, network flows, security actions, and system events. By analyzing these logs collectively, users can gain insights into network patterns, anomalies, user authentication, cloud service usage, DNS traffic, network flows, security incidents, and system activities. The dataset is invaluable for network monitoring, performance analysis, anomaly detection, security investigations, and correlating events across the entire network infrastructure.
Z
AIT Log Data Set V2.0
data.niaid.nih.gov
zenodo.org
Updated Jun 28, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Skopik, Florian (2024). AIT Log Data Set V2.0 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5789063
Explore at:
Dataset updated
Jun 28, 2024
Dataset provided by
Hotwagner, Wolfgang
Frank, Maximilian
Rauber, Andreas
Wurzenberger, Markus
Skopik, Florian
Landauer, Max
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
AIT Log Data Sets

This repository contains synthetic log data suitable for evaluation of intrusion detection systems, federated learning, and alert aggregation. A detailed description of the dataset is available in [1]. The logs were collected from eight testbeds that were built at the Austrian Institute of Technology (AIT) following the approach by [2]. Please cite these papers if the data is used for academic publications.

In brief, each of the datasets corresponds to a testbed representing a small enterprise network including mail server, file share, WordPress server, VPN, firewall, etc. Normal user behavior is simulated to generate background noise over a time span of 4-6 days. At some point, a sequence of attack steps is launched against the network. Log data is collected from all hosts and includes Apache access and error logs, authentication logs, DNS logs, VPN logs, audit logs, Suricata logs, network traffic packet captures, horde logs, exim logs, syslog, and system monitoring logs. Separate ground truth files are used to label events that are related to the attacks. Compared to the AIT-LDSv1.1, a more complex network and diverse user behavior is simulated, and logs are collected from all hosts in the network. If you are only interested in network traffic analysis, we also provide the AIT-NDS containing the labeled netflows of the testbed networks. We also provide the AIT-ADS, an alert data set derived by forensically applying open-source intrusion detection systems on the log data.

The datasets in this repository have the following structure:

The gather directory contains all logs collected from the testbed. Logs collected from each host are located in gather//logs/.

The labels directory contains the ground truth of the dataset that indicates which events are related to attacks. The directory mirrors the structure of the gather directory so that each label files is located at the same path and has the same name as the corresponding log file. Each line in the label files references the log event corresponding to an attack by the line number counted from the beginning of the file ("line"), the labels assigned to the line that state the respective attack step ("labels"), and the labeling rules that assigned the labels ("rules"). An example is provided below.

The processing directory contains the source code that was used to generate the labels.

The rules directory contains the labeling rules.

The environment directory contains the source code that was used to deploy the testbed and run the simulation using the Kyoushi Testbed Environment.

The dataset.yml file specifies the start and end time of the simulation.

The following table summarizes relevant properties of the datasets:

fox

Simulation time: 2022-01-15 00:00 - 2022-01-20 00:00

Attack time: 2022-01-18 11:59 - 2022-01-18 13:15

Scan volume: High

Unpacked size: 26 GB

harrison

Simulation time: 2022-02-04 00:00 - 2022-02-09 00:00

Attack time: 2022-02-08 07:07 - 2022-02-08 08:38

Scan volume: High

Unpacked size: 27 GB

russellmitchell

Simulation time: 2022-01-21 00:00 - 2022-01-25 00:00

Attack time: 2022-01-24 03:01 - 2022-01-24 04:39

Scan volume: Low

Unpacked size: 14 GB

santos

Simulation time: 2022-01-14 00:00 - 2022-01-18 00:00

Attack time: 2022-01-17 11:15 - 2022-01-17 11:59

Scan volume: Low

Unpacked size: 17 GB

shaw

Simulation time: 2022-01-25 00:00 - 2022-01-31 00:00

Attack time: 2022-01-29 14:37 - 2022-01-29 15:21

Scan volume: Low

Data exfiltration is not visible in DNS logs

Unpacked size: 27 GB

wardbeck

Simulation time: 2022-01-19 00:00 - 2022-01-24 00:00

Attack time: 2022-01-23 12:10 - 2022-01-23 12:56

Scan volume: Low

Unpacked size: 26 GB

wheeler

Simulation time: 2022-01-26 00:00 - 2022-01-31 00:00

Attack time: 2022-01-30 07:35 - 2022-01-30 17:53

Scan volume: High

No password cracking in attack chain

Unpacked size: 30 GB

wilson

Simulation time: 2022-02-03 00:00 - 2022-02-09 00:00

Attack time: 2022-02-07 10:57 - 2022-02-07 11:49

Scan volume: High

Unpacked size: 39 GB

The following attacks are launched in the network:

Scans (nmap, WPScan, dirb)

Webshell upload (CVE-2020-24186)

Password cracking (John the Ripper)

Privilege escalation

Remote command execution

Data exfiltration (DNSteal)

Note that attack parameters and their execution orders vary in each dataset. Labeled log files are trimmed to the simulation time to ensure that their labels (which reference the related event by the line number in the file) are not misleading. Other log files, however, also contain log events generated before or after the simulation time and may therefore be affected by testbed setup or data collection. It is therefore recommended to only consider logs with timestamps within the simulation time for analysis.

The structure of labels is explained using the audit logs from the intranet server in the russellmitchell data set as an example in the following. The first four labels in the labels/intranet_server/logs/audit/audit.log file are as follows:

{"line": 1860, "labels": ["attacker_change_user", "escalate"], "rules": {"attacker_change_user": ["attacker.escalate.audit.su.login"], "escalate": ["attacker.escalate.audit.su.login"]}}

{"line": 1861, "labels": ["attacker_change_user", "escalate"], "rules": {"attacker_change_user": ["attacker.escalate.audit.su.login"], "escalate": ["attacker.escalate.audit.su.login"]}}

{"line": 1862, "labels": ["attacker_change_user", "escalate"], "rules": {"attacker_change_user": ["attacker.escalate.audit.su.login"], "escalate": ["attacker.escalate.audit.su.login"]}}

{"line": 1863, "labels": ["attacker_change_user", "escalate"], "rules": {"attacker_change_user": ["attacker.escalate.audit.su.login"], "escalate": ["attacker.escalate.audit.su.login"]}}

Each JSON object in this file assigns a label to one specific log line in the corresponding log file located at gather/intranet_server/logs/audit/audit.log. The field "line" in the JSON objects specify the line number of the respective event in the original log file, while the field "labels" comprise the corresponding labels. For example, the lines in the sample above provide the information that lines 1860-1863 in the gather/intranet_server/logs/audit/audit.log file are labeled with "attacker_change_user" and "escalate" corresponding to the attack step where the attacker receives escalated privileges. Inspecting these lines shows that they indeed correspond to the user authenticating as root:

type=USER_AUTH msg=audit(1642999060.603:2226): pid=27950 uid=33 auid=4294967295 ses=4294967295 msg='op=PAM:authentication acct="jhall" exe="/bin/su" hostname=? addr=? terminal=/dev/pts/1 res=success'

type=USER_ACCT msg=audit(1642999060.603:2227): pid=27950 uid=33 auid=4294967295 ses=4294967295 msg='op=PAM:accounting acct="jhall" exe="/bin/su" hostname=? addr=? terminal=/dev/pts/1 res=success'

type=CRED_ACQ msg=audit(1642999060.615:2228): pid=27950 uid=33 auid=4294967295 ses=4294967295 msg='op=PAM:setcred acct="jhall" exe="/bin/su" hostname=? addr=? terminal=/dev/pts/1 res=success'

type=USER_START msg=audit(1642999060.627:2229): pid=27950 uid=33 auid=4294967295 ses=4294967295 msg='op=PAM:session_open acct="jhall" exe="/bin/su" hostname=? addr=? terminal=/dev/pts/1 res=success'

The same applies to all other labels for this log file and all other log files. There are no labels for logs generated by "normal" (i.e., non-attack) behavior; instead, all log events that have no corresponding JSON object in one of the files from the labels directory, such as the lines 1-1859 in the example above, can be considered to be labeled as "normal". This means that in order to figure out the labels for the log data it is necessary to store the line numbers when processing the original logs from the gather directory and see if these line numbers also appear in the corresponding file in the labels directory.

Beside the attack labels, a general overview of the exact times when specific attack steps are launched are available in gather/attacker_0/logs/attacks.log. An enumeration of all hosts and their IP addresses is stated in processing/config/servers.yml. Moreover, configurations of each host are provided in gather//configs/ and gather//facts.json.

Version history:

AIT-LDS-v1.x: Four datasets, logs from single host, fine-granular audit logs, mail/CMS.

AIT-LDS-v2.0: Eight datasets, logs from all hosts, system logs and network traffic, mail/CMS/cloud/web.

Acknowledgements: Partially funded by the FFG projects INDICAETING (868306) and DECEPT (873980), and the EU projects GUARD (833456) and PANDORA (SI2.835928).

If you use the dataset, please cite the following publications:

[1] M. Landauer, F. Skopik, M. Frank, W. Hotwagner, M. Wurzenberger, and A. Rauber. "Maintainable Log Datasets for Evaluation of Intrusion Detection Systems". IEEE Transactions on Dependable and Secure Computing, vol. 20, no. 4, pp. 3466-3482, doi: 10.1109/TDSC.2022.3201582. [PDF]

[2] M. Landauer, F. Skopik, M. Wurzenberger, W. Hotwagner and A. Rauber, "Have it Your Way: Generating Customized Log Datasets With a Model-Driven Simulation Testbed," in IEEE Transactions on Reliability, vol. 70, no. 1, pp. 402-415, March 2021, doi: 10.1109/TR.2020.3031317. [PDF]
Kyoushi Log Data Set
zenodo.org
data.niaid.nih.gov
zip
Updated Apr 24, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Max Landauer; Maximilian Frank; Florian Skopik; Wolfgang Hotwagner; Markus Wurzenberger; Andreas Rauber; Max Landauer; Maximilian Frank; Florian Skopik; Wolfgang Hotwagner; Markus Wurzenberger; Andreas Rauber (2025). Kyoushi Log Data Set [Dataset]. http://doi.org/10.5281/zenodo.5779411
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.5779411
Dataset updated
Apr 24, 2025
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Max Landauer; Maximilian Frank; Florian Skopik; Wolfgang Hotwagner; Markus Wurzenberger; Andreas Rauber; Max Landauer; Maximilian Frank; Florian Skopik; Wolfgang Hotwagner; Markus Wurzenberger; Andreas Rauber
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This repository contains synthetic log data suitable for evaluation of intrusion detection systems. The logs were collected from a testbed that was built at the Austrian Institute of Technology (AIT) following the approaches by [1], [2], and [3]. Please refer to these papers for more detailed information on the dataset and cite them if the data is used for academic publications. Other than the related AIT-LDSv1.1, this dataset involves a more complex network structure, makes use of a different attack scenario, and collects log data from multiple hosts in the network. In brief, the testbed simulates a small enterprise network including mail server, file share, WordPress server, VPN, firewall, etc. Normal user behavior is simulated to generate background noise. After some days, two attack scenarios are launched against the network. Note that the AIT-LDSv2.0 extends this dataset with additional attack cases and variations of attack parameters.

The archives have the following structure. The gather directory contains the raw log data from each host in the network, as well as their system configurations. The labels directory contains the ground truth for those log files that are labeled. The processing directory contains configurations for the labeling procedure and the rules directory contains the labeling rules. Labeling of events that are related to the attacks is carried out with the Kyoushi Labeling Framework.

Each dataset contains traces of a specific attack scenario:

Scenario 1 (see gather/attacker_0/logs/sm.log for detailed attack log):

nmap scan

WPScan

dirb scan

webshell upload through wpDiscuz exploit (CVE-2020-24186)

privilege escalation

Scenario 2 (see gather/attacker_0/logs/dnsteal.log for detailed attack log):

DNSteal data exfiltration

The log data collected from the servers includes

Apache access and error logs (labeled)

audit logs (labeled)

auth logs (labeled)

VPN logs (labeled)

DNS logs (labeled)

syslog

suricata logs

exim logs

horde logs

mail logs

Note that only log files from affected servers are labeled. Label files and the directories in which they are located have the same name as their corresponding log file in the gather directory. Labels are in JSON format and comprise the following attributes: line (number of line in corresponding log file), labels (list of labels assigned to that log line), rules (names of labeling rules matching that log line). Note that not all attack traces are labeled in all log files; please refer to the labeling rules in case that some labels are not clear.

Acknowledgements: Partially funded by the FFG projects INDICAETING (868306) and DECEPT (873980), and the EU project GUARD (833456).

If you use the dataset, please cite the following publications:

[1] M. Landauer, F. Skopik, M. Wurzenberger, W. Hotwagner and A. Rauber, "Have it Your Way: Generating Customized Log Datasets With a Model-Driven Simulation Testbed," in IEEE Transactions on Reliability, vol. 70, no. 1, pp. 402-415, March 2021, doi: 10.1109/TR.2020.3031317.

[2] M. Landauer, M. Frank, F. Skopik, W. Hotwagner, M. Wurzenberger, and A. Rauber, "A Framework for Automatic Labeling of Log Datasets from Model-driven Testbeds for HIDS Evaluation". ACM Workshop on Secure and Trustworthy Cyber-Physical Systems (ACM SaT-CPS 2022), April 27, 2022, Baltimore, MD, USA. ACM.

[3] M. Frank, "Quality improvement of labels for model-driven benchmark data generation for intrusion detection systems", Master's Thesis, Vienna University of Technology, 2021.
ORIGINAL-NETWORK-TRAFFIC-Thursday-01-03-2018-LOGS
kaggle.com
Updated Jun 14, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Karen Pamela López (2023). ORIGINAL-NETWORK-TRAFFIC-Thursday-01-03-2018-LOGS [Dataset]. https://www.kaggle.com/datasets/karenp/original-network-traffic-thursday-01-03-2018-logs
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jun 14, 2023
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Karen Pamela López
Description
This data set was originally downloaded from: https://www.unb.ca/cic/datasets/ids-2018.html

The data set has a weight of 466GB.

When the download is done, the file contains 2 folders: Processed Traffic Data for ML Algorithms and Original network traffic and log data.

The "Processed Traffic Data for ML Algorithms" folder contains 10 csv files with the following names:

Friday-02-03-2018_TrafficForML_CICFlowMeter

Friday-16-02-2018_TrafficForML_CICFlowMeter

Friday-23-02-2018_TrafficForML_CICFlowMeter

Thuesday-20-02-2018_TrafficForML_CICFlowMeter

Thursday-01-03-2018_TrafficForML_CICFlowMeter

Thursday-15-02-2018_TrafficForML_CICFlowMeter

Thursday-22-02-2018_TrafficForML_CICFlowMeter

Wednesday-14-02-2018_TrafficForML_CICFlowMeter

Wednesday-21-02-2018_TrafficForML_CICFlowMeter

Wednesday-28-02-2018_TrafficForML_CICFlowMeter

And the "Original Network Traffic and Log data" folder contains 10 folders, each folder is named as the previous files. Each folder contains in turn two folders logs and pcap.

Here is the LOGS for Thursday-01-03-2018.
T
Traffic Signal Network Device Status Log
datahub.austintexas.gov
data.austintexas.gov
+2more
application/rdfxml +5
Updated Jun 30, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
City of Austin, Texas - data.austintexas.gov (2025). Traffic Signal Network Device Status Log [Dataset]. https://datahub.austintexas.gov/Transportation-and-Mobility/Traffic-Signal-Network-Device-Status-Log/pj7k-98z2
Explore at:
xml, csv, json, tsv, application/rssxml, application/rdfxmlAvailable download formats
Dataset updated
Jun 30, 2025
Dataset authored and provided by
City of Austin, Texas - data.austintexas.gov
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The Austin Transportation Department manages thousands of IP-enabled devices which enable traffic signal operations. Devices include traffic cameras, battery backup systems, signal controllers, and vehicle detectors.

This dataset, updated daily, serves as a log of attempts to communicate with the various various devices on the traffic signals network.
ORIGINAL-NETWORK-TRAFFIC-Tuesday-20-02-2018-PCAP
kaggle.com
Updated Jun 28, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Karen Pamela López (2023). ORIGINAL-NETWORK-TRAFFIC-Tuesday-20-02-2018-PCAP [Dataset]. https://www.kaggle.com/datasets/karenp/original-network-traffic-tuesday-20-02-2018-pcap
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jun 28, 2023
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Karen Pamela López
Description
This data set was originally downloaded from: https://www.unb.ca/cic/datasets/ids-2018.html

The data set has a weight of 466GB.

When the download is done, the file contains 2 folders: Processed Traffic Data for ML Algorithms and Original network traffic and log data.

The "Processed Traffic Data for ML Algorithms" folder contains 10 csv files with the following names:

Friday-02-03-2018_TrafficForML_CICFlowMeter

Friday-16-02-2018_TrafficForML_CICFlowMeter

Friday-23-02-2018_TrafficForML_CICFlowMeter

Thuesday-20-02-2018_TrafficForML_CICFlowMeter

Thursday-01-03-2018_TrafficForML_CICFlowMeter

Thursday-15-02-2018_TrafficForML_CICFlowMeter

Thursday-22-02-2018_TrafficForML_CICFlowMeter

Wednesday-14-02-2018_TrafficForML_CICFlowMeter

Wednesday-21-02-2018_TrafficForML_CICFlowMeter

Wednesday-28-02-2018_TrafficForML_CICFlowMeter

And the "Original Network Traffic and Log data" folder contains 10 folders, each folder is named as the previous files. Each folder contains in turn two folders logs and pcap.

Here is the PCAP for Friday-02-03-2018.
i
DNS Over HTTPS network traffic
ieee-dataport.org
Updated Jan 17, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Kamil Jerabek (2022). DNS Over HTTPS network traffic [Dataset]. https://ieee-dataport.org/documents/dns-over-https-network-traffic
Explore at:
Dataset updated
Jan 17, 2022
Authors
Kamil Jerabek
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset contains generated traffic from single requests towards DNS and DNS over Encryption servers as well as network traffic generated by browsers towards multiple DNS over HTTPS servers. The dataset contains also logs and csv files with queried domains. The IP addresses of the DoH servers are provided in the readme so that users can easily label the data extracted from pcap files. The dataset may be used for Machine Learning purposes (DNS over HTTPS identification).
Automotive Controller Area Network (CAN) Bus Intrusion Dataset
data.4tu.nl
zip
Updated Oct 30, 2019
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Guillaume Dupont; Alexios Lekidis; J.I. (Jerry) den Hartog; S. (Sandro) Etalle (2019). Automotive Controller Area Network (CAN) Bus Intrusion Dataset [Dataset]. http://doi.org/10.4121/uuid:5ffa5c8d-ab89-418c-b2c1-13ce95a88f73
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.4121/uuid:5ffa5c8d-ab89-418c-b2c1-13ce95a88f73
Dataset updated
Oct 30, 2019
Dataset provided by
4TUhttps://www.4tu.nl/
Authors
Guillaume Dupont; Alexios Lekidis; J.I. (Jerry) den Hartog; S. (Sandro) Etalle
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
This dataset contains automotive is Controller Area Network (CAN) bus data from three systems: two cars (Opel Astra and Renault Clio) and from a is Controller Area Network (CAN) bus prototype we built ourselves. Its purpose is meant to evaluate is Controller Area Network (CAN) bus Network Intrusion Detection Systems (NIDS). For each system, the dataset consists in a collection of log files captured from a is Controller Area Network (CAN) bus: normal (attack-free) data for training and testing detection algorithms, and different is Controller Area Network (CAN) bus attacks (Diagnostic, Fuzzing attacks, Replay attack, Suspension attack and Denial-of-Service attack)
Z
Data from: Traffic and Log Data Captured During a Cyber Defense Exercise
data.niaid.nih.gov
zenodo.org
Updated Jun 12, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jan Vykopal (2020). Traffic and Log Data Captured During a Cyber Defense Exercise [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3746128
Explore at:
Dataset updated
Jun 12, 2020
Dataset provided by
Jan Vykopal
Stanislav Špaček
Daniel Tovarňák
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was acquired during Cyber Czech – a hands-on cyber defense exercise (Red Team/Blue Team) held in March 2019 at Masaryk University, Brno, Czech Republic. Network traffic flows and a high variety of event logs were captured in an exercise network deployed in the KYPO Cyber Range Platform.

Contents

The dataset covers two distinct time intervals, which correspond to the official schedule of the exercise. The timestamps provided below are in the ISO 8601 date format.

Day 1, March 19, 2019

Start: 2019-03-19T11:00:00.000000+01:00

End: 2019-03-19T18:00:00.000000+01:00

Day 2, March 20, 2019

Start: 2019-03-20T08:00:00.000000+01:00

End: 2019-03-20T15:30:00.000000+01:00

The captured and collected data were normalized into three distinct event types and they are stored as structured JSON. The data are sorted by a timestamp, which represents the time they were observed. Each event type includes a raw payload ready for further processing and analysis. The description of the respective event types and the corresponding data files follows.

cz.muni.csirt.IpfixEntry.tgz – an archive of IPFIX traffic flows enriched with an additional payload of parsed application protocols in raw JSON.

cz.muni.csirt.SyslogEntry.tgz – an archive of Linux Syslog entries with the payload of corresponding text-based log messages.

cz.muni.csirt.WinlogEntry.tgz – an archive of Windows Event Log entries with the payload of original events in raw XML.

Each archive listed above includes a directory of the same name with the following four files, ready to be processed.

data.json.gz – the actual data entries in a single gzipped JSON file.

dictionary.yml – data dictionary for the entries.

schema.ddl – data schema for Apache Spark analytics engine.

schema.jsch – JSON schema for the entries.

Finally, the exercise network topology is described in a machine-readable NetJSON format and it is a part of a set of auxiliary files archive – auxiliary-material.tgz – which includes the following.

global-gateway-config.json – the network configuration of the global gateway in the NetJSON format.

global-gateway-routing.json – the routing configuration of the global gateway in the NetJSON format.

redteam-attack-schedule.{csv,odt} – the schedule of the Red Team attacks in CSV and ODT format. Source for Table 2.

redteam-reserved-ip-ranges.{csv,odt} – the list of IP segments reserved for the Red Team in CSV and ODT format. Source for Table 1.

topology.{json,pdf,png} – the topology of the complete Cyber Czech exercise network in the NetJSON, PDF and PNG format.

topology-small.{pdf,png} – simplified topology in the PDF and PNG format. Source for Figure 1.
Z
AIT Log Data Set V1.1
data.niaid.nih.gov
explore.openaire.eu
Updated Oct 18, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Hotwagner, Wolfgang (2023). AIT Log Data Set V1.1 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3723082
Explore at:
Dataset updated
Oct 18, 2023
Dataset provided by
Hotwagner, Wolfgang
Rauber, Andreas
Wurzenberger, Markus
Skopik, Florian
Landauer, Max
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
AIT Log Data Sets

This repository contains synthetic log data suitable for evaluation of intrusion detection systems. The logs were collected from four independent testbeds that were built at the Austrian Institute of Technology (AIT) following the approach by Landauer et al. (2020) [1]. Please refer to the paper for more detailed information on automatic testbed generation and cite it if the data is used for academic publications. In brief, each testbed simulates user accesses to a webserver that runs Horde Webmail and OkayCMS. The duration of the simulation is six days. On the fifth day (2020-03-04) two attacks are launched against each web server.

The archive AIT-LDS-v1_0.zip contains the directories "data" and "labels".

The data directory is structured as follows. Each directory mail..com contains the logs of one web server. Each directory user- contains the logs of one user host machine, where one or more users are simulated. Each file log.log in the user- directories contains the activity logs of one particular user.

Setup details of the web servers:

OS: Debian Stretch 9.11.6

Services:

Apache2

PHP7

Exim 4.89

Horde 5.2.22

OkayCMS 2.3.4

Suricata

ClamAV

MariaDB

Setup details of user machines:

OS: Ubuntu Bionic

Services:

Chromium

Firefox

User host machines are assigned to web servers in the following way:

mail.cup.com is accessed by users from host machines user-{0, 1, 2, 6}

mail.spiral.com is accessed by users from host machines user-{3, 5, 8}

mail.insect.com is accessed by users from host machines user-{4, 9}

mail.onion.com is accessed by users from host machines user-{7, 10}

The following attacks are launched against the web servers (different starting times for each web server, please check the labels for exact attack times):

Attack 1: multi-step attack with sequential execution of the following attacks:

nmap scan

nikto scan

smtp-user-enum tool for account enumeration

hydra brute force login

webshell upload through Horde exploit (CVE-2019-9858)

privilege escalation through Exim exploit (CVE-2019-10149)

Attack 2: webshell injection through malicious cookie (CVE-2019-16885)

Attacks are launched from the following user host machines. In each of the corresponding directories user-, logs of the attack execution are found in the file attackLog.txt:

user-6 attacks mail.cup.com

user-5 attacks mail.spiral.com

user-4 attacks mail.insect.com

user-7 attacks mail.onion.com

The log data collected from the web servers includes

Apache access and error logs

syscall logs collected with the Linux audit daemon

suricata logs

exim logs

auth logs

daemon logs

mail logs

syslogs

user logs

Note that due to their large size, the audit/audit.log files of each server were compressed in a .zip-archive. In case that these logs are needed for analysis, they must first be unzipped.

Labels are organized in the same directory structure as logs. Each file contains two labels for each log line separated by a comma, the first one based on the occurrence time, the second one based on similarity and ordering. Note that this does not guarantee correct labeling for all lines and that no manual corrections were conducted.

Version history and related data sets:

AIT-LDS-v1.0: Four datasets, logs from single host, fine-granular audit logs, mail/CMS.

AIT-LDS-v1.1: Removed carriage return of line endings in audit.log files.

AIT-LDS-v2.0: Eight datasets, logs from all hosts, system logs and network traffic, mail/CMS/cloud/web.

Acknowledgements: Partially funded by the FFG projects INDICAETING (868306) and DECEPT (873980), and the EU project GUARD (833456).

If you use the dataset, please cite the following publication:

[1] M. Landauer, F. Skopik, M. Wurzenberger, W. Hotwagner and A. Rauber, "Have it Your Way: Generating Customized Log Datasets With a Model-Driven Simulation Testbed," in IEEE Transactions on Reliability, vol. 70, no. 1, pp. 402-415, March 2021, doi: 10.1109/TR.2020.3031317. [PDF]
Synthetic Cybersecurity Logs for Anomaly Detection
kaggle.com
Updated Dec 16, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
fcWebDev (2024). Synthetic Cybersecurity Logs for Anomaly Detection [Dataset]. http://doi.org/10.34740/kaggle/dsv/10211131
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.34740/kaggle/dsv/10211131
Dataset updated
Dec 16, 2024
Dataset provided by
Kagglehttp://kaggle.com/
Authors
fcWebDev
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset contains synthetic HTTP log data designed for cybersecurity analysis, particularly for anomaly detection tasks.

Dataset Features Timestamp: Simulated time for each log entry. IP_Address: Randomized IP addresses to simulate network traffic. Request_Type: Common HTTP methods (GET, POST, PUT, DELETE). Status_Code: HTTP response status codes (e.g., 200, 404, 403, 500). Anomaly_Flag: Binary flag indicating anomalies (1 = anomaly, 0 = normal). User_Agent: Simulated user agents for device and browser identification. Session_ID: Random session IDs to simulate user activity. Location: Geographic locations of requests. Applications This dataset can be used for:

Anomaly Detection: Identify suspicious network activity or attacks. Machine Learning: Train models for classification tasks (e.g., detect anomalies). Cybersecurity Analysis: Analyze HTTP traffic patterns and identify threats. Example Challenge Build a machine learning model to predict the Anomaly_Flag based on the features provided.
d
can-log
data.dtu.dk
zip
Updated Dec 15, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Brooke Elizabeth Lampe (2023). can-log [Dataset]. http://doi.org/10.11583/DTU.24805506.v1
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.11583/DTU.24805506.v1
Dataset updated
Dec 15, 2023
Dataset provided by
Technical University of Denmark
Authors
Brooke Elizabeth Lampe
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
can-logThis dataset contains controller area network (CAN) traffic for the 2017 Subaru Forester, the 2016 Chevrolet Silverado, the 2011 Chevrolet Traverse, and the 2011 Chevrolet Impala.For each vehicle, there are samples of attack-free traffic--that is, normal traffic--as well as samples of various types of attacks. The spoofing attacks, such as RPM spoofing, speed spoofing, etc., have an observable effect on the vehicle under test.This repository contains only .log files. It is a subset of the can-dataset repository.
A
‘Traffic Signal Network Device Status Log’ analyzed by Analyst-2
analyst-2.ai
Updated Feb 13, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com), ‘Traffic Signal Network Device Status Log’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-traffic-signal-network-device-status-log-6d0d/39e6c45a/?iid=004-951&v=presentation
Explore at:
Dataset updated
Feb 13, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Traffic Signal Network Device Status Log’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/16dee4dd-bdff-4198-ac62-c8cc996adca3 on 13 February 2022.

--- Dataset description provided by original source is as follows ---

The Austin Transportation Department manages thousands of IP-enabled devices which enable traffic signal operations. Devices include traffic cameras, battery backup systems, signal controllers, and vehicle detectors.

This dataset, updated daily, serves as a log of attempts to communicate with the various various devices on the traffic signals network.

--- Original source retains full ownership of the source dataset ---

Global Telecom Radio System Software Market Research Report: By Network...

wiseguyreports.com

Updated Aug 6, 2024

Facebook

Twitter

Click to copy link

Link copied

Cite

wWiseguy Research Consultants Pvt Ltd (2024). Global Telecom Radio System Software Market Research Report: By Network Technology (4G/LTE, 5G, 6G), By Deployment Type (On-Premises, Cloud-Based, Hybrid), By Functional Areas (Radio Resource Management, Mobility Management, Call Processing, Service Provisioning), By Security (Firewall Protection, Intrusion Detection Systems, Anti-Malware Solutions, Virtual Private Networks), By Input Source (Network Logs, Device Data, Subscribers' Data, Third-Party Applications) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2032. [Dataset]. https://www.wiseguyreports.com/reports/telecom-radio-system-software-market

Explore at:

Dataset updated

Aug 6, 2024

Dataset authored and provided by

wWiseguy Research Consultants Pvt Ltd

License

https://www.wiseguyreports.com/pages/privacy-policyhttps://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Jan 8, 2024

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2024
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2023	32.15(USD Billion)
MARKET SIZE 2024	33.67(USD Billion)
MARKET SIZE 2032	48.8(USD Billion)
SEGMENTS COVERED	Network Technology ,Deployment Type ,Functional Areas ,Security ,Input Source ,Regional
COUNTRIES COVERED	North America, Europe, APAC, South America, MEA
KEY MARKET DYNAMICS	5G deployment Growing demand for mobile data Increasing adoption of cloudbased radio access networks Focus on energy efficiency Government initiatives to promote connectivity
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	Ericsson ,Nokia Siemens Networks ,Cisco System ,Texas Instruments ,ZTE Corporation ,Broadcom ,NEC Corporation ,Qualcomm ,Samsung Electronics ,AlcatelLucent ,Intel ,MediaTek ,Huawei Technologies ,Fujitsu ,Analog Devices
MARKET FORECAST PERIOD	2025 - 2032
KEY MARKET OPPORTUNITIES	5G Network Deployment Cloudbased Radio Access Networks Smart Cities and Internet of Things Network Virtualization and Automation Rural Connectivity
COMPOUND ANNUAL GROWTH RATE (CAGR)	4.75% (2025 - 2032)

Main network data sources worldwide 2021
statista.com
Updated Dec 10, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Statista (2024). Main network data sources worldwide 2021 [Dataset]. https://www.statista.com/statistics/1298898/network-data-sources-worldwide/
Explore at:
Dataset updated
Dec 10, 2024
Dataset authored and provided by
Statistahttp://statista.com/
Time period covered
Mar 2021
Area covered
Worldwide
Description
In 2021, companies' most important network data sources were security device logs at 30 percent, followed by application logs at 21 percent according to the respondents.
i
DDOS Attack Logs
ieee-dataport.org
Updated Apr 24, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Cornelius Itodo (2024). DDOS Attack Logs [Dataset]. https://ieee-dataport.org/documents/ddos-attack-logs
Explore at:
Dataset updated
Apr 24, 2024
Authors
Cornelius Itodo
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The sudden shift from physical office location to a fully remote or hybrid work model accelerated by the COVID-19 pandemic is a phenomenon that changed how organizations traditionally operated and thereby introduced new vulnerabilities and consequently changed the cyber threat landscape. This has led organizations around the globe to seek new approaches to protect their enterprise network. One such approach is the adoption of the Zero Trust security approach due to its many advantages over the traditional/perimeter security approach.
P
Phone call network for 2 years in a Euro country Dataset
paperswithcode.com
opendatalab.com
Updated Sep 21, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ding Lyu; Yuan Yuan; Lin Wang; Xiaofan Wang; Alex Pentland (2021). Phone call network for 2 years in a Euro country Dataset [Dataset]. https://paperswithcode.com/dataset/phone-call-network-for-2-years-in-a-euro
Explore at:
Dataset updated
Sep 21, 2021
Authors
Ding Lyu; Yuan Yuan; Lin Wang; Xiaofan Wang; Alex Pentland
Description
We employ a nationwide phone call dataset from Jan. 2015 to Dec. 2016. The log interaction duration and log interaction frequency in each phase (intermediate results) are both provided. Currently, we upload the Results folder to Google Drive. (https://drive.google.com/drive/folders/1h4rHZvzzQO7niYMelbzToJZernOij1dv?usp=sharing)

Please download the files from google drive for replication purposes.

In each file, we list tie ranges and interactions in all phases. For example, in 'Results/Graph_season_TR_Duration.txt', the former eight columns are tie range and the latter eight columns are log interaction duration. Tie range is calculated by the length of the second shortest path of two nodes. '-1' means that one node of this connection has no interaction with others in this phase. '100' means that there is no second path between two nodes, indicating that the tie range is infinite. '101' means that the degree of one node is 1, indicating that the tie range is infinite.

Differential privacy is applied to protect the privacy of users. Concretely, we add a Gaussian noise with μ=0, σ=5 to log interactions. When reproducing the results, please remove all numpy.log in the codes, and minus a σ for the calculation of error bars.
m
Data from: CTU Hornet 65 Niner: A Network Dataset of Geographically...
data.mendeley.com
data.niaid.nih.gov
+1more
Updated Oct 9, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Veronica Valeros (2024). CTU Hornet 65 Niner: A Network Dataset of Geographically Distributed Low-Interaction Honeypots [Dataset]. http://doi.org/10.17632/nt4p9zsv5k.1
Explore at:
Unique identifier
https://doi.org/10.17632/nt4p9zsv5k.1
Dataset updated
Oct 9, 2024
Authors
Veronica Valeros
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
CTU Hornet 65 Niner is a dataset of 65 days of network traffic attacks captured in cloud servers used as honeypots to help understand how geography may impact the inflow of network attacks. The honeypots were placed in nine different geographical locations: Amsterdam, London, Frankfurt, San Francisco, New York, Singapore, Toronto, Bangalore, and Sydney. The data was captured from April 28th to July 1st, 2024.

The nine cloud servers were created and configured following identical instructions using Ansible [1] in DigitalOcean [2] cloud provider. The network capture was performed using the Zeek [3] network monitoring tool, which was installed on each cloud server. The cloud servers had only one service running (SSH on a non-standard port) and were fully dedicated to being used as a honeypot. No honeypot software was used in this dataset.

The dataset is composed of nine scenarios:

Honeypot-Cloud-DigitalOcean-Geo-1: has 65 folders (YYYY-MM-DD), each containing 24 Zeek conn.log files and other Zeek files

Honeypot-Cloud-DigitalOcean-Geo-2: has 65 folders (YYYY-MM-DD), each containing 24 Zeek conn.log files and other Zeek files

Honeypot-Cloud-DigitalOcean-Geo-3: has 65 folders (YYYY-MM-DD), each containing 24 Zeek conn.log files and other Zeek files

Honeypot-Cloud-DigitalOcean-Geo-4: has 65 folders (YYYY-MM-DD), each containing 24 Zeek conn.log files and other Zeek files

Honeypot-Cloud-DigitalOcean-Geo-5: has 65 folders (YYYY-MM-DD), each containing 24 Zeek conn.log files and other Zeek files

Honeypot-Cloud-DigitalOcean-Geo-6: has 65 folders (YYYY-MM-DD), each containing 24 Zeek conn.log files and other Zeek files

Honeypot-Cloud-DigitalOcean-Geo-7: has 65 folders (YYYY-MM-DD), each containing 24 Zeek conn.log files and other Zeek files

Honeypot-Cloud-DigitalOcean-Geo-8: has 65 folders (YYYY-MM-DD), each containing 24 Zeek conn.log files and other Zeek files

Honeypot-Cloud-DigitalOcean-Geo-9: has 65 folders (YYYY-MM-DD), each containing 24 Zeek conn.log files and other Zeek files

References: [1] Ansible IT Automation Engine, https://www.ansible.com/. Accessed on 08/28/2024. [2] DigitalOcean, https://www.digitalocean.com/. Accessed on 08/28/2024. [3] Zeek Documentation, https://docs.zeek.org/en/master/index.html. Accessed on 08/28/2024.
Z
Network Forensics Market By solution (security information and event...
zionmarketresearch.com
pdf
Updated Jun 21, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Zion Market Research (2025). Network Forensics Market By solution (security information and event management, packet capture analysis, log management, intrusion detection system, threat intelligence, analytics, and firewall), By services (professional services and managed services), By application area (endpoint security, application security, data center security, network security, and others), By deployment mode (on-premises and cloud), By organization size (large enterprises and small and medium enterprises (SMEs)), By vertical (BFSI, healthcare, education, manufacturing, government, energy and utilities, telecom and IT, retail, and others) And By Region: - Global And Regional Industry Overview, Market Intelligence, Comprehensive Analysis, Historical Data, And Forecasts, 2023-2030 [Dataset]. https://www.zionmarketresearch.com/report/network-forensics-market
Explore at:
pdfAvailable download formats
Dataset updated
Jun 21, 2025
Dataset authored and provided by
Zion Market Research
License
https://www.zionmarketresearch.com/privacy-policyhttps://www.zionmarketresearch.com/privacy-policy
Time period covered
2022 - 2030
Area covered
Global
Description
Network Forensics Market was valued at $1.50 B in 2023, and is projected to reach $USD 5.49 B by 2032, at a CAGR of 15.49% from 2023 to 2032.
f
Comparison between quantum-like and classical inferences over a Bayesian...
plos.figshare.com
xls
Updated May 31, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Catarina Moreira; Emmanuel Haven; Sandro Sozzo; Andreas Wichert (2023). Comparison between quantum-like and classical inferences over a Bayesian network learned using an incomplete dataset (with 70% of missing data). [Dataset]. http://doi.org/10.1371/journal.pone.0207806.t004
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0207806.t004
Dataset updated
May 31, 2023
Dataset provided by
PLOS ONE
Authors
Catarina Moreira; Emmanuel Haven; Sandro Sozzo; Andreas Wichert
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The results show that quantum-like inferences achieved an average error of 5.90% when compared to the 22.85% error obtained in the classical inference. The column COMPLETE DATA BN represents the control network, which was learned using the full dataset.

Facebook

Twitter

Click to copy link

Link copied

Cite

Hasan, Raza (2024). Comprehensive Network Logs Dataset for Multi-Device Analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10492769

Comprehensive Network Logs Dataset for Multi-Device Analysis

Explore at:

Dataset updated

Jan 11, 2024

Dataset provided by

Hasan, Raza
Salman, Mahmood

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This dataset comprises diverse logs from various sources, including cloud services, routers, switches, virtualization, network security appliances, authentication systems, DNS, operating systems, packet captures, proxy servers, servers, syslog data, and network data. The logs encompass a wide range of information such as traffic details, user activities, authentication events, DNS queries, network flows, security actions, and system events. By analyzing these logs collectively, users can gain insights into network patterns, anomalies, user authentication, cloud service usage, DNS traffic, network flows, security incidents, and system activities. The dataset is invaluable for network monitoring, performance analysis, anomaly detection, security investigations, and correlating events across the entire network infrastructure.

Clear search

Close search

Google apps

Main menu

Comprehensive Network Logs Dataset for Multi-Device Analysis

AIT Log Data Set V2.0

Kyoushi Log Data Set

ORIGINAL-NETWORK-TRAFFIC-Thursday-01-03-2018-LOGS

Traffic Signal Network Device Status Log

ORIGINAL-NETWORK-TRAFFIC-Tuesday-20-02-2018-PCAP

DNS Over HTTPS network traffic

Automotive Controller Area Network (CAN) Bus Intrusion Dataset

Data from: Traffic and Log Data Captured During a Cyber Defense Exercise

AIT Log Data Set V1.1

Synthetic Cybersecurity Logs for Anomaly Detection

can-log

‘Traffic Signal Network Device Status Log’ analyzed by Analyst-2

Global Telecom Radio System Software Market Research Report: By Network...

Main network data sources worldwide 2021

DDOS Attack Logs

Phone call network for 2 years in a Euro country Dataset

Data from: CTU Hornet 65 Niner: A Network Dataset of Geographically...

Network Forensics Market By solution (security information and event...

Comparison between quantum-like and classical inferences over a Bayesian...

Comprehensive Network Logs Dataset for Multi-Device Analysis