Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Packet Capture (PCAP) files of the UNSW-NB15 and CIC-IDS2017 datasets are processed and labelled using the accompanying CSV files. Each packet is labelled by comparing eight distinct features: *Source IP, Destination IP, Source Port, Destination Port, Starting time, Ending time, Protocol and Time to live*. The dimensions of the dataset are N×1504. All columns of the dataset are integers, so you can use this dataset directly in your machine learning models. Details of the whole processing and transformation are provided in the following GitHub repository:
https://github.com/Yasir-ali-farrukh/Payload-Byte
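The labelling step described above can be sketched roughly as follows. This is a minimal illustration, not the Payload-Byte implementation: the `FlowRecord` and `Packet` types, the `label_packet` function, and the exact-match rule (including the closed time window) are assumptions made for the example, and the real tool may match flows differently.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FlowRecord:
    """One labelled row from the CSV files (hypothetical field names)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    start_time: float
    end_time: float
    ttl: int
    label: str


@dataclass(frozen=True)
class Packet:
    """The packet fields needed for matching (hypothetical field names)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    timestamp: float
    ttl: int


def label_packet(packet: Packet, flows: Iterable[FlowRecord],
                 default: str = "unlabelled") -> str:
    """Return the label of the first flow whose eight features match the packet."""
    for flow in flows:
        same_endpoints = (
            packet.src_ip == flow.src_ip
            and packet.dst_ip == flow.dst_ip
            and packet.src_port == flow.src_port
            and packet.dst_port == flow.dst_port
        )
        # Assumption: a packet belongs to a flow if its timestamp lies
        # inside the flow's [start_time, end_time] window.
        in_window = flow.start_time <= packet.timestamp <= flow.end_time
        if (same_endpoints and in_window
                and packet.protocol == flow.protocol
                and packet.ttl == flow.ttl):
            return flow.label
    return default
```

A linear scan keeps the sketch short; for full-size captures an index keyed on the address and port fields would likely be needed.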
You can use the tool available in the above-mentioned GitHub repository to generate the labelled dataset from scratch. All details of the processing and transformation are provided in the following paper:
```bibtex
@article{Payload,
  author = "Yasir Ali Farrukh and Irfan Khan and Syed Wali and David Bierbrauer and Nathaniel Bastian",
  title = "{Payload-Byte: A Tool for Extracting and Labeling Packet Capture Files of Modern Network Intrusion Detection Datasets}",
  year = "2022",
  month = "9",
  url = "https://www.techrxiv.org/articles/preprint/Payload-Byte_A_Tool_for_Extracting_and_Labeling_Packet_Capture_Files_of_Modern_Network_Intrusion_Detection_Datasets/20714221",
  doi = "10.36227/techrxiv.20714221.v1"
}
```
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Article Information
The work involved in developing the dataset and benchmarking its use of machine learning is set out in the article ‘IoMT-TrafficData: Dataset and Tools for Benchmarking Intrusion Detection in Internet of Medical Things’. DOI: 10.1109/ACCESS.2024.3437214.
Please do cite the aforementioned article when using this dataset.
Abstract
The increasing importance of securing the Internet of Medical Things (IoMT) due to its vulnerabilities to cyber-attacks highlights the need for an effective intrusion detection system (IDS). In this study, our main objective was to develop a Machine Learning Model for the IoMT to enhance the security of medical devices and protect patients’ private data. To address this issue, we built a scenario that utilised the Internet of Things (IoT) and IoMT devices to simulate real-world attacks. We collected and cleaned data, pre-processed it, and provided it into our machine-learning model to detect intrusions in the network. Our results revealed significant improvements in all performance metrics, indicating robustness and reproducibility in real-world scenarios. This research has implications in the context of IoMT and cybersecurity, as it helps mitigate vulnerabilities and lowers the number of breaches occurring with the rapid growth of IoMT devices. The use of machine learning algorithms for intrusion detection systems is essential, and our study provides valuable insights and a road map for future research and the deployment of such systems in live environments. By implementing our findings, we can contribute to a safer and more secure IoMT ecosystem, safeguarding patient privacy and ensuring the integrity of medical data.
ZIP Folder Content
The ZIP folder comprises two main components: Captures and Datasets. The Captures folder includes all the captures used in this project, organized into separate folders by type of network analysis: BLE or IP-Based. The Datasets folder is organized the same way and contains datasets categorized by type: BLE, IP-Based Packet, and IP-Based Flows.
To cater to diverse analytical needs, the datasets are provided in two formats: CSV (Comma-Separated Values) and pickle. The CSV format facilitates seamless integration with various data analysis tools, while the pickle format preserves the intricate structures and relationships within the dataset.
This organization enables researchers to easily locate and utilize the specific captures and datasets they require, based on their preferred network analysis type or dataset type. The availability of different formats further enhances the flexibility and usability of the provided data.
Datasets' Content
Within this dataset, three sub-datasets are available: BLE, IP-Based Packet, and IP-Based Flows. The tables below list the features selected for each dataset and used in the evaluation model of the accompanying work.
Identified Key Features Within Bluetooth Dataset

| Feature | Meaning |
|---|---|
| btle.advertising_header | BLE Advertising Packet Header |
| btle.advertising_header.ch_sel | BLE Advertising Channel Selection Algorithm |
| btle.advertising_header.length | BLE Advertising Length |
| btle.advertising_header.pdu_type | BLE Advertising PDU Type |
| btle.advertising_header.randomized_rx | BLE Advertising Rx Address |
| btle.advertising_header.randomized_tx | BLE Advertising Tx Address |
| btle.advertising_header.rfu.1 | Reserved For Future 1 |
| btle.advertising_header.rfu.2 | Reserved For Future 2 |
| btle.advertising_header.rfu.3 | Reserved For Future 3 |
| btle.advertising_header.rfu.4 | Reserved For Future 4 |
| btle.control.instant | Instant Value Within a BLE Control Packet |
| btle.crc.incorrect | Incorrect CRC |
| btle.extended_advertising | Advertiser Data Information |
| btle.extended_advertising.did | Advertiser Data Identifier |
| btle.extended_advertising.sid | Advertiser Set Identifier |
| btle.length | BLE Length |
| frame.cap_len | Frame Length Stored Into the Capture File |
| frame.interface_id | Interface ID |
| frame.len | Frame Length Wire |
| nordic_ble.board_id | Board ID |
| nordic_ble.channel | Channel Index |
| nordic_ble.crcok | Indicates if CRC is Correct |
| nordic_ble.flags | Flags |
| nordic_ble.packet_counter | Packet Counter |
| nordic_ble.packet_time | Packet time (start to end) |
| nordic_ble.phy | PHY |
| nordic_ble.protover | Protocol Version |
Identified Key Features Within IP-Based Packets Dataset

| Feature | Meaning |
|---|---|
| http.content_length | Length of content in an HTTP response |
| http.request | HTTP request being made |
| http.response.code | Status code of an HTTP response |
| http.response_number | Sequential number of an HTTP response |
| http.time | Time taken for an HTTP transaction |
| tcp.analysis.initial_rtt | Initial round-trip time for TCP connection |
| tcp.connection.fin | TCP connection termination with a FIN flag |
| tcp.connection.syn | TCP connection initiation with SYN flag |
| tcp.connection.synack | TCP connection establishment with SYN-ACK flags |
| tcp.flags.cwr | Congestion Window Reduced flag in TCP |
| tcp.flags.ecn | Explicit Congestion Notification flag in TCP |
| tcp.flags.fin | FIN flag in TCP |
| tcp.flags.ns | Nonce Sum flag in TCP |
| tcp.flags.res | Reserved flags in TCP |
| tcp.flags.syn | SYN flag in TCP |
| tcp.flags.urg | Urgent flag in TCP |
| tcp.urgent_pointer | Pointer to urgent data in TCP |
| ip.frag_offset | Fragment offset in IP packets |
| eth.dst.ig | IG (individual/group) bit of the Ethernet destination address |
| eth.src.ig | IG (individual/group) bit of the Ethernet source address |
| eth.src.lg | LG (locally/globally administered) bit of the Ethernet source address |
| eth.src_not_group | Flag raised when the Ethernet source address is a group address |
| arp.isannouncement | Indicates if an ARP message is an announcement |
Identified Key Features Within IP-Based Flows Dataset

| Feature | Meaning |
|---|---|
| proto | Transport layer protocol of the connection |
| service | Identification of an application protocol |
| orig_bytes | Originator payload bytes |
| resp_bytes | Responder payload bytes |
| history | Connection state history |
| orig_pkts | Originator sent packets |
| resp_pkts | Responder sent packets |
| flow_duration | Length of the flow in seconds |
| fwd_pkts_tot | Forward packets total |
| bwd_pkts_tot | Backward packets total |
| fwd_data_pkts_tot | Forward data packets total |
| bwd_data_pkts_tot | Backward data packets total |
| fwd_pkts_per_sec | Forward packets per second |
| bwd_pkts_per_sec | Backward packets per second |
| flow_pkts_per_sec | Flow packets per second |
| fwd_header_size | Forward header bytes |
| bwd_header_size | Backward header bytes |
| fwd_pkts_payload | Forward payload bytes |
| bwd_pkts_payload | Backward payload bytes |
| flow_pkts_payload | Flow payload bytes |
| fwd_iat | Forward inter-arrival time |
| bwd_iat | Backward inter-arrival time |
| flow_iat | Flow inter-arrival time |
| active | Flow active duration |
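
As an illustration of how the rate features in the flow table relate to the counters, the sketch below derives per-second packet rates from the packet totals and the flow duration. The formulas are assumptions based only on the feature meanings listed above (the dataset's own extraction tool may handle zero-duration flows or rounding differently), and `packet_rates` is a hypothetical helper name.

```python
from typing import Tuple


def packet_rates(fwd_pkts_tot: int, bwd_pkts_tot: int,
                 flow_duration: float) -> Tuple[float, float, float]:
    """Return (fwd_pkts_per_sec, bwd_pkts_per_sec, flow_pkts_per_sec)."""
    # Assumption: flows with no measurable duration get a rate of zero
    # instead of raising a division error.
    if flow_duration <= 0:
        return 0.0, 0.0, 0.0
    fwd_rate = fwd_pkts_tot / flow_duration
    bwd_rate = bwd_pkts_tot / flow_duration
    # Assumption: the flow rate counts packets in both directions.
    return fwd_rate, bwd_rate, fwd_rate + bwd_rate
```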
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Network traffic datasets with novel extended IP flow called NetTiSA flow
Datasets were created for the paper: NetTiSA: Extended IP Flow with Time-series Features for Universal Bandwidth-constrained High-speed Network Traffic Classification -- Josef Koumar, Karel Hynek, Jaroslav Pešek, Tomáš Čejka -- which is published in The International Journal of Computer and Telecommunications Networking https://doi.org/10.1016/j.comnet.2023.110147
Please cite the usage of our datasets as:
Josef Koumar, Karel Hynek, Jaroslav Pešek, Tomáš Čejka, "NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification", Computer Networks, Volume 240, 2024, 110147, ISSN 1389-1286
@article{KOUMAR2024110147, title = {NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification}, journal = {Computer Networks}, volume = {240}, pages = {110147}, year = {2024}, issn = {1389-1286}, doi = {https://doi.org/10.1016/j.comnet.2023.110147}, url = {https://www.sciencedirect.com/science/article/pii/S1389128623005923}, author = {Josef Koumar and Karel Hynek and Jaroslav Pešek and Tomáš Čejka} }
This Zenodo repository contains 23 datasets created from 15 well-known published datasets, which are cited in the table below. Each dataset contains the NetTiSA flow feature vector.
NetTiSA flow feature vector
The novel extended IP flow called NetTiSA (Network Time Series Analysed) flow contains a universal bandwidth-constrained feature vector consisting of 20 features. We divide the NetTiSA flow classification features into three groups by how they are computed. The first group is based on classical bidirectional flow information: the number of transferred bytes and packets. The second group contains statistical and time-based features calculated using time-series analysis of the packet sequences. The third group can be computed from the previous groups (i.e., on the flow collector) and improves classification performance without any impact on the telemetry bandwidth.
Flow features
The flow features are:
Packets is the number of packets in the direction from the source to the destination IP address.
Packets in reverse order is the number of packets in the direction from the destination to the source IP address.
Bytes is the size of the payload in bytes transferred in the direction from the source to the destination IP address.
Bytes in reverse order is the size of the payload in bytes transferred in the direction from the destination to the source IP address.
Statistical and Time-based features
These features are exported in the extended part of the flow. All of them can be computed (exactly or approximately) by stream-wise computation, which is necessary for keeping memory requirements low. This second feature set contains the following features:
Mean represents mean of the payload lengths of packets
Min is the minimal value from payload lengths of all packets in a flow
Max is the maximum value from payload lengths of all packets in a flow
Standard deviation is a measure of the variation of payload lengths from the mean payload length
Root mean square is the measure of the magnitude of payload lengths of packets
Average dispersion is the average absolute difference between each payload length of the packet and the mean value
Kurtosis is the measure describing the extent to which the tails of a distribution differ from the tails of a normal distribution
Mean of relative times is the mean of the relative times, a sequence defined as \(st = \{t_1 - t_1, t_2 - t_1, \dots, t_n - t_1\}\)
Mean of time differences is the mean of the time differences, a sequence defined as \(dt = \{ t_j - t_i \mid j = i + 1,\ i \in \{1, 2, \dots, n - 1\} \}\)
Min from time differences is the minimal value from all time differences, i.e., min space between packets.
Max from time differences is the maximum value from all time differences, i.e., max space between packets.
Time distribution describes the deviation of time differences between individual packets within the time series. The feature is computed by the following equation: \(tdist = \frac{ \frac{1}{n-1} \sum_{i=1}^{n-1} \left| \mu_{dt_{n-1}} - dt_i \right| }{ \frac{1}{2} \left( \max\left(dt_{n-1}\right) - \min\left(dt_{n-1}\right) \right) }\)
Switching ratio represents a value change ratio (switching) between payload lengths. The switching ratio is computed by the equation: \(sr = \frac{s_n}{\frac{1}{2} (n - 1)}\)
where \(s_n\) is the number of switches.
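
The definitions above can be illustrated with a small batch computation over one flow. This is a sketch under stated assumptions rather than the ipfixprobe implementation: it keeps the whole sequence in memory instead of computing stream-wise, uses the population form of the standard deviation, counts a switch whenever two consecutive payload lengths differ, and leaves kurtosis out. The function name `nettisa_stats` and the returned key names are hypothetical.

```python
import math
from typing import Dict, Sequence


def nettisa_stats(payload_lengths: Sequence[float],
                  timestamps: Sequence[float]) -> Dict[str, float]:
    """Compute a subset of the statistical and time-based features for one flow."""
    n = len(payload_lengths)
    if n < 2 or n != len(timestamps):
        raise ValueError("need at least two packets with matching timestamps")

    mean = sum(payload_lengths) / n
    # Assumption: population standard deviation (divide by n).
    std = math.sqrt(sum((x - mean) ** 2 for x in payload_lengths) / n)
    rms = math.sqrt(sum(x * x for x in payload_lengths) / n)
    avg_dispersion = sum(abs(x - mean) for x in payload_lengths) / n

    # st: times relative to the first packet; dt: gaps between neighbours.
    rel_times = [t - timestamps[0] for t in timestamps]
    dt = [timestamps[i + 1] - timestamps[i] for i in range(n - 1)]
    mean_dt = sum(dt) / len(dt)

    # tdist: mean absolute deviation of dt over half of its range.
    half_range = (max(dt) - min(dt)) / 2
    tdist = (sum(abs(mean_dt - d) for d in dt) / len(dt)) / half_range \
        if half_range > 0 else 0.0

    # Assumption: a "switch" is any change between consecutive payload lengths.
    switches = sum(1 for a, b in zip(payload_lengths, payload_lengths[1:]) if a != b)
    switching_ratio = switches / ((n - 1) / 2)

    return {
        "mean": mean,
        "min": min(payload_lengths),
        "max": max(payload_lengths),
        "std": std,
        "rms": rms,
        "avg_dispersion": avg_dispersion,
        "mean_rel_times": sum(rel_times) / n,
        "mean_dt": mean_dt,
        "min_dt": min(dt),
        "max_dt": max(dt),
        "tdist": tdist,
        "switching_ratio": switching_ratio,
    }
```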
Features computed at the collector
The third set contains features that are computed from the previous two groups prior to classification. Therefore, they do not influence the network telemetry size and their computation does not put additional load on resource-constrained flow monitoring probes. The NetTiSA flow combined with this feature set is called the Enhanced NetTiSA flow and contains the following features:
Max minus min is the difference between the maximum and minimum payload lengths
Percent deviation is the dispersion of the average absolute difference to the mean value
Variance is the spread measure of the data from its mean
Burstiness is the degree of peakedness in the central part of the distribution
Coefficient of variation is a dimensionless quantity that compares the dispersion of a time series to its mean value and is often used to compare the variability of different time series that have different units of measurement
Directions describes a percentage ratio of packet direction computed as \(\frac{d_1}{d_1 + d_0}\), where \(d_1\) is the number of packets in the direction from source to destination IP address and \(d_0\) is the number in the opposite direction. Both \(d_1\) and \(d_0\) are part of the classical bidirectional flow.
Duration is the duration of the flow
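
Because these features depend only on values already exported by the probe, a collector can derive them with a few arithmetic operations. The sketch below shows one possible reading of the definitions above. The exact formulas are assumptions: percent deviation is taken as average dispersion divided by the mean, without scaling to a percentage. Burstiness and duration are left out because their formulas are not given here. `enhanced_features` is a hypothetical helper name.

```python
from typing import Dict


def enhanced_features(mean: float, std: float, avg_dispersion: float,
                      min_len: float, max_len: float,
                      packets: int, packets_rev: int) -> Dict[str, float]:
    """Derive collector-side features from already exported NetTiSA values."""
    total_packets = packets + packets_rev
    return {
        "max_minus_min": max_len - min_len,
        # Assumption: ratio of average dispersion to the mean, not scaled by 100.
        "percent_deviation": avg_dispersion / mean if mean else 0.0,
        "variance": std ** 2,
        "coefficient_of_variation": std / mean if mean else 0.0,
        # d_1 / (d_1 + d_0): share of packets sent from source to destination.
        "directions": packets / total_packets if total_packets else 0.0,
    }
```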
The NetTiSA flow is implemented into IP flow exporter ipfixprobe.
Description of dataset files
The following table describes each dataset file:

| File name | Detection problem | Citation of the original raw dataset |
|---|---|---|
| botnet_binary.csv | Binary detection of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014. |
| botnet_multiclass.csv | Multi-class classification of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014. |
| cryptomining_design.csv | Binary detection of cryptomining; the design part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022 |
| cryptomining_evaluation.csv | Binary detection of cryptomining; the evaluation part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022 |
| dns_malware.csv | Binary detection of malware DNS | Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021. |
| doh_cic.csv | Binary detection of DoH | Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020 |
| doh_real_world.csv | Binary detection of DoH | Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022 |
| dos.csv | Binary detection of DoS | Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019. |
| edge_iiot_binary.csv | Binary detection of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022. |
| edge_iiot_multiclass.csv | Multi-class classification of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022. |
| https_brute_force.csv | Binary detection of HTTPS Brute Force | Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020 |
| ids_cic_binary.csv | Binary detection of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018. |
| ids_cic_multiclass.csv | Multi-class classification of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018. |
| unsw_binary.csv | Binary detection of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015. |
| unsw_multiclass.csv | Multi-class classification of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015. |
| iot_23.csv | Binary detection of IoT malware | Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details here https://www.stratosphereips.org/datasets-iot23 |
| ton_iot_binary.csv | Binary detection of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021 |
| ton_iot_multiclass.csv | Multi-class classification of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. |
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains results from streaming VR content over Wi-Fi 6 using our Air Light VR (ALVR) v20.6.0 fork. In particular, it comprises ALVR session logs with statistics in JSON format for each test in Sections VI and VII of our published paper, NeSt-VR: Adaptive Bitrate Algorithm for Virtual Reality Streaming over Wi-Fi. Additionally, for each test in Section VI, it includes tshark-processed traffic traces in space-separated CSV format, collected using Wireshark v4.0.3 at both the server and the network emulator’s Ethernet interface to the access point. Moreover, for each test in Section VI, validation result figures are included. For each test in Section VII, temporal evolution and/or boxplot figures for several Quality of Service metrics—such as delivery frame rate, bitrate, video frame round-trip time, and packet loss—are also included.
Section VI tests use a Constant BitRate (CBR) of 100 Mbps with several emulated network effects, including limited bandwidth (100 Mbps, 95 Mbps, 90 Mbps), packet loss (0.5%, 1%, 2%), duplicated packets (0.5%, 1%, 2%), and packet jitter (0–6 ms, 0–10 ms, 0–20 ms).
The dataset structure for Section VII includes a folder for each subsection (VII A: 7.1, VII B: 7.2, VII C: 7.3, VII D: 7.4). Section 7.1 folder includes tests on emulated limited network bandwidth (100 Mbps, 95 Mbps, 90 Mbps) using either CBR, ALVR's native Adaptive BitRate (ABR) algorithm, or our VR-tailored ABR, NeSt-VR (Network-aware Step-wise ABR algorithm for VR streaming). Section 7.2 folder contains a single-user (user A) mobility test using either CBR or NeSt-VR. Section 7.3 folder includes a multi-user test with two users (user A and user B) using either CBR or NeSt-VR, with results for both users streaming in isolation or concurrently. Section 7.4 folder contains tests with Overlapping Basic Service Set (OBSS) activity, where two access points operate on the same frequency channel with overlapping coverage areas, using either a fully overlapping channel bandwidth of 40 MHz or 80 MHz.
ALVR session logs contain several built-in ALVR statistics (events with event_type id "GraphStatistics", which include total pipeline latency and its components) and additional statistics incorporated in our ALVR fork (events with event_type id "GraphNetworkStatistics", which record metrics such as frame span, frame interarrival, video frame round-trip time, packet loss, instantaneous video network throughput, peak network throughput, video frame jitter, video packet jitter, and filtered one-way delay; and events with event_type id "HeuristicStats", which include the decision-making statistics involved in each NeSt-VR bitrate adjustment interval). Please refer to our published paper or our ALVR fork for more details.
Tshark-processed traffic traces contain several packet-level details: the relative timestamp (frame.time_relative), source and destination IP addresses (ip.src, ip.dst), total packet length including headers and payload (frame.len), and the raw packet payload (data.data). The first 22 bytes of each packet’s payload contain ALVR’s application-specific prefix, which includes the associated frame’s payload size in bytes (4 bytes), a stream identifier (2 bytes), the frame index (4 bytes), the number of packets composing the frame (4 bytes), the packet index within the frame (4 bytes), and the packet’s relative departure time (4 bytes).
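
The 22-byte prefix layout described above can be decoded with Python's `struct` module, as sketched below. Several details are assumptions not stated here: the byte order (exposed as a parameter, big-endian by default), treating every field, including the relative departure time, as an unsigned integer, and the hex encoding of the `data.data` column (plain or colon-separated). `AlvrPrefix` and `parse_alvr_prefix` are hypothetical names.

```python
import struct
from typing import NamedTuple

PREFIX_LEN = 22  # 4 + 2 + 4 + 4 + 4 + 4 bytes, as listed above


class AlvrPrefix(NamedTuple):
    frame_payload_size: int   # frame's payload size in bytes (4 bytes)
    stream_id: int            # stream identifier (2 bytes)
    frame_index: int          # frame index (4 bytes)
    packets_in_frame: int     # number of packets composing the frame (4 bytes)
    packet_index: int         # packet index within the frame (4 bytes)
    departure_time: int       # relative departure time, raw value (4 bytes)


def parse_alvr_prefix(data_hex: str, byte_order: str = ">") -> AlvrPrefix:
    """Decode the application prefix from a tshark data.data hex string."""
    # Assumption: the column holds hex digits, optionally separated by colons.
    raw = bytes.fromhex(data_hex.replace(":", ""))
    if len(raw) < PREFIX_LEN:
        raise ValueError("payload shorter than the 22-byte prefix")
    # Assumption: all fields are unsigned; ">" or "<" selects the byte order
    # and disables struct's native alignment padding.
    fields = struct.unpack(byte_order + "IHIIII", raw[:PREFIX_LEN])
    return AlvrPrefix(*fields)
```

If decoded frame indices or packet counts look implausibly large, trying `byte_order="<"` is a quick way to check the byte-order assumption.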
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The IoMT-TrafficData dataset has been developed to benchmark Machine Learning models for Intrusion Detection Systems (IDS) in the Internet of Medical Things (IoMT). The dataset simulates real-world attacks and normal network behavior in IoT and IoMT environments to enhance medical device security and patient data protection.
The dataset and its benchmarking methodology are detailed in the research article.
If you use this dataset, please credit the original authors:
Areia, J., Bispo, I. A., Santos, L., & Costa, R. L. (2023). IoMT-TrafficData: Dataset and Tools for Benchmarking Intrusion Detection in Internet of Medical Things.
IEEE Access. DOI: 10.1109/ACCESS.2024.3437214
Zenodo DOI: 10.5281/zenodo.8116338
Original Source: Zenodo (Creative Commons Attribution 4.0 International License)
Identified Key Features Within Bluetooth Dataset

| Feature | Meaning |
|---|---|
| btle.advertising_header | BLE Advertising Packet Header |
| btle.advertising_header.ch_sel | Channel Selection Algorithm |
| btle.advertising_header.length | Advertising Length |
| btle.advertising_header.pdu_type | Advertising PDU Type |
| nordic_ble.crcok | Indicates if CRC is Correct |
| nordic_ble.packet_time | Packet time (start to end) |
| nordic_ble.phy | PHY |
| ... | (see Zenodo for full feature list) |

Identified Key Features Within IP-Based Packets Dataset

| Feature | Meaning |
|---|---|
| http.content_length | Length of HTTP response content |
| tcp.analysis.initial_rtt | Initial round-trip time for TCP |
| tcp.flags.syn | SYN flag in TCP |
| arp.isannouncement | Indicates ARP announcement |
| ... | (see Zenodo for full list) |

Identified Key Features Within IP-Based Flows Dataset

| Feature | Meaning |
|---|---|
| proto | Transport layer protocol |
| service | Application protocol |
| orig_bytes | Originator payload bytes |
| resp_bytes | Responder payload bytes |
| flow_duration | Duration of the flow |
| fwd_pkts_per_sec | Forward packets per second |
| flow_iat | Flow inter-arrival time |
| ... | (see Zenodo for full list) |
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was created to evaluate characteristics of Unevenly sampled time series from network traffic (USTS) for the paper Unevenly Spaced Time Series from Network Traffic.
The file named time_series.tar.gz contains a folder with time series CSV files as raw data of the experiment. In the folder are the following files:
fts.csv -- contains 2.6 million Flow time series (FTS) created from 259 million IP flows,
pts.csv -- contains 19 million Packet time series (PTS) created from 110 million network packets,
sfts.csv -- contains 15 million Single flow time series (SFTS) created from 160 million network packets.
Traffic was captured on the national CESNET2 network from February 2023 to April 2023. All IP addresses in the dataset were anonymized.
The fts.csv has the following format:
ID_DEPENDENCY -- Identification of a network dependency observed as a Flow time series. (real IP address was anonymized by replacing with a random IP address)
N_FLOWS -- Number of flows in time series, i.e., number of data points.
N_PACKETS -- Number of packets in time series, i.e., the sum of metric PACKETS.
N_BYTES -- Number of bytes in time series, i.e., the sum of metric BYTES.
PACKETS -- The array containing the time series metric number of packets in the IP flow.
BYTES -- The array containing the time series metric number of bytes in the IP flow.
START_TIMES -- The array containing the time series time axis of the flows starts.
END_TIMES -- The array containing the time series time axis of the flows ends.
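
As a rough illustration of how the summary columns relate to the arrays, the sketch below checks one fts.csv row and derives the uneven gaps between flow starts. It assumes that array columns are serialized as JSON-style lists such as `[1, 2, 3]` and that N_BYTES is the sum of BYTES. Neither is confirmed here, so check against the provided sample files first. `parse_array`, `fts_row_is_consistent`, and `start_gaps` are hypothetical helper names.

```python
import json
from typing import Dict, List


def parse_array(cell: str) -> List[float]:
    """Parse an array column (assumed to be a JSON-style list in text form)."""
    return json.loads(cell)


def fts_row_is_consistent(row: Dict[str, str]) -> bool:
    """Check that N_FLOWS, N_PACKETS and N_BYTES agree with the arrays."""
    packets = parse_array(row["PACKETS"])
    byte_counts = parse_array(row["BYTES"])
    return (
        int(row["N_FLOWS"]) == len(packets) == len(byte_counts)
        and int(row["N_PACKETS"]) == sum(packets)
        # Assumption: N_BYTES is the sum of the BYTES metric.
        and int(row["N_BYTES"]) == sum(byte_counts)
    )


def start_gaps(row: Dict[str, str]) -> List[float]:
    """Return gaps between consecutive flow starts (the uneven spacing)."""
    starts = parse_array(row["START_TIMES"])
    return [later - earlier for earlier, later in zip(starts, starts[1:])]
```

Rows in this shape can be produced with `csv.DictReader` from the standard library, which yields one dictionary per CSV line keyed by the column names above.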
The pts.csv has the following format:
ID_DEPENDENCY -- Identification of a network dependency observed as a Packet time series. (real IP address was anonymized by replacing with a random IP address)
BYTES -- The array containing the time series metric payload length of the network packet.
TIMES -- The array containing the time series time axis of the transmission of network packets.
The sfts.csv has the following format:
SRC_IP -- Source IP address. (real IP address was anonymized by replacing with a random IP address)
SRC_PORT -- Source port.
DST_IP -- Destination IP address (real IP address was anonymized by replacing with a random IP address)
DST_PORT -- Destination port.
bytes -- The array containing the time series metric payload length of the network packet.
time -- The array containing the time series time axis of the transmission of network packets.
The file named characteristics.tar.gz contains a folder with characteristics gained by experiments from time series files. In the folder are the following files:
fts.characteristics.csv -- Characteristics about Flow time series from the fts.csv.
pts.characteristics.csv -- Characteristics about Packet time series from the pts.csv.
sfts.characteristics.csv -- Characteristics about Single flow time series from the sfts.csv.
The fts.characteristics.csv has the following format:
LENGTH -- Number of data points in the source time series.
DURATION -- Duration of the source time series.
H_BYTES -- Hurst exponent of the source time series metric BYTES.
STATIONARITY_PACKETS -- Stationarity of the source time series metric PACKETS.
STATIONARITY_BYTES -- Stationarity of the source time series metric BYTES.
OVERALL_STATIONARITY -- Overall stationarity created by merging STATIONARITY_PACKETS and STATIONARITY_BYTES.
The pts.characteristics.csv and sfts.characteristics.csv have the following format:
LENGTH -- Number of data points in the source time series.
DURATION -- Duration of the source time series.
H -- Hurst exponent of the source time series.
STATIONARITY -- Stationarity of the source time series.
We provide the samples of all zipped files for a quick lookup: fts.characteristics.sample.csv, fts.sample.csv, pts.characteristics.sample.csv, pts.sample.csv, sfts.characteristics.sample.csv, sfts.sample.csv
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains three cabled DASH7 data sets. All data sets are formatted as sigmf-data and sigmf-meta pairs, which can be investigated using IQEngine, GNU Radio, or MATLAB. Below you can find a more extended description of the data sets.
CH0.zip, CH93.zip, CH186.zip:
logs.zip:
https://creativecommons.org/publicdomain/zero/1.0/
Description
Welcome to the Drone-Based Malware Detection dataset! This dataset is designed to aid researchers and practitioners in exploring innovative cybersecurity solutions using drone-collected data. The dataset contains detailed information on network traffic, drone sensor readings, malware detection indicators, and environmental conditions. It offers a unique perspective by integrating data from drones with traditional network security metrics to enhance malware detection capabilities.
Dataset Overview
The dataset comprises four main categories:
Network Traffic Data: Captures network traffic attributes including IP addresses, ports, protocols, packet sizes, and various derived metrics.
Drone Sensor Data: Includes GPS coordinates, altitude, speed, heading, battery level, and other sensor readings from drones.
Malware Detection Data: Contains indicators and scores relevant to detecting malware, such as anomaly scores, suspicious IP counts, reputation scores, and attack types.
Environmental Data: Provides context through environmental conditions like location type, noise level, weather conditions, and more.
Files and Features
The dataset is divided into four separate CSV files:
network_traffic_data.csv
timestamp: Date and time of the traffic event.
source_ip: Source IP address.
destination_ip: Destination IP address.
source_port: Source port number.
destination_port: Destination port number.
protocol: Network protocol (TCP, UDP, ICMP).
packet_length: Length of the network packet.
payload_data: Content of the packet payload.
flag: Network flag (SYN, ACK, FIN, RST).
traffic_volume: Volume of traffic in bytes.
flow_duration: Duration of the network flow.
flow_bytes_per_s: Bytes per second for the flow.
flow_packets_per_s: Packets per second for the flow.
packet_count: Number of packets in the flow.
average_packet_size: Average size of packets.
min_packet_size: Minimum packet size.
max_packet_size: Maximum packet size.
packet_size_variance: Variance in packet sizes.
header_length: Length of the packet header.
payload_length: Length of the packet payload.
ip_ttl: Time to live for the IP packet.
tcp_window_size: TCP window size.
icmp_type: ICMP type (echo_request, echo_reply, destination_unreachable).
dns_query_count: Number of DNS queries.
dns_response_count: Number of DNS responses.
http_method: HTTP method (GET, POST, PUT, DELETE).
http_status_code: HTTP status code (200, 404, 500, 301).
content_type: Content type (text/html, application/json, image/png).
ssl_tls_version: SSL/TLS version.
ssl_tls_cipher_suite: SSL/TLS cipher suite.
drone_data.csv
latitude: Latitude of the drone.
longitude: Longitude of the drone.
altitude: Altitude of the drone.
speed: Speed of the drone.
heading: Heading of the drone.
battery_level: Battery level of the drone.
drone_id: Unique identifier for the drone.
flight_time: Total flight time.
signal_strength: Strength of the drone's signal.
temperature: Temperature at the drone's location.
humidity: Humidity at the drone's location.
pressure: Atmospheric pressure at the drone's location.
wind_speed: Wind speed at the drone's location.
wind_direction: Wind direction at the drone's location.
gps_accuracy: Accuracy of the GPS signal.
malware_detection_data.csv
anomaly_score: Score indicating the level of anomaly detected.
suspicious_ip_count: Number of suspicious IP addresses detected.
malicious_payload_indicator: Indicator for malicious payload (0 or 1).
reputation_score: Reputation score for the network entity.
behavioral_score: Behavioral score indicating potential malicious activity.
attack_type: Type of attack (DDoS, phishing, malware).
signature_match: Indicator for signature match (0 or 1).
sandbox_result: Result from sandbox analysis (clean, infected).
heuristic_score: Heuristic score for potential threats.
traffic_pattern: Pattern of the traffic (burst, steady).
environmental_data.csv
location_type: Type of location (urban, rural).
nearby_devices: Number of nearby devices.
signal_interference: Level of signal interference.
noise_level: Noise level in the environment.
time_of_day: Time of day (morning, afternoon, evening, night).
day_of_week: Day of the week.
weather_conditions: Weather conditions (sunny, rainy, cloudy, stormy).
Usage and Applications
This dataset can be used for:
Cybersecurity Research: Developing and testing algorithms for malware detection using drone data.
Machine Learning: Training models to identify malicious activity based on network traffic and drone sensor readings.
Data Analysis: Exploring the relationships between environmental conditions, drone sensor data, and network traffic anomalies.
Educational Purposes: Teaching data science, machine learning, and cybersecurity concepts using a comprehensive and multi-faceted dataset.
Acknowledgements This dataset is based on real-world data collected from drone sensors and network traffic monitoring s...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset was collected using honeypots deployed with the Honeytrap agent. The honeypots captured both benign and malicious network traffic, providing valuable insights into different attack behaviors. The dataset consists of 9 features that represent various aspects of network traffic, including both structural and payload data. These features are as follows:
This dataset was used to train machine learning models to classify the network traffic as either benign or malicious. The features provide valuable information to differentiate between normal communication and suspicious activities, such as potential cyber-attacks.
This dataset was meticulously captured for the analysis and detection of cyberattacks using machine learning techniques. It comprises 100,000 rows, each representing a unique cyberattack event. The dataset includes a diverse range of attack types, protocols, and affected systems, making it an invaluable resource for developing and testing detection models.
YYYY-MM-DD HH:MM:SS. This column helps in analyzing the temporal patterns of attacks and identifying trends over time.
Detected) or not (Not Detected). This binary label is crucial for evaluating the effectiveness of detection models and understanding detection rates.
The dataset introduces a realistic element by including null values in various columns. This simulates real-world data imperfections and prepares the dataset for more robust handling and preprocessing techniques during analysis. The inclusion of unique IP addresses for both source and destination adds to the authenticity, reflecting the diverse nature of cyberattacks in the real world.
Overall, this dataset is a valuable resource for researchers, analysts, and developers working on cybersecurity solutions. It provides a rich, varied, and realistic foundation for developing and testing machine learning models aimed at detecting and mitigating cybe...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We recommend using the CESNET DataZoo python library, which facilitates the work with large network traffic datasets. More information about the DataZoo project can be found in the GitHub repository https://github.com/CESNET/cesnet-datazoo.
The modern approach for network traffic classification (TC), which is an important part of operating and securing networks, is to use machine learning (ML) models that are able to learn intricate relationships between traffic characteristics and communicating applications. A crucial prerequisite is having representative datasets. However, datasets collected from real production networks are not being published in sufficient numbers. Thus, this paper presents a novel dataset, CESNET-TLS-Year22, that captures the evolution of TLS traffic in an ISP network over a year. The dataset contains 180 web service labels and standard TC features, such as packet sequences. The unique year-long time span enables comprehensive evaluation of TC models and assessment of their robustness in the face of the ever-changing environment of production networks.
Data description
The dataset consists of network flows describing encrypted TLS communications. Flows are extended with packet sequences, histograms, and fields extracted from the TLS ClientHello message, which is transmitted in the first packet of the TLS connection handshake. The most important extracted handshake field is the SNI domain, which is used for ground-truth labeling.
Packet Sequences
Sequences of packet sizes, directions, and inter-packet times are standard data input for traffic analysis. For packet sizes, we consider the payload size after transport headers (TCP headers for the TLS case). We omit packets with no TCP payload, for example ACKs, because zero-payload packets are related to the transport layer internals rather than services' behavior. Packet directions are encoded as ±1, where +1 means a packet sent from client to server, and -1 is a packet from server to client. Inter-packet times depend on the location of communicating hosts, their distance, and on the network conditions on the path. However, it is still possible to extract relevant information that correlates with user interactions and, for example, with the time required for an API/server/database to process the received data and generate a response. Packet sequences have a maximum length of 30, which is the default setting of the flow exporter used. We also derive three fields from each packet sequence: its length, time duration, and the number of roundtrips. The roundtrips are counted as the number of changes in the communication direction; in other words, each client request and server response pair counts as one roundtrip.
Flow statistics
Each data record also includes standard flow statistics, representing aggregated information about the entire bidirectional connection. The fields are the number of transmitted bytes and packets in both directions, the duration of the flow, and packet histograms. The packet histograms include binned counts (not limited to the first 30 packets) of packet sizes and inter-packet times in both directions. There are eight bins with a logarithmic scale; the intervals are 0-15, 16-31, 32-63, 64-127, 128-255, 256-511, 512-1024, >1024 [ms or B]. The units are milliseconds for inter-packet times and bytes for packet sizes (more information is available in the PHISTS plugin documentation). Moreover, each flow has an end reason: it either ended with the TCP connection termination (FIN packets), was idle, reached the active timeout, or ended due to other reasons. This corresponds with the official IANA IPFIX-specified values. The FLOW_ENDREASON_OTHER field represents the forced end and lack of resources reasons.
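The eight histogram intervals listed above can be illustrated with a short sketch. This is not the PHISTS plugin's code; the names `UPPER_EDGES`, `bin_index`, and `histogram` are hypothetical, and the edges are taken literally from the intervals in the text (with 1024 falling in the seventh bin).

```python
from bisect import bisect_left

# Inclusive upper edges of the first seven bins, copied from the intervals
# in the text; anything above 1024 falls into the eighth bin.
UPPER_EDGES = [15, 31, 63, 127, 255, 511, 1024]


def bin_index(value):
    """Return the 0-based bin (0..7) for a packet size in bytes
    or an inter-packet time in milliseconds."""
    return bisect_left(UPPER_EDGES, value)


def histogram(values):
    """Count how many values fall into each of the eight bins."""
    counts = [0] * 8
    for value in values:
        counts[bin_index(value)] += 1
    return counts


# Made-up packet sizes, chosen to hit the bin boundaries.
sizes = [0, 15, 16, 100, 511, 512, 1024, 1025, 1500]
print(histogram(sizes))
```

The same function applies to both packet sizes and inter-packet times, since the text gives one set of intervals for both units.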
Dataset structure
The dataset is organized by week and by individual day. The flows are delivered in compressed CSV files. CSV files contain one flow per row; data columns are summarized in the provided list below. For each flow data file, there is a JSON file with the total number of saved flows and the number of flows per service. There are also files aggregating flow counts for each week (stats-week.json) and for the entire dataset (stats-dataset.json). The following list describes flow data fields in CSV files:
ID: Unique identifier
SRC_IP: Source IP address
DST_IP: Destination IP address
DST_ASN: Destination Autonomous System number
SRC_PORT: Source port
DST_PORT: Destination port
PROTOCOL: Transport protocol
FLAG_CWR: Presence of the CWR flag
FLAG_CWR_REV: Presence of the CWR flag in the reverse direction
FLAG_ECE: Presence of the ECE flag
FLAG_ECE_REV: Presence of the ECE flag in the reverse direction
FLAG_URG: Presence of the URG flag
FLAG_URG_REV: Presence of the URG flag in the reverse direction
FLAG_ACK: Presence of the ACK flag
FLAG_ACK_REV: Presence of the ACK flag in the reverse direction
FLAG_PSH: Presence of the PSH flag
FLAG_PSH_REV: Presence of the PSH flag in the reverse direction
FLAG_RST: Presence of the RST flag
FLAG_RST_REV: Presence of the RST flag in the reverse direction
FLAG_SYN: Presence of the SYN flag
FLAG_SYN_REV: Presence of the SYN flag in the reverse direction
FLAG_FIN: Presence of the FIN flag
FLAG_FIN_REV: Presence of the FIN flag in the reverse direction
TLS_SNI: Server Name Indication domain
TLS_JA3: JA3 fingerprint of TLS client
TIME_FIRST: Timestamp of the first packet in format YYYY-MM-DDTHH-MM-SS.ffffff
TIME_LAST: Timestamp of the last packet in format YYYY-MM-DDTHH-MM-SS.ffffff
DURATION: Duration of the flow in seconds
BYTES: Number of transmitted bytes from client to server
BYTES_REV: Number of transmitted bytes from server to client
PACKETS: Number of packets transmitted from client to server
PACKETS_REV: Number of packets transmitted from server to client
PPI: Packet sequence in the format: [[inter-packet times], [packet directions], [packet sizes], [push flags]]
PPI_LEN: Number of packets in the PPI sequence
PPI_DURATION: Duration of the PPI sequence in seconds
PPI_ROUNDTRIPS: Number of roundtrips in the PPI sequence
PHIST_SRC_SIZES: Histogram of packet sizes from client to server
PHIST_DST_SIZES: Histogram of packet sizes from server to client
PHIST_SRC_IPT: Histogram of inter-packet times from client to server
PHIST_DST_IPT: Histogram of inter-packet times from server to client
APP: Web service label
CATEGORY: Service category
FLOW_ENDREASON_IDLE: Flow was terminated because it was idle
FLOW_ENDREASON_ACTIVE: Flow was terminated because it reached the active timeout
FLOW_ENDREASON_END: Flow ended with the TCP connection termination
FLOW_ENDREASON_OTHER: Flow was terminated for other reasons
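As a rough illustration of the PPI field described in the list above, the following sketch splits a PPI value into its four per-packet lists and counts direction changes. It assumes the CSV cell is a JSON-compatible nested list; the example value and the helper names `parse_ppi` and `direction_changes` are hypothetical. The dataset's own PPI_ROUNDTRIPS counting may differ from this simple direction-change count.

```python
import json


def parse_ppi(ppi_text):
    """Split a PPI string into its four parallel per-packet lists."""
    ipt, directions, sizes, push_flags = json.loads(ppi_text)
    if not (len(ipt) == len(directions) == len(sizes) == len(push_flags)):
        raise ValueError("PPI sub-lists differ in length")
    return {
        "ipt": ipt,
        "directions": directions,
        "sizes": sizes,
        "push_flags": push_flags,
    }


def direction_changes(directions):
    """Count positions where the packet direction flips between +1 and -1."""
    return sum(1 for prev, cur in zip(directions, directions[1:]) if prev != cur)


# Hypothetical PPI value with four packets (not taken from the dataset).
example = "[[0, 12, 3, 40], [1, -1, -1, 1], [517, 1400, 200, 80], [1, 0, 1, 1]]"
ppi = parse_ppi(example)
ppi_len = len(ppi["sizes"])                      # comparable to PPI_LEN
changes = direction_changes(ppi["directions"])   # direction flips in the sequence
print(ppi_len, changes)
```

If the cells turn out not to be valid JSON (for example, a different bracket or separator style), the parsing step would need to be adapted; the per-packet structure stays the same.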
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the files produced by capturing VR traffic over Wi-Fi 6 with a fork of the Air Light Virtual Reality (ALVR) software, which is used to stream games from a PC to a VR HMD in real time. The dataset includes:
Parsed Wireshark captures in CSV format, captured at both the server and the network emulator, together with the corresponding ALVR session log for each experiment. Each folder contains all netem, server, and ALVR files, named after the emulated network effect, which is applied via the netem computer. Each test uses Constant BitRate (CBR) at 100 Mbps. The folder for each effect also contains the plots and a metric comparison between WS and ALVR.
ALVR session logs for a comparison of the logged metrics under tests of Mobility, using different strategies for bitrate adaptation: CBR, ABR and our own contribution.
ALVR session logs for a comparison of the logged metrics under tests of emulated capacity drops, using different strategies for bitrate adaptation: CBR, ABR and our own contribution.
The Wireshark captures have been parsed (via tshark) from the UDP packets in a pcapng file into a CSV file containing the principal fields of each packet, separated by a space. Since the pcap captures were over 1 GB each, we keep only the main fields and the first bytes of the payload, and discard the rest. There are additional CSV files for TCP UL packets, which we parsed separately from the same captures to validate the RTT measured by ALVR.
The ALVR session logs contain raw JSON strings in .txt format, logged from the server using our fork of ALVR. In addition to the events ALVR originally used, we log some additional events so that our metrics can be recorded at arbitrary points in the code.
The first 22 bytes of the payload of each packet are parsed into the StreamSocket fields that ALVR uses, and timestamps are recorded to validate the ALVR metrics manually; this can be used to reproduce our results. Namely, each row of the CSV (frame.time_relative, ip.src, ip.dst, frame.len, data.data) contains the timestamp of the packet, its source IP, destination IP, length, and the first 22 bytes of the payload as a hexadecimal string.
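A minimal sketch of reading one such row is shown below, assuming five space-separated fields in the order given above. The `PacketRow` type, the `parse_row` helper, and the sample row are hypothetical. The layout of the StreamSocket fields inside the 22 bytes is not described here, so the sketch only converts the hexadecimal string to raw bytes.

```python
from typing import NamedTuple


class PacketRow(NamedTuple):
    time_relative: float    # frame.time_relative
    ip_src: str             # ip.src
    ip_dst: str             # ip.dst
    frame_len: int          # frame.len
    payload_prefix: bytes   # data.data, decoded from hex


def parse_row(line):
    """Parse one space-separated row of the parsed capture CSV."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    time_relative, ip_src, ip_dst, frame_len, hex_payload = parts
    # Some tshark versions print data.data with ':' between bytes; drop them.
    payload = bytes.fromhex(hex_payload.replace(":", ""))
    return PacketRow(float(time_relative), ip_src, ip_dst, int(frame_len), payload)


# Made-up row: addresses, length, and payload bytes are placeholders.
row = parse_row("0.001234 192.168.1.10 192.168.1.20 1442 " + "00" * 22)
print(row.time_relative, row.frame_len, len(row.payload_prefix))
```

Decoding `payload_prefix` into individual StreamSocket fields would require the field layout from the ALVR fork itself.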
https://choosealicense.com/licenses/gpl/
Dataset Card: Africa Brain Tumor Scans with Synthetic EHR (Bundled Parquet)
This dataset bundles single-slice brain MRI scans and richly structured, synthetic EHR data into a single Parquet file suitable for multimodal ML research. Each row contains an image struct (bytes + path), a source label column, and an EHR payload with both a full JSON record and convenient summary columns. The synthetic EHRs are Africa-focused: they encode country, urban/rural, facility level, insurance… See the full description on the dataset page: https://huggingface.co/datasets/electricsheepafrica/brain-tumor-single-slice-MRI-scan-with-synthetic-ehr-africa.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The INDDOS24 Dataset is a comprehensive and synthetic dataset designed for analyzing Distributed Denial of Service (DDoS) attacks in Internet of Things (IoT) networks. The dataset spans a period from January 1, 2019, to July 1, 2024, capturing hourly network traffic from various IoT devices, including cameras, sensors, and smart appliances. This dataset simulates realistic traffic dynamics, including both normal operations and attack scenarios, providing researchers and practitioners with a rich resource to develop and evaluate machine learning and deep learning-based DDoS detection models.
Key Features of INDDOS24 Dataset
Timestamp: The date and time of each network event, recorded hourly, covering more than five years of traffic data.
Source IP: The IP address from which the network traffic originates, representing the source device in the network.
Destination IP: The IP address to which the network traffic is directed, representing the target device.
Source Port: The port number used by the source device for communication.
Destination Port: The port number used by the target device for receiving traffic.
Protocol: The communication protocol used, including TCP, UDP, and ICMP.
Packet Size: The size of each network packet in bytes, ranging from small control packets to large data transmissions.
Payload Length: The length of the payload in the network packets, representing actual data being transmitted.
Flow Duration: The duration of the network flow in seconds, capturing the session length between devices.
Bytes in Flow: The total number of bytes transmitted during the flow.
Packets in Flow: The total number of packets transmitted during the flow.
Average Packet Size: The average size of packets within a flow, useful for distinguishing attack patterns from normal traffic.
Inter-Arrival Time: The time interval between successive packets in a flow, capturing traffic burstiness.
Rate of Packets: The rate of packets per second, highlighting high-rate traffic scenarios typical of DDoS attacks.
Unique Source Count: The number of unique source IP addresses observed in the flow.
Unique Destination Count: The number of unique destination IP addresses in the flow.
Anomaly Score: A computed score indicating the likelihood of anomalous or malicious activity within the traffic.
Device Type: The type of IoT device generating the traffic, such as Camera, Sensor, or Smart Appliance.
Operating System: The operating system of the IoT device, including Linux, Windows, or RTOS (Real-Time Operating System).
Firmware Version: The firmware version running on the device, reflecting device configuration.
Attack Type: The type of attack, if detected, including "SYN Flood," "UDP Flood," "Application Layer Attack," or "No Attack."
Attack Duration: The duration of the detected attack in seconds, where applicable.
Target Device: The specific device targeted by the attack, if applicable, or "None" for normal traffic.
Labels: Multi-label annotations for each record, indicating attack types or normal traffic. Labels are unbalanced to simulate real-world distributions, with "Normal" traffic dominating.
Key Highlights
Multi-Label Annotations: Each record can have multiple labels to capture complex scenarios where different attack types may occur simultaneously.
Realistic Traffic Simulation: The dataset reflects both the prevalence of normal traffic and the intermittent nature of DDoS attacks in IoT environments.
Diverse Features: With over 20 features, the dataset supports detailed traffic analysis and the development of robust anomaly detection systems.
Unbalanced Distribution: Mimics real-world IoT networks where normal traffic significantly outweighs malicious activities.
The INDDOS24 dataset serves as a valuable resource for advancing IoT network security, particularly in detecting and mitigating DDoS attacks. It is suitable for researchers, data scientists, and engineers developing machine learning and deep learning-based models for intrusion detection and network anomaly analysis.
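Several of the flow features listed above are related by simple arithmetic. The sketch below shows one plausible way to derive an average packet size and a packet rate from flow totals. How INDDOS24 itself computes these columns is not documented here, so the formulas and the `flow_rates` helper are assumptions for illustration only.

```python
def flow_rates(bytes_in_flow, packets_in_flow, flow_duration_s):
    """Derive (average packet size in bytes, packets per second) from flow totals.

    Returns 0.0 for a quantity whose denominator is zero, so that empty or
    zero-duration flows do not raise ZeroDivisionError.
    """
    avg_packet_size = bytes_in_flow / packets_in_flow if packets_in_flow else 0.0
    packet_rate = packets_in_flow / flow_duration_s if flow_duration_s > 0 else 0.0
    return avg_packet_size, packet_rate


# Made-up flow: 120,000 bytes in 800 packets over 4 seconds.
avg_size, rate = flow_rates(120_000, 800, 4.0)
print(avg_size, rate)
```

Comparing such derived values with the dataset's own Average Packet Size and Rate of Packets columns is one way to check how those columns were generated.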
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please refer to the original data article for further data description: Jan Luxemburk et al. CESNET-QUIC22: A large one-month QUIC network traffic dataset from backbone lines, Data in Brief, 2023, 108888, ISSN 2352-3409, https://doi.org/10.1016/j.dib.2023.108888.
We recommend using the CESNET DataZoo python library, which facilitates the work with large network traffic datasets. More information about the DataZoo project can be found in the GitHub repository https://github.com/CESNET/cesnet-datazoo.
The QUIC (Quick UDP Internet Connection) protocol has the potential to replace TLS over TCP, which is the standard choice for reliable and secure Internet communication. Due to its design that makes the inspection of QUIC handshakes challenging and its usage in HTTP/3, there is an increasing demand for research in QUIC traffic analysis. This dataset contains one month of QUIC traffic collected in an ISP backbone network, which connects 500 large institutions and serves around half a million people. The data are delivered as enriched flows that can be useful for various network monitoring tasks. The provided server names and packet-level information allow research in the encrypted traffic classification area. Moreover, included QUIC versions and user agents (smartphone, web browser, and operating system identifiers) provide information for large-scale QUIC deployment studies.
Data capture
The data was captured in the flow monitoring infrastructure of the CESNET2 network. The capturing was done for four weeks between 31.10.2022 and 27.11.2022. The following list provides per-week flow count, capture period, and uncompressed size:
W-2022-44: Uncompressed Size: 19 GB; Capture Period: 31.10.2022 - 6.11.2022; Number of flows: 32.6M
W-2022-45: Uncompressed Size: 25 GB; Capture Period: 7.11.2022 - 13.11.2022; Number of flows: 42.6M
W-2022-46: Uncompressed Size: 20 GB; Capture Period: 14.11.2022 - 20.11.2022; Number of flows: 33.7M
W-2022-47: Uncompressed Size: 25 GB; Capture Period: 21.11.2022 - 27.11.2022; Number of flows: 44.1M
CESNET-QUIC22: Uncompressed Size: 89 GB; Capture Period: 31.10.2022 - 27.11.2022; Number of flows: 153M
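As a quick consistency check, the per-week figures above can be summed and compared with the totals given for the whole dataset. The dictionary below simply restates the listed numbers; the variable names are hypothetical.

```python
# Per-week figures restated from the list above (sizes in GB, flows in millions).
weeks = {
    "W-2022-44": {"size_gb": 19, "flows_m": 32.6},
    "W-2022-45": {"size_gb": 25, "flows_m": 42.6},
    "W-2022-46": {"size_gb": 20, "flows_m": 33.7},
    "W-2022-47": {"size_gb": 25, "flows_m": 44.1},
}

total_size_gb = sum(week["size_gb"] for week in weeks.values())
# Round to one decimal place to match the precision of the listed values.
total_flows_m = round(sum(week["flows_m"] for week in weeks.values()), 1)

print(total_size_gb, total_flows_m)
```

Both sums agree with the listed dataset totals of 89 GB and 153M flows.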
Data description
The dataset consists of network flows describing encrypted QUIC communications. Flows were created using ipfixprobe flow exporter and are extended with packet metadata sequences, packet histograms, and with fields extracted from the QUIC Initial Packet, which is the first packet of the QUIC connection handshake. The extracted handshake fields are the Server Name Indication (SNI) domain, the used version of the QUIC protocol, and the user agent string that is available in a subset of QUIC communications.
Packet Sequences
Flows in the dataset are extended with sequences of packet sizes, directions, and inter-packet times. For the packet sizes, we consider payload size after transport headers (UDP headers for the QUIC case). Packet directions are encoded as ±1, +1 meaning a packet sent from client to server, and -1 a packet from server to client. Inter-packet times depend on the location of communicating hosts, their distance, and on the network conditions on the path. However, it is still possible to extract relevant information that correlates with user interactions and, for example, with the time required for an API/server/database to process the received data and generate the response to be sent in the next packet. Packet metadata sequences have a length of 30, which is the default setting of the used flow exporter. We also derive three fields from each packet sequence: its length, time duration, and the number of roundtrips. The roundtrips are counted as the number of changes in the communication direction (from packet directions data); in other words, each client request and server response pair counts as one roundtrip.
Flow statistics
Flows also include standard flow statistics, which represent aggregated information about the entire bidirectional flow. The fields are: the number of transmitted bytes and packets in both directions, the duration of flow, and packet histograms. Packet histograms include binned counts of packet sizes and inter-packet times of the entire flow in both directions (more information in the PHISTS plugin documentation). There are eight bins with a logarithmic scale; the intervals are 0-15, 16-31, 32-63, 64-127, 128-255, 256-511, 512-1024, >1024 [ms or B]. The units are milliseconds for inter-packet times and bytes for packet sizes. Moreover, each flow has its end reason - either it was idle, reached the active timeout, or ended due to other reasons. This corresponds with the official IANA IPFIX-specified values. The FLOW_ENDREASON_OTHER field represents the forced end and lack of resources reasons. The end of flow detected reason is not considered because it is not relevant for UDP connections.
Dataset structure
The dataset flows are delivered in compressed CSV files. CSV files contain one flow per row; data columns are summarized in the provided list below. For each flow data file, there is a JSON file with the number of saved and seen (before sampling) flows per service and total counts of all received (observed on the CESNET2 network), service (belonging to one of the dataset's services), and saved (provided in the dataset) flows. There is also the stats-week.json file aggregating flow counts of a whole week and the stats-dataset.json file aggregating flow counts for the entire dataset. Flow counts before sampling can be used to compute sampling ratios of individual services and to resample the dataset back to the original service distribution. Moreover, various dataset statistics, such as feature distributions and value counts of QUIC versions and user agents, are provided in the dataset-statistics folder. The mapping between services and service providers is provided in the servicemap.csv file, which also includes SNI domains used for ground truth labeling. The following list describes flow data fields in CSV files:
ID: Unique identifier
SRC_IP: Source IP address
DST_IP: Destination IP address
DST_ASN: Destination Autonomous System number
SRC_PORT: Source port
DST_PORT: Destination port
PROTOCOL: Transport protocol
QUIC_VERSION: QUIC protocol version
QUIC_SNI: Server Name Indication domain
QUIC_USER_AGENT: User agent string, if available in the QUIC Initial Packet
TIME_FIRST: Timestamp of the first packet in format YYYY-MM-DDTHH-MM-SS.ffffff
TIME_LAST: Timestamp of the last packet in format YYYY-MM-DDTHH-MM-SS.ffffff
DURATION: Duration of the flow in seconds
BYTES: Number of transmitted bytes from client to server
BYTES_REV: Number of transmitted bytes from server to client
PACKETS: Number of packets transmitted from client to server
PACKETS_REV: Number of packets transmitted from server to client
PPI: Packet metadata sequence in the format: [[inter-packet times], [packet directions], [packet sizes]]
PPI_LEN: Number of packets in the PPI sequence
PPI_DURATION: Duration of the PPI sequence in seconds
PPI_ROUNDTRIPS: Number of roundtrips in the PPI sequence
PHIST_SRC_SIZES: Histogram of packet sizes from client to server
PHIST_DST_SIZES: Histogram of packet sizes from server to client
PHIST_SRC_IPT: Histogram of inter-packet times from client to server
PHIST_DST_IPT: Histogram of inter-packet times from server to client
APP: Web service label
CATEGORY: Service category
FLOW_ENDREASON_IDLE: Flow was terminated because it was idle
FLOW_ENDREASON_ACTIVE: Flow was terminated because it reached the active timeout
FLOW_ENDREASON_OTHER: Flow was terminated for other reasons
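The dataset structure description mentions that flow counts before sampling can be used to compute per-service sampling ratios and to resample back to the original service distribution. A minimal sketch of that calculation follows. The input dictionaries, service names, and helper names are hypothetical and do not reflect the actual layout of the JSON files.

```python
def sampling_ratios(seen, saved):
    """Fraction of each service's observed flows that was kept in the dataset."""
    return {app: saved[app] / seen[app] for app in saved if seen.get(app)}


def resampling_weights(seen, saved):
    """Per-flow weight that restores the original (pre-sampling) service mix."""
    return {app: seen[app] / saved[app] for app in saved if saved[app]}


# Hypothetical per-service counts; real values come from the per-file JSON.
seen = {"service-a": 1000, "service-b": 200}
saved = {"service-a": 100, "service-b": 200}

ratios = sampling_ratios(seen, saved)
weights = resampling_weights(seen, saved)
print(ratios)
print(weights)
```

In this made-up example, each saved flow of service-a stands for ten observed flows, so weighting (or sampling with replacement in proportion to the weights) would recover the pre-sampling proportions.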
Link to other CESNET datasets
https://www.liberouter.org/technology-v2/tools-services-datasets/datasets/
https://github.com/CESNET/cesnet-datazoo
Please cite the original data article:
@article{CESNETQUIC22,
  author = {Jan Luxemburk and Karel Hynek and Tomáš Čejka and Andrej Lukačovič and Pavel Šiška},
  title = {CESNET-QUIC22: a large one-month QUIC network traffic dataset from backbone lines},
  journal = {Data in Brief},
  pages = {108888},
  year = {2023},
  issn = {2352-3409},
  doi = {https://doi.org/10.1016/j.dib.2023.108888},
  url = {https://www.sciencedirect.com/science/article/pii/S2352340923000069}
}
DLC values