9 datasets found

Event Log Sampling Datasets
figshare.com
zip
Updated Jul 22, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
CONG LIU (2022). Event Log Sampling Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.20354505.v1
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.20354505.v1
Dataset updated
Jul 22, 2022
Dataset provided by
figshare
Authors
CONG LIU
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This datasets includes 9 event logs, which can be used to experiment with log completeness-oriented event log sampling methods.

· exercise.xes: The dataset is a simulation log generated by the paper review process model, and each trace clearly describes the process of reviewing papers in detail.

· training_log_1/3/8.xes: These 3 datasets are human-trained simulation logs for the 2016 Process Discovery Competition (PDC 2016). Each trace consists of two values, the name of the process model activity referenced by the event and the identifier of the case to which the event belongs.

· Production.xes: This dataset includes process data from production processes, and each track includes data for cases, activities, resources, timestamps, and more data fields.

· BPIC_2012_A/O/W.xes: These 3 dataset are derived from the personal loan application process of a financial institution in the Netherlands. The process represented in the event log is the application process of a personal loan or overdraft in a global financing organization. Each trace describes the process of applying for a personal loan for different customers.

· CrossHospital.xes: The dataset includes the treatment process data of emergency patients in the hospital, and each track represents the treatment process of an emergency patient in the hospital.
AIT Log Data Set V2.0
zenodo.org
data.niaid.nih.gov
+1more
zip
Updated Jun 28, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Max Landauer; Florian Skopik; Maximilian Frank; Wolfgang Hotwagner; Markus Wurzenberger; Andreas Rauber; Max Landauer; Florian Skopik; Maximilian Frank; Wolfgang Hotwagner; Markus Wurzenberger; Andreas Rauber (2024). AIT Log Data Set V2.0 [Dataset]. http://doi.org/10.5281/zenodo.5789064
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.5789064
Dataset updated
Jun 28, 2024
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Max Landauer; Florian Skopik; Maximilian Frank; Wolfgang Hotwagner; Markus Wurzenberger; Andreas Rauber; Max Landauer; Florian Skopik; Maximilian Frank; Wolfgang Hotwagner; Markus Wurzenberger; Andreas Rauber
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
AIT Log Data Sets

This repository contains synthetic log data suitable for evaluation of intrusion detection systems, federated learning, and alert aggregation. A detailed description of the dataset is available in [1]. The logs were collected from eight testbeds that were built at the Austrian Institute of Technology (AIT) following the approach by [2]. Please cite these papers if the data is used for academic publications.

In brief, each of the datasets corresponds to a testbed representing a small enterprise network including mail server, file share, WordPress server, VPN, firewall, etc. Normal user behavior is simulated to generate background noise over a time span of 4-6 days. At some point, a sequence of attack steps is launched against the network. Log data is collected from all hosts and includes Apache access and error logs, authentication logs, DNS logs, VPN logs, audit logs, Suricata logs, network traffic packet captures, horde logs, exim logs, syslog, and system monitoring logs. Separate ground truth files are used to label events that are related to the attacks. Compared to the AIT-LDSv1.1, a more complex network and diverse user behavior is simulated, and logs are collected from all hosts in the network. If you are only interested in network traffic analysis, we also provide the AIT-NDS containing the labeled netflows of the testbed networks. We also provide the AIT-ADS, an alert data set derived by forensically applying open-source intrusion detection systems on the log data.

The datasets in this repository have the following structure:

The gather directory contains all logs collected from the testbed. Logs collected from each host are located in gather/.

The labels directory contains the ground truth of the dataset that indicates which events are related to attacks. The directory mirrors the structure of the gather directory so that each label files is located at the same path and has the same name as the corresponding log file. Each line in the label files references the log event corresponding to an attack by the line number counted from the beginning of the file ("line"), the labels assigned to the line that state the respective attack step ("labels"), and the labeling rules that assigned the labels ("rules"). An example is provided below.

The processing directory contains the source code that was used to generate the labels.

The rules directory contains the labeling rules.

The environment directory contains the source code that was used to deploy the testbed and run the simulation using the Kyoushi Testbed Environment.

The dataset.yml file specifies the start and end time of the simulation.

The following table summarizes relevant properties of the datasets:

fox

Simulation time: 2022-01-15 00:00 - 2022-01-20 00:00

Attack time: 2022-01-18 11:59 - 2022-01-18 13:15

Scan volume: High

Unpacked size: 26 GB

harrison

Simulation time: 2022-02-04 00:00 - 2022-02-09 00:00

Attack time: 2022-02-08 07:07 - 2022-02-08 08:38

Scan volume: High

Unpacked size: 27 GB

russellmitchell

Simulation time: 2022-01-21 00:00 - 2022-01-25 00:00

Attack time: 2022-01-24 03:01 - 2022-01-24 04:39

Scan volume: Low

Unpacked size: 14 GB

santos

Simulation time: 2022-01-14 00:00 - 2022-01-18 00:00

Attack time: 2022-01-17 11:15 - 2022-01-17 11:59

Scan volume: Low

Unpacked size: 17 GB

shaw

Simulation time: 2022-01-25 00:00 - 2022-01-31 00:00

Attack time: 2022-01-29 14:37 - 2022-01-29 15:21

Scan volume: Low

Data exfiltration is not visible in DNS logs

Unpacked size: 27 GB

wardbeck

Simulation time: 2022-01-19 00:00 - 2022-01-24 00:00

Attack time: 2022-01-23 12:10 - 2022-01-23 12:56

Scan volume: Low

Unpacked size: 26 GB

wheeler

Simulation time: 2022-01-26 00:00 - 2022-01-31 00:00

Attack time: 2022-01-30 07:35 - 2022-01-30 17:53

Scan volume: High

No password cracking in attack chain

Unpacked size: 30 GB

wilson

Simulation time: 2022-02-03 00:00 - 2022-02-09 00:00

Attack time: 2022-02-07 10:57 - 2022-02-07 11:49

Scan volume: High

Unpacked size: 39 GB

The following attacks are launched in the network:

Scans (nmap, WPScan, dirb)

Webshell upload (CVE-2020-24186)

Password cracking (John the Ripper)

Privilege escalation

Remote command execution

Data exfiltration (DNSteal)

Note that attack parameters and their execution orders vary in each dataset. Labeled log files are trimmed to the simulation time to ensure that their labels (which reference the related event by the line number in the file) are not misleading. Other log files, however, also contain log events generated before or after the simulation time and may therefore be affected by testbed setup or data collection. It is therefore recommended to only consider logs with timestamps within the simulation time for analysis.

The structure of labels is explained using the audit logs from the intranet server in the russellmitchell data set as an example in the following. The first four labels in the labels/intranet_server/logs/audit/audit.log file are as follows:

{"line": 1860, "labels": ["attacker_change_user", "escalate"], "rules": {"attacker_change_user": ["attacker.escalate.audit.su.login"], "escalate": ["attacker.escalate.audit.su.login"]}}

{"line": 1861, "labels": ["attacker_change_user", "escalate"], "rules": {"attacker_change_user": ["attacker.escalate.audit.su.login"], "escalate": ["attacker.escalate.audit.su.login"]}}

{"line": 1862, "labels": ["attacker_change_user", "escalate"], "rules": {"attacker_change_user": ["attacker.escalate.audit.su.login"], "escalate": ["attacker.escalate.audit.su.login"]}}

{"line": 1863, "labels": ["attacker_change_user", "escalate"], "rules": {"attacker_change_user": ["attacker.escalate.audit.su.login"], "escalate": ["attacker.escalate.audit.su.login"]}}

Each JSON object in this file assigns a label to one specific log line in the corresponding log file located at gather/intranet_server/logs/audit/audit.log. The field "line" in the JSON objects specify the line number of the respective event in the original log file, while the field "labels" comprise the corresponding labels. For example, the lines in the sample above provide the information that lines 1860-1863 in the gather/intranet_server/logs/audit/audit.log file are labeled with "attacker_change_user" and "escalate" corresponding to the attack step where the attacker receives escalated privileges. Inspecting these lines shows that they indeed correspond to the user authenticating as root:

type=USER_AUTH msg=audit(1642999060.603:2226): pid=27950 uid=33 auid=4294967295 ses=4294967295 msg='op=PAM:authentication acct="jhall" exe="/bin/su" hostname=? addr=? terminal=/dev/pts/1 res=success'

type=USER_ACCT msg=audit(1642999060.603:2227): pid=27950 uid=33 auid=4294967295 ses=4294967295 msg='op=PAM:accounting acct="jhall" exe="/bin/su" hostname=? addr=? terminal=/dev/pts/1 res=success'

type=CRED_ACQ msg=audit(1642999060.615:2228): pid=27950 uid=33 auid=4294967295 ses=4294967295 msg='op=PAM:setcred acct="jhall" exe="/bin/su" hostname=? addr=? terminal=/dev/pts/1 res=success'

type=USER_START msg=audit(1642999060.627:2229): pid=27950 uid=33 auid=4294967295 ses=4294967295 msg='op=PAM:session_open acct="jhall" exe="/bin/su" hostname=? addr=? terminal=/dev/pts/1 res=success'

The same applies to all other labels for this log file and all other log files. There are no labels for logs generated by "normal" (i.e., non-attack) behavior; instead, all log events that have no corresponding JSON object in one of the files from the labels directory, such as the lines 1-1859 in the example above, can be considered to be labeled as "normal". This means that in order to figure out the labels for the log data it is necessary to store the line numbers when processing the original logs from the gather directory and see if these line numbers also appear in the corresponding file in the labels directory.

Beside the attack labels, a general overview of the exact times when specific attack steps are launched are available in gather/attacker_0/logs/attacks.log. An enumeration of all hosts and their IP addresses is stated in processing/config/servers.yml. Moreover, configurations of each host are provided in gather/ and gather/.

Version history:

AIT-LDS-v1.x: Four datasets, logs from single host, fine-granular audit logs, mail/CMS.

AIT-LDS-v2.0: Eight datasets, logs from all hosts, system logs and network traffic, mail/CMS/cloud/web.

Acknowledgements: Partially funded by the FFG projects INDICAETING (868306) and DECEPT (873980), and the EU projects GUARD (833456) and PANDORA (SI2.835928).

If you use the dataset, please cite the following publications:

[1] M. Landauer, F. Skopik, M. Frank, W. Hotwagner,
Utah FORGE: Well 58-32 Schlumberger FMI Logs DLIS and XML files
gdr.openei.org
data.openei.org
+2more
archive, data +1
Updated Nov 17, 2017
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Greg Nash; Joe Moore; Greg Nash; Joe Moore (2017). Utah FORGE: Well 58-32 Schlumberger FMI Logs DLIS and XML files [Dataset]. http://doi.org/10.15121/1464529
Explore at:
archive, website, dataAvailable download formats
Unique identifier
https://doi.org/10.15121/1464529
Dataset updated
Nov 17, 2017
Dataset provided by
United States Department of Energyhttp://energy.gov/
Geothermal Data Repository
Energy and Geoscience Institute at the University of Utah
Authors
Greg Nash; Joe Moore; Greg Nash; Joe Moore
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This zipped data set includes Schlumberger FMI logs DLIS and XML files from Utah FORGE deep well 58-32. These include runs 1 (2226-7550 ft) and 2 (7440-7550 ft). Run 3 (7390-7527ft) was acquired during phase 2C.
f
Event Logs CSV
figshare.com
rar
Updated Dec 9, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Dina Bayomie (2019). Event Logs CSV [Dataset]. http://doi.org/10.6084/m9.figshare.11342063.v1
Explore at:
rarAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.11342063.v1
Dataset updated
Dec 9, 2019
Dataset provided by
figshare
Authors
Dina Bayomie
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The event logs in CSV format. The dataset contains both correlated and uncorrelated logs
f
OpenStack log files
figshare.com
zip
Updated Nov 16, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mbasa MOLO; Victor Akande; Nzanzu Vingi Patrick; Joke Badejo; Emmanuel Adetiba (2021). OpenStack log files [Dataset]. http://doi.org/10.6084/m9.figshare.17025353.v1
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.17025353.v1
Dataset updated
Nov 16, 2021
Dataset provided by
figshare
Authors
Mbasa MOLO; Victor Akande; Nzanzu Vingi Patrick; Joke Badejo; Emmanuel Adetiba
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains log information of a cloud computing infrastructure based on OpenStack.Three different files are available, including the nova, cinder, and glance log files. Due to the fact that the data is unbalanced, a CSV file containing log information of the three OpenStack applications is provided. This can be used for testing in case the log files are used for a machine learning purpose. These data were collected from the Federated Genominc (FEDGEN) cloud computing infrastructure hosted in Covenant Unversity under the Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE) project funded by the World Bank.
LoRaWAN Traffic Analysis Dataset
zenodo.org
zip
Updated Aug 28, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ales Povalac; Ales Povalac; Jan Kral; Jan Kral (2023). LoRaWAN Traffic Analysis Dataset [Dataset]. http://doi.org/10.5281/zenodo.7919213
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.7919213
Dataset updated
Aug 28, 2023
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Ales Povalac; Ales Povalac; Jan Kral; Jan Kral
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was created by a LoRaWAN sniffer and contains packets, which are thoroughly analyzed in the paper Exploring LoRaWAN Traffic: In-Depth Analysis of IoT Network Communications (not yet published). Data from the LoRaWAN sniffer was collected in four cities: Liege (Belgium), Graz (Austria), Vienna (Austria), and Brno (Czechia).

Gateway ID: b827ebafac000001

Uplink reception (end-device => gateway)

Only packets containing CRC, inverted IQ

RX0: 867.1 MHz, 867.3 MHz, 867.5 MHz, 867.7 MHz, 867.9 MHz - BW 125 kHz and all SF

RX1: 868.1 MHz, 868.3 MHz, 868.5 MHz - BW 125 kHz and all SF

Gateway ID: b827ebafac000002

Downlink reception (gateway => end-device)

Includes packets without CRC, non-inverted IQ

RX0: 867.1 MHz, 867.3 MHz, 867.5 MHz, 867.7 MHz, 867.9 MHz - BW 125 kHz and all SF

RX1: 868.1 MHz, 868.3 MHz, 868.5 MHz - BW 125 kHz and all SF

Gateway ID: b827ebafac000003

Downlink reception (gateway => end-device) and Class-B beacon on 869.525 MHz

Includes packets without CRC, non-inverted IQ

RX0: 869.525 MHz - BW 125 kHz and all SF, BW 125 kHz and SF9 with implicit header, CR 4/5 and length 17 B

To open the pcap files, you need Wireshark with current support for LoRaTap and LoRaWAN protocols. This support will be available in the official 4.1.0 release. A working version for Windows is accessible in the automated build system.

The source data is available in the log.zip file, which contains the complete dataset obtained by the sniffer. A set of conversion tools for log processing is available on Github. The converted logs, available in Wireshark format, are stored in pcap.zip. For the LoRaWAN decoder, you can use the attached root and session keys. The processed outputs are stored in csv.zip, and graphical statistics are available in png.zip.

This data represents a unique, geographically identifiable selection from the full log, cleaned of any errors. The records from Brno include communication between the gateway and a node with known keys.

Test file :: 00_Test

short test file for parser verification

comparison of LoRaTap version 0 and version 1 formats

Brno, Czech Republic :: 01_Brno

49.22685N, 16.57536E, ASL 306m

lines 150873 to 529796

time 1.8.2022 15:04:28 to 17.8.2022 13:05:32

preliminary experiment

experimental device

Device EUI: 70b3d5cee0000042

Application key: d494d49a7b4053302bdcf96f1defa65a

Device address: 00d85395

Network session key: c417540b8b2afad8930c82fcf7ea54bb

Application session key: 421fea9bedd2cc497f63303edf5adf8e

Liege, Belgium :: 02_Liege :: evaluated in the paper

50.66445N, 5.59276E, ASL 151m

lines 636205 to 886868

time 25.8.2022 10:12:24 to 12.9.2022 06:20:48

Brno, Czech Republic :: 03_Brno_join

49.22685N, 16.57536E, ASL 306m

lines 947787 to 979382

time 30.9.2022 15:21:27 to 4.10.2022 10:46:31

record contains OTAA activation (Join Request / Join Accept)

experimental device:

Device EUI: 70b3d5cee0000042

Application key: d494d49a7b4053302bdcf96f1defa65a

Device address: 01e65ddc

Network session key: e2898779a03de59e2317b149abf00238

Application session key: 59ca1ac91922887093bc7b236bd1b07f

Graz, Austria :: 04_Graz :: evaluated in the paper

47.07049N, 15.44506E, ASL 364m

lines 1015139 to 1178855

time 26.10.2022 06:21:07 to 29.11.2022 10:03:00

Vienna, Austria :: 05_Wien :: evaluated in the paper

48.19666N, 16.37101E, ASL 204m

lines 1179308 to 3657105

time 1.12.2022 10:42:19 to 4.1.2023 14:00:05

contains a total of 14 short restarts (under 90 seconds)

Brno, Czech Republic :: 07_Brno :: evaluated in the paper

49.22685N, 16.57536E, ASL 306m

lines 4969648 to 6919392

time 16.2.2023 8:53:43 to 30.3.2023 9:00:11
n
Respiration_chambers/raw_log_files and combined datasets of biomass and...
cmr.earthdata.nasa.gov
researchdata.edu.au
+1more
Updated Dec 18, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2018). Respiration_chambers/raw_log_files and combined datasets of biomass and chamber data, and physical parameters [Dataset]. http://doi.org/10.26179/5c1827d5d6711
Explore at:
Unique identifier
https://doi.org/10.26179/5c1827d5d6711
Dataset updated
Dec 18, 2018
Time period covered
Jan 27, 2015 - Feb 23, 2015
Area covered
Description
General overview The following datasets are described by this metadata record, and are available for download from the provided URL.

Raw log files, physical parameters raw log files

Raw excel files, respiration/PAM chamber raw excel spreadsheets

Processed and cleaned excel files, respiration chamber biomass data

Raw rapid light curve excel files (this is duplicated from Raw log files), combined dataset pH, temperature, oxygen, salinity, velocity for experiment

Associated R script file for pump cycles of respirations chambers

####

Physical parameters raw log files

Raw log files 1) DATE= 2) Time= UTC+11 3) PROG=Automated program to control sensors and collect data 4) BAT=Amount of battery remaining 5) STEP=check aquation manual 6) SPIES=check aquation manual 7) PAR=Photoactive radiation 8) Levels=check aquation manual 9) Pumps= program for pumps 10) WQM=check aquation manual

####

Respiration/PAM chamber raw excel spreadsheets

Abbreviations in headers of datasets Note: Two data sets are provided in different formats. Raw and cleaned (adj). These are the same data with the PAR column moved over to PAR.all for analysis. All headers are the same. The cleaned (adj) dataframe will work with the R syntax below, alternative add code to do cleaning in R.

Date: ISO 1986 - Check Time:UTC+11 unless otherwise stated DATETIME: UTC+11 unless otherwise stated ID (of instrument in respiration chambers) ID43=Pulse amplitude fluoresence measurement of control ID44=Pulse amplitude fluoresence measurement of acidified chamber ID=1 Dissolved oxygen ID=2 Dissolved oxygen ID3= PAR ID4= PAR PAR=Photo active radiation umols F0=minimal florescence from PAM Fm=Maximum fluorescence from PAM Yield=(F0 – Fm)/Fm rChl=an estimate of chlorophyll (Note this is uncalibrated and is an estimate only) Temp=Temperature degrees C PAR=Photo active radiation PAR2= Photo active radiation2 DO=Dissolved oxygen %Sat= Saturation of dissolved oxygen Notes=This is the program of the underwater submersible logger with the following abreviations: Notes-1) PAM= Notes-2) PAM=Gain level set (see aquation manual for more detail) Notes-3) Acclimatisation= Program of slowly introducing treatment water into chamber Notes-4) Shutter start up 2 sensors+sample…= Shutter PAMs automatic set up procedure (see aquation manual) Notes-5) Yield step 2=PAM yield measurement and calculation of control Notes-6) Yield step 5= PAM yield measurement and calculation of acidified Notes-7) Abatus respiration DO and PAR step 1= Program to measure dissolved oxygen and PAR (see aquation manual). Steps 1-4 are different stages of this program including pump cycles, DO and PAR measurements.

8) Rapid light curve data Pre LC: A yield measurement prior to the following measurement After 10.0 sec at 0.5% to 8%: Level of each of the 8 steps of the rapid light curve Odessey PAR (only in some deployments): An extra measure of PAR (umols) using an Odessey data logger Dataflow PAR: An extra measure of PAR (umols) using a Dataflow sensor. PAM PAR: This is copied from the PAR or PAR2 column PAR all: This is the complete PAR file and should be used Deployment: Identifying which deployment the data came from

####

Respiration chamber biomass data

The data is chlorophyll a biomass from cores from the respiration chambers. The headers are: Depth (mm) Treat (Acidified or control) Chl a (pigment and indicator of biomass) Core (5 cores were collected from each chamber, three were analysed for chl a), these are psudoreplicates/subsamples from the chambers and should not be treated as replicates.

####

Associated R script file for pump cycles of respirations chambers

Associated respiration chamber data to determine the times when respiration chamber pumps delivered treatment water to chambers. Determined from Aquation log files (see associated files). Use the chamber cut times to determine net production rates. Note: Users need to avoid the times when the respiration chambers are delivering water as this will give incorrect results. The headers that get used in the attached/associated R file are start regression and end regression. The remaining headers are not used unless called for in the associated R script. The last columns of these datasets (intercept, ElapsedTimeMincoef) are determined from the linear regressions described below.

To determine the rate of change of net production, coefficients of the regression of oxygen consumption in discrete 180 minute data blocks were determined. R squared values for fitted regressions of these coefficients were consistently high (greater than 0.9). We make two assumptions with calculation of net production rates: the first is that heterotrophic community members do not change their metabolism under OA; and the second is that the heterotrophic communities are similar between treatments.

####

Combined dataset pH, temperature, oxygen, salinity, velocity for experiment

This data is rapid light curve data generated from a Shutter PAM fluorimeter. There are eight steps in each rapid light curve. Note: The software component of the Shutter PAM fluorimeter for sensor 44 appeared to be damaged and would not cycle through the PAR cycles. Therefore the rapid light curves and recovery curves should only be used for the control chambers (sensor ID43).

The headers are PAR: Photoactive radiation relETR: F0/Fm x PAR Notes: Stage/step of light curve Treatment: Acidified or control

The associated light treatments in each stage. Each actinic light intensity is held for 10 seconds, then a saturating pulse is taken (see PAM methods).

After 10.0 sec at 0.5% = 1 umols PAR After 10.0 sec at 0.7% = 1 umols PAR After 10.0 sec at 1.1% = 0.96 umols PAR After 10.0 sec at 1.6% = 4.32 umols PAR After 10.0 sec at 2.4% = 4.32 umols PAR After 10.0 sec at 3.6% = 8.31 umols PAR After 10.0 sec at 5.3% =15.78 umols PAR After 10.0 sec at 8.0% = 25.75 umols PAR

This dataset appears to be missing data, note D5 rows potentially not useable information

See the word document in the download file for more information.
Network Traffic Dataset
kaggle.com
Updated Oct 31, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ravikumar Gattu (2023). Network Traffic Dataset [Dataset]. https://www.kaggle.com/datasets/ravikumargattu/network-traffic-dataset
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Oct 31, 2023
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Ravikumar Gattu
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

The data presented here was obtained in a Kali Machine from University of Cincinnati,Cincinnati,OHIO by carrying out packet captures for 1 hour during the evening on Oct 9th,2023 using Wireshark.This dataset consists of 394137 instances were obtained and stored in a CSV (Comma Separated Values) file.This large dataset could be used utilised for different machine learning applications for instance classification of Network traffic,Network performance monitoring,Network Security Management , Network Traffic Management ,network intrusion detection and anomaly detection.

The dataset can be used for a variety of machine learning tasks, such as network intrusion detection, traffic classification, and anomaly detection.

Content :

This network traffic dataset consists of 7 features.Each instance contains the information of source and destination IP addresses, The majority of the properties are numeric in nature, however there are also nominal and date kinds due to the Timestamp.

The network traffic flow statistics (No. Time Source Destination Protocol Length Info) were obtained using Wireshark (https://www.wireshark.org/).

Dataset Columns:

No : Number of Instance. Timestamp : Timestamp of instance of network traffic Source IP: IP address of Source Destination IP: IP address of Destination Portocol: Protocol used by the instance Length: Length of Instance Info: Information of Traffic Instance

Acknowledgements :

I would like thank University of Cincinnati for giving the infrastructure for generation of network traffic data set.

Ravikumar Gattu , Susmitha Choppadandi

Inspiration : This dataset goes beyond the majority of network traffic classification datasets, which only identify the type of application (WWW, DNS, ICMP,ARP,RARP) that an IP flow contains. Instead, it generates machine learning models that can identify specific applications (like Tiktok,Wikipedia,Instagram,Youtube,Websites,Blogs etc.) from IP flow statistics (there are currently 25 applications in total).

**Dataset License: ** CC0: Public Domain

Dataset Usages : This dataset can be used for different machine learning applications in the field of cybersecurity such as classification of Network traffic,Network performance monitoring,Network Security Management , Network Traffic Management ,network intrusion detection and anomaly detection.

ML techniques benefits from this Dataset :

This dataset is highly useful because it consists of 394137 instances of network traffic data obtained by using the 25 applications on a public,private and Enterprise networks.Also,the dataset consists of very important features that can be used for most of the applications of Machine learning in cybersecurity.Here are few of the potential machine learning applications that could be benefited from this dataset are :

Network Performance Monitoring : This large network traffic data set can be utilised for analysing the network traffic to identifying the network patterns in the network .This help in designing the network security algorithms for minimise the network probelms.

Anamoly Detection : Large network traffic dataset can be utilised training the machine learning models for finding the irregularitues in the traffic which could help identify the cyber attacks.

3.Network Intrusion Detection : This large dataset could be utilised for machine algorithms training and designing the models for detection of the traffic issues,Malicious traffic network attacks and DOS attacks as well.
r
Evaluated Artifact for "Quantifying Software Reliability via Model-Counting"...
radar-service.eu
radar.kit.edu
tar
Updated Jun 23, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alexander Weigl; Samuel Teuber (2023). Evaluated Artifact for "Quantifying Software Reliability via Model-Counting" [Dataset]. http://doi.org/10.35097/1520
Explore at:
tar(677317632 bytes)Available download formats
Unique identifier
https://doi.org/10.35097/1520
Dataset updated
Jun 23, 2023
Dataset provided by
Teuber, Samuel
Karlsruhe Institute of Technology
Authors
Alexander Weigl; Samuel Teuber
Description
counterSharp Experiment and Play Environment

This repository contains the reproducible experimental evaluation of the counterSharp tool. The repository contains a Docker file which configures the counterSharp tool, two model counters (ApproxMC and Ganak) as well as the tool by Dimovski et al. for our experiments. Furthermore the repository contains the benchmarks on which we ran our experiments as well as the logs of our experiments and scripts for transforming the log files into LaTeX tables.

Getting Started

In order to pull and run the docker container from docker hub you should execute docker run. Or, you can load the archived and evaluated artifact into docker with docker load < countersharp-experiments.tar.gz If the image is loaded, docker run opens a shell allowing the execution of further commands: bash docker run -it -v `pwd`/results:/experiments/results samweb/countersharp-experiments By using a volume, the results are written to the host system rather than the docker container You can remove the volume mounting option (-v ...), and create /experiments/results inside the container if you can spare the results. If you are using the volume and run into permission problems, then you need to give rights via SELinux: chcon -Rt svirt_sandbox_file_tpwd/results. This will create a writable folder results in your current folder which will hold any logs from experiments. A minimal example can be executed by running (this takes approximately 70 seconds): bash ./showcase.sh This will create benchmark log files for the benchmarks for_bounded_loop1.c and overflow.c in the folder results. For example, /experiments/results/for_bounded_loop1.c/0X/ contains five folder for the five repeated runs of the experiments on this file. Each folder /experiments/results/for_bounded_loop1.c/0X/ contains the folders for the different tools, which includes the log and output files. A full run can be executed by running (this takes approximately a little under 2 days): bash ./run-all.sh Additionally single benchmarks can be executed through the following commands: bash run-instance approx program.c "[counterSharp arguments]" # Runs countersharp with ApproxMC on program.c run-instance ganak program.c "[counterSharp arguments]" # Runs countersharp with Ganak on program.c Probab.native -single -domain polyhedra program.c # Runs the tool by Dimovski et al. for deterministic programs Probab.native -single -domain polyhedra -nondet program.c # Runs the tool by Dimovski et al. for nondeterministic programs For example we can execute run-instance approx /experiments/benchmarks/confidence.c "--function testfun --unwind 1" to obtain the outcome of counterSharp and ApproxMC for the benchmark confidence.c. Note that the time information produced by runlim is always only for one part of the entire execution (i.e. for counterSharp or one ApproxMC run or one Ganak run). The script run-instance is straightforwarded, we have the call to our tool counterSharp: bash python3 -m counterSharp --amm /tmp/amm.dimacs --amh /tmp/amh.dimacs --asm /tmp/asm.dimacs --ash /tmp/ash.dimacs --con /tmp/con.dimacs -d $3 $2 which is followed by the call of ApproxMC oder ganak.

Benchmarks

The benchmarks are contained in the folder benchmarks which also includes an overview on the sources and modifications to the benchmarks
Note that benchmark versions for the tool by Dimovski et al. are contained in folder benchmarks-dimovski

Benchmark Results

The results are contained in the folder results in which all logs from benchmark runs reside. The log files from the evaluation are not available in the Docker Image, but just on GitHub. The logs are split-up by benchmark instance (first level folder), run number (second level folder) and tool (third level folder)
For example, the file results/bwd_loop1a.c/01/approxmc/stdout.log contains the stdout and stderr of running approxmc on the instance bwd_loop1a.c in run 01

Machine Details

All runs were executed on a Linux machine housing an Intel(R) Core(TM) i5-6500 CPU (3.20GHz) and 16GB of memory. Note that for every benchmark log 01/counterSharp/init.log contains information on the machine used for benchmark execution as well as on the commits used in the experiments.

Running benchmarks

For all cases of automated benchmark execution we assume a CSV file containing relevant information on the instances to run: The first column is the benchmark's name, the second column are parameters passed to counterSharp (see instances.csv) or the tool by Dimovski (see instances-dimovski.csv). All scripts produce benchmarking results for "missing" instances, i.e. instances for which no folder can be found in the results folder. - Run counterSharp on the benchmarks:
run-counterSharp instances.csv - Run ApproxMC on benchmarks:
run-approxmc instances.csv - Only after counterSharp has been run - Run GANAK on benchmarks:
run-ganak instances.csv - Only after counterSharp has been run - Run Dimovski's tool on benchmarks:
run-dimovski instances-dimovski.csv

Log summarization

Summarization is possible through the python script in logParsing/parse.py within the container. The script takes as input a list of benchmarks to process and returns (parts of) a LaTeX table. Note, that there must exist logs for all benchmarks provided in the CSV file for the call to succeed! - To obtain (sorted) results for deterministic benchmarks:
cat logParsing/deterministic-sorted.csv| python3 logParsing/parse.py results aggregate2 - To obtain (sorted) results for nondeterministic benchmarks:
cat logParsing/nondeterministic-sorted.csv| python3 logParsing/parse.py results nondet

Building the docker container

All tools are packaged into a Dockerfile which makes any installation unnecessary. There is, however, the need for a running Docker installation. The Dockerfile build depends on the accessibility of the following GitHub Repositories: - CryptoMiniSat - ApproxMC - Ganak - Probab_Analyzer - counterSharp The Docker image is hosted at Dockerhub.
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

CONG LIU (2022). Event Log Sampling Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.20354505.v1

Event Log Sampling Datasets

Explore at:

52 scholarly articles cite this dataset (View in Google Scholar)

zipAvailable download formats

Unique identifier

https://doi.org/10.6084/m9.figshare.20354505.v1

Dataset updated

Jul 22, 2022

Dataset provided by

figshare

Authors

CONG LIU

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This datasets includes 9 event logs, which can be used to experiment with log completeness-oriented event log sampling methods.

· exercise.xes: The dataset is a simulation log generated by the paper review process model, and each trace clearly describes the process of reviewing papers in detail.

· training_log_1/3/8.xes: These 3 datasets are human-trained simulation logs for the 2016 Process Discovery Competition (PDC 2016). Each trace consists of two values, the name of the process model activity referenced by the event and the identifier of the case to which the event belongs.

· Production.xes: This dataset includes process data from production processes, and each track includes data for cases, activities, resources, timestamps, and more data fields.

· BPIC_2012_A/O/W.xes: These 3 dataset are derived from the personal loan application process of a financial institution in the Netherlands. The process represented in the event log is the application process of a personal loan or overdraft in a global financing organization. Each trace describes the process of applying for a personal loan for different customers.

· CrossHospital.xes: The dataset includes the treatment process data of emergency patients in the hospital, and each track represents the treatment process of an emergency patient in the hospital.

Clear search

Close search

Google apps

Main menu

Event Log Sampling Datasets

AIT Log Data Set V2.0

Utah FORGE: Well 58-32 Schlumberger FMI Logs DLIS and XML files

Event Logs CSV

OpenStack log files

LoRaWAN Traffic Analysis Dataset

Respiration_chambers/raw_log_files and combined datasets of biomass and...

Network Traffic Dataset

Evaluated Artifact for "Quantifying Software Reliability via Model-Counting"...

counterSharp Experiment and Play Environment

Getting Started

Benchmarks

Benchmark Results

Machine Details

Running benchmarks

Log summarization

Building the docker container

Event Log Sampling DatasetsSee More Versions

`counterSharp` Experiment and Play Environment

Event Log Sampling Datasets