100+ datasets found
  1. AIT Log Data Set V1.1

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1 more
    Updated Oct 18, 2023
    + more versions
    Cite
    Rauber, Andreas (2023). AIT Log Data Set V1.1 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3723082
    Dataset provided by
    Wurzenberger, Markus
    Hotwagner, Wolfgang
    Landauer, Max
    Skopik, Florian
    Rauber, Andreas
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    AIT Log Data Sets

    This repository contains synthetic log data suitable for evaluation of intrusion detection systems. The logs were collected from four independent testbeds that were built at the Austrian Institute of Technology (AIT) following the approach by Landauer et al. (2020) [1]. Please refer to the paper for more detailed information on automatic testbed generation and cite it if the data is used for academic publications. In brief, each testbed simulates user accesses to a webserver that runs Horde Webmail and OkayCMS. The duration of the simulation is six days. On the fifth day (2020-03-04) two attacks are launched against each web server.

    The archive AIT-LDS-v1_0.zip contains the directories "data" and "labels".

    The data directory is structured as follows. Each directory mail..com contains the logs of one web server. Each directory user- contains the logs of one user host machine, where one or more users are simulated. Each file log.log in the user- directories contains the activity logs of one particular user.

    Setup details of the web servers:

    OS: Debian Stretch 9.11.6

    Services:

    Apache2

    PHP7

    Exim 4.89

    Horde 5.2.22

    OkayCMS 2.3.4

    Suricata

    ClamAV

    MariaDB

    Setup details of user machines:

    OS: Ubuntu Bionic

    Services:

    Chromium

    Firefox

    User host machines are assigned to web servers in the following way:

    mail.cup.com is accessed by users from host machines user-{0, 1, 2, 6}

    mail.spiral.com is accessed by users from host machines user-{3, 5, 8}

    mail.insect.com is accessed by users from host machines user-{4, 9}

    mail.onion.com is accessed by users from host machines user-{7, 10}

    The following attacks are launched against the web servers (different starting times for each web server, please check the labels for exact attack times):

    Attack 1: multi-step attack with sequential execution of the following attacks:

    nmap scan

    nikto scan

    smtp-user-enum tool for account enumeration

    hydra brute force login

    webshell upload through Horde exploit (CVE-2019-9858)

    privilege escalation through Exim exploit (CVE-2019-10149)

    Attack 2: webshell injection through malicious cookie (CVE-2019-16885)

    Attacks are launched from the following user host machines. In each of the corresponding directories user-, logs of the attack execution are found in the file attackLog.txt:

    user-6 attacks mail.cup.com

    user-5 attacks mail.spiral.com

    user-4 attacks mail.insect.com

    user-7 attacks mail.onion.com

    The log data collected from the web servers includes

    Apache access and error logs

    syscall logs collected with the Linux audit daemon

    suricata logs

    exim logs

    auth logs

    daemon logs

    mail logs

    syslogs

    user logs

    Note that due to their large size, the audit/audit.log files of each server were compressed into a .zip archive. If these logs are needed for analysis, they must first be unzipped.

    Labels are organized in the same directory structure as the logs. Each file contains two labels for each log line, separated by a comma: the first based on the occurrence time, the second based on similarity and ordering. Note that this does not guarantee correct labeling for all lines and that no manual corrections were conducted.
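    The comma-separated label format described above can be paired with its log file line by line; the following is a minimal sketch, where the file paths and helper names are illustrative, not part of the dataset:

```python
# Minimal sketch: pair each log line of AIT-LDS-v1 with its two labels.
# Each label line holds "<time_based>,<similarity_based>" for the log line
# at the same position in the corresponding log file.
from pathlib import Path

def read_labels(label_path):
    """Yield (time_based, similarity_based) label pairs, one per log line."""
    for line in Path(label_path).read_text().splitlines():
        time_based, similarity_based = line.split(",", 1)
        yield time_based.strip(), similarity_based.strip()

def labeled_lines(log_path, label_path):
    """Return [(log_line, (time_based, similarity_based)), ...]."""
    log_lines = Path(log_path).read_text().splitlines()
    return list(zip(log_lines, read_labels(label_path)))
```

    Since the two labels were produced by different methods (occurrence time vs. similarity and ordering), keeping both makes it easy to restrict an evaluation to lines where they agree.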

    Version history and related data sets:

    AIT-LDS-v1.0: Four datasets, logs from single host, fine-granular audit logs, mail/CMS.

    AIT-LDS-v1.1: Removed carriage return of line endings in audit.log files.

    AIT-LDS-v2.0: Eight datasets, logs from all hosts, system logs and network traffic, mail/CMS/cloud/web.

    Acknowledgements: Partially funded by the FFG projects INDICAETING (868306) and DECEPT (873980), and the EU project GUARD (833456).

    If you use the dataset, please cite the following publication:

    [1] M. Landauer, F. Skopik, M. Wurzenberger, W. Hotwagner and A. Rauber, "Have it Your Way: Generating Customized Log Datasets With a Model-Driven Simulation Testbed," in IEEE Transactions on Reliability, vol. 70, no. 1, pp. 402-415, March 2021, doi: 10.1109/TR.2020.3031317. [PDF]

  2. AIT Log Data Set V2.0

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +2 more
    Updated Jun 28, 2024
    Cite
    Rauber, Andreas (2024). AIT Log Data Set V2.0 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5789063
    Dataset provided by
    Frank, Maximilian
    Wurzenberger, Markus
    Hotwagner, Wolfgang
    Landauer, Max
    Skopik, Florian
    Rauber, Andreas
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    AIT Log Data Sets

    This repository contains synthetic log data suitable for evaluation of intrusion detection systems, federated learning, and alert aggregation. A detailed description of the dataset is available in [1]. The logs were collected from eight testbeds that were built at the Austrian Institute of Technology (AIT) following the approach by [2]. Please cite these papers if the data is used for academic publications.

    In brief, each of the datasets corresponds to a testbed representing a small enterprise network including mail server, file share, WordPress server, VPN, firewall, etc. Normal user behavior is simulated to generate background noise over a time span of 4-6 days. At some point, a sequence of attack steps is launched against the network. Log data is collected from all hosts and includes Apache access and error logs, authentication logs, DNS logs, VPN logs, audit logs, Suricata logs, network traffic packet captures, horde logs, exim logs, syslog, and system monitoring logs. Separate ground truth files are used to label events that are related to the attacks. Compared to AIT-LDS-v1.1, a more complex network and more diverse user behavior are simulated, and logs are collected from all hosts in the network. If you are only interested in network traffic analysis, we also provide the AIT-NDS containing the labeled netflows of the testbed networks. We also provide the AIT-ADS, an alert data set derived by forensically applying open-source intrusion detection systems on the log data.

    The datasets in this repository have the following structure:

    The gather directory contains all logs collected from the testbed. Logs collected from each host are located in gather//logs/.

    The labels directory contains the ground truth of the dataset, indicating which events are related to attacks. The directory mirrors the structure of the gather directory, so that each label file is located at the same path and has the same name as the corresponding log file. Each line in the label files references the log event corresponding to an attack by the line number counted from the beginning of the file ("line"), the labels assigned to the line that state the respective attack step ("labels"), and the labeling rules that assigned the labels ("rules"). An example is provided below.

    The processing directory contains the source code that was used to generate the labels.

    The rules directory contains the labeling rules.

    The environment directory contains the source code that was used to deploy the testbed and run the simulation using the Kyoushi Testbed Environment.

    The dataset.yml file specifies the start and end time of the simulation.

    The following table summarizes relevant properties of the datasets:

    Dataset          Simulation time                      Attack time                          Scan volume  Unpacked size  Notes
    fox              2022-01-15 00:00 - 2022-01-20 00:00  2022-01-18 11:59 - 2022-01-18 13:15  High         26 GB
    harrison         2022-02-04 00:00 - 2022-02-09 00:00  2022-02-08 07:07 - 2022-02-08 08:38  High         27 GB
    russellmitchell  2022-01-21 00:00 - 2022-01-25 00:00  2022-01-24 03:01 - 2022-01-24 04:39  Low          14 GB
    santos           2022-01-14 00:00 - 2022-01-18 00:00  2022-01-17 11:15 - 2022-01-17 11:59  Low          17 GB
    shaw             2022-01-25 00:00 - 2022-01-31 00:00  2022-01-29 14:37 - 2022-01-29 15:21  Low          27 GB          Data exfiltration is not visible in DNS logs
    wardbeck         2022-01-19 00:00 - 2022-01-24 00:00  2022-01-23 12:10 - 2022-01-23 12:56  Low          26 GB
    wheeler          2022-01-26 00:00 - 2022-01-31 00:00  2022-01-30 07:35 - 2022-01-30 17:53  High         30 GB          No password cracking in attack chain
    wilson           2022-02-03 00:00 - 2022-02-09 00:00  2022-02-07 10:57 - 2022-02-07 11:49  High         39 GB

    The following attacks are launched in the network:

    Scans (nmap, WPScan, dirb)

    Webshell upload (CVE-2020-24186)

    Password cracking (John the Ripper)

    Privilege escalation

    Remote command execution

    Data exfiltration (DNSteal)

    Note that attack parameters and their execution orders vary in each dataset. Labeled log files are trimmed to the simulation time to ensure that their labels (which reference the related event by the line number in the file) are not misleading. Other log files, however, also contain log events generated before or after the simulation time and may therefore be affected by testbed setup or data collection. It is therefore recommended to only consider logs with timestamps within the simulation time for analysis.
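    The trimming recommendation above can be applied while parsing; the following is a minimal sketch that assumes UTC epoch timestamps in the audit records and hard-codes the russellmitchell simulation window for illustration (in the real datasets, the window comes from dataset.yml):

```python
# Sketch: drop audit events whose timestamp falls outside the simulation
# window. Linux audit records embed an epoch timestamp in the field
# msg=audit(<epoch>.<ms>:<serial>).
import re
from datetime import datetime, timezone

# Assumption: russellmitchell window, taken from the table above.
SIM_START = datetime(2022, 1, 21, tzinfo=timezone.utc)
SIM_END = datetime(2022, 1, 25, tzinfo=timezone.utc)

AUDIT_TS = re.compile(r"msg=audit\((\d+)\.\d+:\d+\)")

def in_simulation(line):
    """True if the audit line's timestamp lies inside the simulation window."""
    match = AUDIT_TS.search(line)
    if not match:
        return False
    ts = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
    return SIM_START <= ts <= SIM_END
```

    As a plausibility check, epoch 1642999060 (from the audit sample quoted further below) corresponds to 2022-01-24 04:37:40 UTC, inside both the simulation and attack windows listed for russellmitchell.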

    The structure of labels is explained using the audit logs from the intranet server in the russellmitchell data set as an example in the following. The first four labels in the labels/intranet_server/logs/audit/audit.log file are as follows:

    {"line": 1860, "labels": ["attacker_change_user", "escalate"], "rules": {"attacker_change_user": ["attacker.escalate.audit.su.login"], "escalate": ["attacker.escalate.audit.su.login"]}}

    {"line": 1861, "labels": ["attacker_change_user", "escalate"], "rules": {"attacker_change_user": ["attacker.escalate.audit.su.login"], "escalate": ["attacker.escalate.audit.su.login"]}}

    {"line": 1862, "labels": ["attacker_change_user", "escalate"], "rules": {"attacker_change_user": ["attacker.escalate.audit.su.login"], "escalate": ["attacker.escalate.audit.su.login"]}}

    {"line": 1863, "labels": ["attacker_change_user", "escalate"], "rules": {"attacker_change_user": ["attacker.escalate.audit.su.login"], "escalate": ["attacker.escalate.audit.su.login"]}}

    Each JSON object in this file assigns a label to one specific log line in the corresponding log file located at gather/intranet_server/logs/audit/audit.log. The field "line" in each JSON object specifies the line number of the respective event in the original log file, while the field "labels" comprises the corresponding labels. For example, the lines in the sample above provide the information that lines 1860-1863 in the gather/intranet_server/logs/audit/audit.log file are labeled with "attacker_change_user" and "escalate", corresponding to the attack step where the attacker obtains escalated privileges. Inspecting these lines shows that they indeed correspond to the user authenticating as root:

    type=USER_AUTH msg=audit(1642999060.603:2226): pid=27950 uid=33 auid=4294967295 ses=4294967295 msg='op=PAM:authentication acct="jhall" exe="/bin/su" hostname=? addr=? terminal=/dev/pts/1 res=success'

    type=USER_ACCT msg=audit(1642999060.603:2227): pid=27950 uid=33 auid=4294967295 ses=4294967295 msg='op=PAM:accounting acct="jhall" exe="/bin/su" hostname=? addr=? terminal=/dev/pts/1 res=success'

    type=CRED_ACQ msg=audit(1642999060.615:2228): pid=27950 uid=33 auid=4294967295 ses=4294967295 msg='op=PAM:setcred acct="jhall" exe="/bin/su" hostname=? addr=? terminal=/dev/pts/1 res=success'

    type=USER_START msg=audit(1642999060.627:2229): pid=27950 uid=33 auid=4294967295 ses=4294967295 msg='op=PAM:session_open acct="jhall" exe="/bin/su" hostname=? addr=? terminal=/dev/pts/1 res=success'

    The same applies to all other labels for this log file and all other log files. There are no labels for logs generated by "normal" (i.e., non-attack) behavior; instead, all log events that have no corresponding JSON object in one of the files from the labels directory, such as the lines 1-1859 in the example above, can be considered to be labeled as "normal". This means that in order to figure out the labels for the log data it is necessary to store the line numbers when processing the original logs from the gather directory and see if these line numbers also appear in the corresponding file in the labels directory.
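    The lookup described above can be sketched in a few lines; file paths are illustrative, and the "normal" placeholder label is an assumption of this sketch rather than part of the dataset:

```python
# Sketch: join a gather/ log file with its labels/ counterpart. Log lines
# with no JSON entry in the label file are treated as normal behavior.
import json
from pathlib import Path

def load_labels(label_path):
    """Map 1-based line number -> list of attack-step labels."""
    labels = {}
    for raw in Path(label_path).read_text().splitlines():
        entry = json.loads(raw)
        labels[entry["line"]] = entry["labels"]
    return labels

def label_log(log_path, label_path):
    """Return [(labels_for_line, log_line), ...] for the whole log file."""
    labels = load_labels(label_path)
    return [(labels.get(lineno, ["normal"]), line)
            for lineno, line in enumerate(
                Path(log_path).read_text().splitlines(), start=1)]
```

    Keeping the whole line-number map in memory is fine here because label files only list attack-related lines; for very large gather logs the log file itself can still be streamed line by line.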

    Besides the attack labels, a general overview of the exact times when specific attack steps are launched is available in gather/attacker_0/logs/attacks.log. An enumeration of all hosts and their IP addresses is stated in processing/config/servers.yml. Moreover, configurations of each host are provided in gather//configs/ and gather//facts.json.

    Version history:

    AIT-LDS-v1.x: Four datasets, logs from single host, fine-granular audit logs, mail/CMS.

    AIT-LDS-v2.0: Eight datasets, logs from all hosts, system logs and network traffic, mail/CMS/cloud/web.

    Acknowledgements: Partially funded by the FFG projects INDICAETING (868306) and DECEPT (873980), and the EU projects GUARD (833456) and PANDORA (SI2.835928).

    If you use the dataset, please cite the following publications:

    [1] M. Landauer, F. Skopik, M. Frank, W. Hotwagner, M. Wurzenberger, and A. Rauber. "Maintainable Log Datasets for Evaluation of Intrusion Detection Systems". IEEE Transactions on Dependable and Secure Computing, vol. 20, no. 4, pp. 3466-3482, doi: 10.1109/TDSC.2022.3201582. [PDF]

    [2] M. Landauer, F. Skopik, M. Wurzenberger, W. Hotwagner and A. Rauber, "Have it Your Way: Generating Customized Log Datasets With a Model-Driven Simulation Testbed," in IEEE Transactions on Reliability, vol. 70, no. 1, pp. 402-415, March 2021, doi: 10.1109/TR.2020.3031317. [PDF]

  3. Kyoushi Log Data Set

    • zenodo.org
    • explore.openaire.eu
    • +1 more
    zip
    Updated Apr 24, 2025
    Cite
    Max Landauer; Maximilian Frank; Florian Skopik; Wolfgang Hotwagner; Markus Wurzenberger; Andreas Rauber (2025). Kyoushi Log Data Set [Dataset]. http://doi.org/10.5281/zenodo.5779411
    Available download formats: zip
    Dataset provided by
    Zenodo, http://zenodo.org/
    Authors
    Max Landauer; Maximilian Frank; Florian Skopik; Wolfgang Hotwagner; Markus Wurzenberger; Andreas Rauber
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This repository contains synthetic log data suitable for evaluation of intrusion detection systems. The logs were collected from a testbed that was built at the Austrian Institute of Technology (AIT) following the approaches by [1], [2], and [3]. Please refer to these papers for more detailed information on the dataset and cite them if the data is used for academic publications. Unlike the related AIT-LDSv1.1, this dataset involves a more complex network structure, uses a different attack scenario, and collects log data from multiple hosts in the network. In brief, the testbed simulates a small enterprise network including mail server, file share, WordPress server, VPN, firewall, etc. Normal user behavior is simulated to generate background noise. After some days, two attack scenarios are launched against the network. Note that the AIT-LDSv2.0 extends this dataset with additional attack cases and variations of attack parameters.

    The archives have the following structure. The gather directory contains the raw log data from each host in the network, as well as their system configurations. The labels directory contains the ground truth for those log files that are labeled. The processing directory contains configurations for the labeling procedure and the rules directory contains the labeling rules. Labeling of events that are related to the attacks is carried out with the Kyoushi Labeling Framework.

    Each dataset contains traces of a specific attack scenario:

    • Scenario 1 (see gather/attacker_0/logs/sm.log for detailed attack log):
      • nmap scan
      • WPScan
      • dirb scan
      • webshell upload through wpDiscuz exploit (CVE-2020-24186)
      • privilege escalation
    • Scenario 2 (see gather/attacker_0/logs/dnsteal.log for detailed attack log):
      • DNSteal data exfiltration

    The log data collected from the servers includes

    • Apache access and error logs (labeled)
    • audit logs (labeled)
    • auth logs (labeled)
    • VPN logs (labeled)
    • DNS logs (labeled)
    • syslog
    • suricata logs
    • exim logs
    • horde logs
    • mail logs

    Note that only log files from affected servers are labeled. Label files and the directories in which they are located have the same name as their corresponding log file in the gather directory. Labels are in JSON format and comprise the following attributes: line (number of the line in the corresponding log file), labels (list of labels assigned to that log line), and rules (names of the labeling rules matching that log line). Note that not all attack traces are labeled in all log files; please refer to the labeling rules in case some labels are unclear.

    Acknowledgements: Partially funded by the FFG projects INDICAETING (868306) and DECEPT (873980), and the EU project GUARD (833456).

    If you use the dataset, please cite the following publications:

    [1] M. Landauer, F. Skopik, M. Wurzenberger, W. Hotwagner and A. Rauber, "Have it Your Way: Generating Customized Log Datasets With a Model-Driven Simulation Testbed," in IEEE Transactions on Reliability, vol. 70, no. 1, pp. 402-415, March 2021, doi: 10.1109/TR.2020.3031317.

    [2] M. Landauer, M. Frank, F. Skopik, W. Hotwagner, M. Wurzenberger, and A. Rauber, "A Framework for Automatic Labeling of Log Datasets from Model-driven Testbeds for HIDS Evaluation". ACM Workshop on Secure and Trustworthy Cyber-Physical Systems (ACM SaT-CPS 2022), April 27, 2022, Baltimore, MD, USA. ACM.

    [3] M. Frank, "Quality improvement of labels for model-driven benchmark data generation for intrusion detection systems", Master's Thesis, Vienna University of Technology, 2021.

  4. Statistics of Log Production And Forest Revenue Type - Dataset - MAMPU

    • archive.data.gov.my
    Updated Oct 25, 2018
    Cite
    (2018). Statistics of Log Production And Forest Revenue Type - Dataset - MAMPU [Dataset]. https://archive.data.gov.my/data/dataset/statistics-of-log-production-and-forest-revenue-type
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Statistics of Log Production And Forest Revenue Type

  5. Event Log Sampling Datasets

    • figshare.com
    zip
    Updated Jul 22, 2022
    + more versions
    Cite
    CONG LIU (2022). Event Log Sampling Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.20354505.v1
    Available download formats: zip
    Dataset provided by
    figshare
    Authors
    CONG LIU
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset includes 9 event logs, which can be used to experiment with log completeness-oriented event log sampling methods.

    · exercise.xes: The dataset is a simulation log generated by the paper review process model, and each trace clearly describes the process of reviewing papers in detail.

    · training_log_1/3/8.xes: These 3 datasets are human-trained simulation logs for the 2016 Process Discovery Competition (PDC 2016). Each trace consists of two values, the name of the process model activity referenced by the event and the identifier of the case to which the event belongs.

    · Production.xes: This dataset includes process data from production processes, and each trace includes data fields for cases, activities, resources, timestamps, and more.

    · BPIC_2012_A/O/W.xes: These 3 datasets are derived from the personal loan application process of a financial institution in the Netherlands. The process represented in the event log is the application process for a personal loan or overdraft in a global financing organization. Each trace describes the process of applying for a personal loan for a different customer.

    · CrossHospital.xes: The dataset includes the treatment process data of emergency patients in a hospital, and each trace represents the treatment process of one emergency patient.

  6. Logging industries, principal statistics by industry classification, total...

    • open.canada.ca
    • www150.statcan.gc.ca
    • +2 more
    csv, html, xml
    Updated Jan 17, 2023
    Cite
    Statistics Canada (2023). Logging industries, principal statistics by industry classification, total and 6-digit level, annual [Dataset]. https://open.canada.ca/data/en/dataset/b7568682-b13a-428b-ac38-6c65cead9efe
    Available download formats: csv, html, xml
    Dataset provided by
    Statistics Canada, https://statcan.gc.ca/en
    License

    Open Government Licence - Canada 2.0, https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Description

    Logging industries, annual 21 principal statistics (revenues, expenses, salaries, employment, stocks, etc.), by North American Industry Classification System (NAICS), total and 6-digit level.

  7. Data from: LogChunks: A Data Set for Build Log Analysis

    • data.niaid.nih.gov
    Updated Jan 31, 2020
    Cite
    Panichella, Annibale (2020). LogChunks: A Data Set for Build Log Analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3632350
    Dataset provided by
    Brandt, Carolin
    Beller, Moritz
    Panichella, Annibale
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We collected 797 Travis CI logs from a wide range of 80 GitHub repositories from 29 different main development languages. You can find our collection tool in log-collection and the logs sorted by language and repository in logs.

    We manually labeled the part (chunk) of the log describing why the build failed. In addition, the chunks are annotated with keywords that we would use to search for them and categorized according to their structural representation within the log. You can find this data in an XML file for each repository in build-failure-reason.

  8. Principal Statistics of Forestry and Logging, Malaysia - Dataset - MAMPU

    • archive.data.gov.my
    Updated Apr 29, 2016
    Cite
    archive.data.gov.my (2016). Principal Statistics of Forestry and Logging, Malaysia - Dataset - MAMPU [Dataset]. https://archive.data.gov.my/data/dataset/principal-statistics-of-forestry-and-logging-malaysia
    Dataset provided by
    Data.gov, https://data.gov/
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Malaysia
    Description

    Principal Statistics of Forestry and Logging, 1947 – 2018, Malaysia

    Footnotes:
    (i) Data which only includes Peninsular Malaysia and Sabah:
        - Forest area (1947 – 1986)
        - Production of logs (1947 – 1963)
    (ii) Data which only includes Peninsular Malaysia:
        - Number of workers in wood-based industries (1947 – 1975)
        - Number of workers in wood-based industries (2015)
    (iii) Data which only includes Peninsular Malaysia and Sarawak:
        - Number of workers in wood-based industries (1976 – 1979)

  9. PNG Forest cover, log exports and concessions statistics

    • png-data.sprep.org
    • pacificdata.org
    • +1 more
    xlsx
    Updated Nov 2, 2022
    + more versions
    Cite
    PNG Department of National Planning & Monitoring (2022). PNG Forest cover, log exports and concessions statistics [Dataset]. https://png-data.sprep.org/dataset/png-forest-cover-log-exports-and-concessions-statistics
    Available download formats: xlsx(23829), xlsx(13416), xlsx(225142), xlsx(27802), xlsx(20378), xlsx(13660)
    Dataset provided by
    Papua New Guinea Forestry Authority
    PNG Department of Agriculture and Livestock
    PNG Department of National Planning & Monitoring
    Climate Change and Development Authority in PNG
    PNG Conservation and Environment Protection Authority
    License

    Public Domain Mark 1.0, https://creativecommons.org/publicdomain/mark/1.0/
    License information was derived automatically

    Area covered
    Papua New Guinea
    Description

    A summary of various datasets on logging concessions, exports, and forest cover is presented here.

  10. ACES LOG DATA V1

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • datasets.ai
    • +5 more
    Updated Feb 19, 2025
    + more versions
    Cite
    nasa.gov (2025). ACES LOG DATA V1 [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/aces-log-data-v1
    Dataset provided by
    NASA, http://nasa.gov/
    Description

    The ALTUS Cloud Electrification Study (ACES) was based at the Naval Air Facility Key West in Florida. During August 2002, ACES researchers conducted overflights of thunderstorms over the southwestern corner of Florida. For the first time in NASA research, an uninhabited aerial vehicle (UAV) named ALTUS was used to collect cloud electrification data. Carrying field mills, optical sensors, electric field sensors and other instruments, ALTUS allowed scientists to collect cloud electrification data for the first time from above the storm, from its birth through dissipation. This experiment allowed scientists to achieve the dual goals of gathering weather data safely and testing new aircraft technology. This dataset consists of log data from each flight, and yields instrument and aircraft status throughout the flight.

  11. Logging industries, principal statistics by North American Industry...

    • www150.statcan.gc.ca
    Updated Dec 18, 2024
    + more versions
    Cite
    Government of Canada, Statistics Canada (2024). Logging industries, principal statistics by North American Industry Classification System (NAICS) (x 1,000) [Dataset]. http://doi.org/10.25318/1610011401-eng
    Dataset provided by
    Statistics Canada, https://statcan.gc.ca/en
    Area covered
    Canada
    Description

    This table contains data described by the following dimensions (Not all combinations are available): Geography (16 items: Canada; Atlantic Region; Newfoundland and Labrador; Prince Edward Island; ...) Principal statistics (16 items: Total revenue; Revenue from logging activities; Total expenses; Total salaries and wages, direct and indirect labour; ...) North American Industry Classification System (NAICS) (3 items: Logging; Logging (except contract); Contract Logging).

  12. EDGAR Log File Data Sets

    • catalog.data.gov
    Updated Jul 22, 2025
    Cite
    EDGAR Business Office (2025). EDGAR Log File Data Sets [Dataset]. https://catalog.data.gov/dataset/edgar-log-file-data-set
    Dataset provided by
    Electronic Data Gathering, Analysis, and Retrieval, http://www.sec.gov/edgar.shtml
    Description

    The data sets provide information on internet search traffic for EDGAR filings through SEC.gov.

  13. Log Management Market Study by Solutions and Services for IT & ITeS,...

    • factmr.com
    csv, pdf
    Updated Jun 19, 2024
    Cite
    Fact.MR (2024). Log Management Market Study by Solutions and Services for IT & ITeS, Banking, Financial Services & Insurance, Healthcare, Retail & e-Commerce, Telecom, and Education from 2024 to 2034 [Dataset]. https://www.factmr.com/report/log-management-market
    Available download formats: csv, pdf
    License

    https://www.factmr.com/privacy-policy

    Time period covered
    2024 - 2034
    Area covered
    Worldwide
    Description

    Revenue from the global log management market is estimated to reach US$ 3.31 billion in 2024. The market is projected to climb to a value of US$ 11.03 billion by the end of 2034, expanding at a remarkable CAGR of 12.8% over the next ten years (2024 to 2034).
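    As a quick sanity check, compounding the 2024 estimate at the stated CAGR reproduces the 2034 projection (a back-of-the-envelope calculation, not part of the report):

```python
# Compound US$ 3.31 B at 12.8% per year over the 10-year forecast horizon.
value_2024 = 3.31              # US$ billion (2024 estimate)
cagr = 0.128                   # 12.8% per year
value_2034 = value_2024 * (1 + cagr) ** 10
print(round(value_2034, 2))    # ≈ 11.04, matching the projected US$ 11.03 B
```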

    Report Attribute                             Details
    Log Management Market Size (2024E)           US$ 3.31 Billion
    Projected Market Value (2034F)               US$ 11.03 Billion
    Global Market Growth Rate (2024 to 2034)     12.8% CAGR
    East Asia Market Growth Rate (2024 to 2034)  14.3% CAGR
    North America Market Growth Rate (2034)      11.9% CAGR
    Solutions Segment Market Value (2024)        US$ 2.65 Billion
    Cloud-Based Segment Value (2024)             US$ 2.25 Billion
    Key Companies Profiled                       SolarWinds; IBM; Micro Focus; Rapid7; Intel Security; Blackstratus; Solarwinds Worldwide; IBM Corporation; Veriato Inc.; Splunk Inc.; Loggly Inc.

    Country-wise Insights

    Attribute | United States
    Market Value (2024E) | US$ 905.1 Million
    Growth Rate (2024 to 2034) | 11.6% CAGR
    Projected Value (2034F) | US$ 2.7 Billion
    Attribute | Japan
    Market Value (2024E) | US$ 210.3 Million
    Growth Rate (2024 to 2034) | 14.4% CAGR
    Projected Value (2034F) | US$ 810.9 Million

    Category-wise Insights

    Attribute | Solutions
    Segment Value (2024E) | US$ 2.65 Billion
    Growth Rate (2024 to 2034) | 12.2% CAGR
    Projected Value (2034F) | US$ 8.38 Billion
    Attribute | Cloud-Based Log Management
    Segment Value (2024E) | US$ 2.25 Billion
    Growth Rate (2024 to 2034) | 12.5% CAGR
    Projected Value (2034F) | US$ 7.28 Billion
  14. Moodle Log Data

    • ieee-dataport.org
    Updated Jun 6, 2024
    Cite
    A M Abirami (2024). Moodle Log Data [Dataset]. https://ieee-dataport.org/documents/moodle-log-data
    Explore at:
    Dataset updated
    Jun 6, 2024
    Authors
    A M Abirami
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Time

  15. Data from: Utah FORGE: Updated FMI Fracture Log from Well 16A(78)-32

    • catalog.data.gov
    • gdr.openei.org
    • +3more
    Updated Jan 20, 2025
    + more versions
    Cite
    Energy and Geoscience Institute at the University of Utah (2025). Utah FORGE: Updated FMI Fracture Log from Well 16A(78)-32 [Dataset]. https://catalog.data.gov/dataset/utah-forge-updated-fmi-fracture-log-from-well-16a78-32-af87f
    Explore at:
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    Energy and Geoscience Institute at the University of Utah
    Description

    This dataset consists of an Excel spreadsheet detailing the fracture picks from a reinterpretation of the formation micro-imaging (FMI) log from Utah FORGE well 16A(78)-32. The provided information details fracture location, geometry, and type. Also included here is a link to the original raw and processed FMI logs, as well as other data from the 2021 well logging.

  16. Data from: hostnames

    • huggingface.co
    Updated Sep 28, 2024
    Cite
    nyuuzyou (2024). hostnames [Dataset]. https://huggingface.co/datasets/nyuuzyou/hostnames
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Sep 28, 2024
    Authors
    nyuuzyou
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    CT Log Archive: Download Hostnames from Inactive Logs

    Warning

    The data was obtained by parsing CT with modified software, which will be discussed below. I cannot guarantee the accuracy of the data, as it is possible that some data was lost due to a variety of factors, such as the data from Let's Encrypt Oak 2022. If you want to do passive data analysis and you want maximum accuracy for each specific CT, then I highly recommend that you use self-written software that has… See the full description on the dataset page: https://huggingface.co/datasets/nyuuzyou/hostnames.

  17. All Employees, Logging

    • fred.stlouisfed.org
    json
    Updated Aug 1, 2025
    + more versions
    Cite
    (2025). All Employees, Logging [Dataset]. https://fred.stlouisfed.org/series/CEU1011330001
    Explore at:
    Available download formats: json
    Dataset updated
    Aug 1, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Description

    Graph and download economic data for All Employees, Logging (CEU1011330001) from Jan 1947 to Jul 2025 about logging, mining, establishment survey, employment, and USA.

  18. Data from: Traffic and Log Data Captured During a Cyber Defense Exercise

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 12, 2020
    Cite
    Stanislav Špaček (2020). Traffic and Log Data Captured During a Cyber Defense Exercise [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3746128
    Explore at:
    Dataset updated
    Jun 12, 2020
    Dataset provided by
    Daniel Tovarňák
    Stanislav Špaček
    Jan Vykopal
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was acquired during Cyber Czech – a hands-on cyber defense exercise (Red Team/Blue Team) held in March 2019 at Masaryk University, Brno, Czech Republic. Network traffic flows and a wide variety of event logs were captured in an exercise network deployed in the KYPO Cyber Range Platform.

    Contents

    The dataset covers two distinct time intervals, which correspond to the official schedule of the exercise. The timestamps provided below are in the ISO 8601 date format.

    Day 1, March 19, 2019

    Start: 2019-03-19T11:00:00.000000+01:00

    End: 2019-03-19T18:00:00.000000+01:00

    Day 2, March 20, 2019

    Start: 2019-03-20T08:00:00.000000+01:00

    End: 2019-03-20T15:30:00.000000+01:00
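    These interval boundaries can be consumed directly in an analysis script; a minimal sketch using only Python's standard library (nothing dataset-specific) that parses the timestamps above and derives each day's capture duration:

```python
from datetime import datetime

# Capture windows exactly as listed in the dataset description (ISO 8601, UTC+01:00).
day1_start = datetime.fromisoformat("2019-03-19T11:00:00.000000+01:00")
day1_end = datetime.fromisoformat("2019-03-19T18:00:00.000000+01:00")
day2_start = datetime.fromisoformat("2019-03-20T08:00:00.000000+01:00")
day2_end = datetime.fromisoformat("2019-03-20T15:30:00.000000+01:00")

# Durations of the two exercise days.
print(day1_end - day1_start)  # 7:00:00
print(day2_end - day2_start)  # 7:30:00
```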

    The captured and collected data were normalized into three distinct event types and they are stored as structured JSON. The data are sorted by a timestamp, which represents the time they were observed. Each event type includes a raw payload ready for further processing and analysis. The description of the respective event types and the corresponding data files follows.

    cz.muni.csirt.IpfixEntry.tgz – an archive of IPFIX traffic flows enriched with an additional payload of parsed application protocols in raw JSON.

    cz.muni.csirt.SyslogEntry.tgz – an archive of Linux Syslog entries with the payload of corresponding text-based log messages.

    cz.muni.csirt.WinlogEntry.tgz – an archive of Windows Event Log entries with the payload of original events in raw XML.

    Each archive listed above includes a directory of the same name with the following four files, ready to be processed.

    data.json.gz – the actual data entries in a single gzipped JSON file.

    dictionary.yml – data dictionary for the entries.

    schema.ddl – data schema for Apache Spark analytics engine.

    schema.jsch – JSON schema for the entries.
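    A short sketch of how the gzipped data files might be consumed. It assumes one JSON object per line inside data.json.gz; the description above does not state the exact layout, so adjust if the file holds a single JSON array. The demo writes a tiny synthetic sample rather than touching the real dataset:

```python
import gzip
import json
import os
import tempfile

def read_entries(path):
    """Stream events from a gzipped JSON file, assuming one JSON object per line."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)

# Synthetic two-event sample (illustrative field names, not real dataset content).
sample = [
    {"timestamp": "2019-03-19T11:00:01+01:00", "type": "cz.muni.csirt.SyslogEntry"},
    {"timestamp": "2019-03-19T11:00:02+01:00", "type": "cz.muni.csirt.IpfixEntry"},
]
path = os.path.join(tempfile.mkdtemp(), "data.json.gz")
with gzip.open(path, "wt", encoding="utf-8") as fh:
    fh.write("\n".join(json.dumps(event) for event in sample))

events = list(read_entries(path))
print(len(events))  # 2
```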

    Finally, the exercise network topology is described in the machine-readable NetJSON format and is part of an auxiliary files archive – auxiliary-material.tgz – which includes the following.

    global-gateway-config.json – the network configuration of the global gateway in the NetJSON format.

    global-gateway-routing.json – the routing configuration of the global gateway in the NetJSON format.

    redteam-attack-schedule.{csv,odt} – the schedule of the Red Team attacks in CSV and ODT format. Source for Table 2.

    redteam-reserved-ip-ranges.{csv,odt} – the list of IP segments reserved for the Red Team in CSV and ODT format. Source for Table 1.

    topology.{json,pdf,png} – the topology of the complete Cyber Czech exercise network in the NetJSON, PDF and PNG format.

    topology-small.{pdf,png} – simplified topology in the PDF and PNG format. Source for Figure 1.

  19. Log summary dataset.

    • ieee-dataport.org
    Updated Jul 17, 2022
    Cite
    Yuzhe Zhang (2022). Log summary dataset. [Dataset]. https://ieee-dataport.org/documents/log-summary-dataset
    Explore at:
    Dataset updated
    Jul 17, 2022
    Authors
    Yuzhe Zhang
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    HPC

  20. All Employees, Mining and Logging

    • fred.stlouisfed.org
    json
    Updated Aug 1, 2025
    Cite
    (2025). All Employees, Mining and Logging [Dataset]. https://fred.stlouisfed.org/series/CEU1000000001
    Explore at:
    Available download formats: json
    Dataset updated
    Aug 1, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Description

    Graph and download economic data for All Employees, Mining and Logging (CEU1000000001) from Jan 1939 to Jul 2025 about logging, mining, establishment survey, employment, and USA.

Cite
Rauber, Andreas (2023). AIT Log Data Set V1.1 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3723082

AIT Log Data Set V1.1

Explore at:
Dataset updated
Oct 18, 2023
Dataset provided by
Wurzenberger, Markus
Hotwagner, Wolfgang
Landauer, Max
Skopik, Florian
Rauber, Andreas
License

Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically

Description

AIT Log Data Sets

This repository contains synthetic log data suitable for evaluation of intrusion detection systems. The logs were collected from four independent testbeds that were built at the Austrian Institute of Technology (AIT) following the approach by Landauer et al. (2020) [1]. Please refer to the paper for more detailed information on automatic testbed generation and cite it if the data is used for academic publications. In brief, each testbed simulates user accesses to a webserver that runs Horde Webmail and OkayCMS. The duration of the simulation is six days. On the fifth day (2020-03-04) two attacks are launched against each web server.

The archive AIT-LDS-v1_0.zip contains the directories "data" and "labels".

The data directory is structured as follows. Each directory mail..com contains the logs of one web server. Each directory user- contains the logs of one user host machine, where one or more users are simulated. Each file log.log in the user- directories contains the activity logs of one particular user.

Setup details of the web servers:

OS: Debian Stretch 9.11.6

Services:

Apache2

PHP7

Exim 4.89

Horde 5.2.22

OkayCMS 2.3.4

Suricata

ClamAV

MariaDB

Setup details of user machines:

OS: Ubuntu Bionic

Services:

Chromium

Firefox

User host machines are assigned to web servers in the following way:

mail.cup.com is accessed by users from host machines user-{0, 1, 2, 6}

mail.spiral.com is accessed by users from host machines user-{3, 5, 8}

mail.insect.com is accessed by users from host machines user-{4, 9}

mail.onion.com is accessed by users from host machines user-{7, 10}

The following attacks are launched against the web servers (different starting times for each web server, please check the labels for exact attack times):

Attack 1: multi-step attack with sequential execution of the following attacks:

nmap scan

nikto scan

smtp-user-enum tool for account enumeration

hydra brute force login

webshell upload through Horde exploit (CVE-2019-9858)

privilege escalation through Exim exploit (CVE-2019-10149)

Attack 2: webshell injection through malicious cookie (CVE-2019-16885)

Attacks are launched from the following user host machines. In each of the corresponding directories user-, logs of the attack execution are found in the file attackLog.txt:

user-6 attacks mail.cup.com

user-5 attacks mail.spiral.com

user-4 attacks mail.insect.com

user-7 attacks mail.onion.com

The log data collected from the web servers includes

Apache access and error logs

syscall logs collected with the Linux audit daemon

suricata logs

exim logs

auth logs

daemon logs

mail logs

syslogs

user logs

Note that due to their large size, the audit/audit.log files of each server were compressed into a .zip archive. If these logs are needed for analysis, they must first be unzipped.
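Unzipping can be done with Python's standard zipfile module; a minimal sketch, where the archive and member names are synthetic stand-ins for the per-server audit archives:

```python
import os
import tempfile
import zipfile

def extract_audit_log(zip_path, out_dir):
    """Extract a compressed audit log archive and list the extracted files."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)
    return sorted(os.listdir(out_dir))

# Demo with a synthetic archive containing a dummy audit.log (not real dataset content).
tmp = tempfile.mkdtemp()
zip_path = os.path.join(tmp, "audit.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("audit.log", "type=SYSCALL msg=audit(...): example line\n")
extracted = extract_audit_log(zip_path, os.path.join(tmp, "extracted"))
print(extracted)  # ['audit.log']
```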

Labels are organized in the same directory structure as the logs. Each label file contains two comma-separated labels for each log line: the first based on occurrence time, the second based on similarity and ordering. Note that this does not guarantee correct labeling for all lines and that no manual corrections were conducted.
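Based on the comma-separated format just described, a label line could be split as follows; the concrete label values shown are hypothetical, so consult the label files for the actual vocabulary:

```python
def parse_label_line(line):
    """Split one label line into (occurrence-time label, similarity/ordering label)."""
    time_label, similarity_label = line.strip().split(",", 1)
    return time_label, similarity_label

# Hypothetical example values for illustration only.
print(parse_label_line("0,0"))       # ('0', '0')
print(parse_label_line("attack,0"))  # ('attack', '0')
```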

Version history and related data sets:

AIT-LDS-v1.0: Four datasets, logs from single host, fine-granular audit logs, mail/CMS.

AIT-LDS-v1.1: Removed carriage return of line endings in audit.log files.

AIT-LDS-v2.0: Eight datasets, logs from all hosts, system logs and network traffic, mail/CMS/cloud/web.

Acknowledgements: Partially funded by the FFG projects INDICAETING (868306) and DECEPT (873980), and the EU project GUARD (833456).

If you use the dataset, please cite the following publication:

[1] M. Landauer, F. Skopik, M. Wurzenberger, W. Hotwagner and A. Rauber, "Have it Your Way: Generating Customized Log Datasets With a Model-Driven Simulation Testbed," in IEEE Transactions on Reliability, vol. 70, no. 1, pp. 402-415, March 2021, doi: 10.1109/TR.2020.3031317. [PDF]
