100+ datasets found
  1. Production Analysis with Process Mining Technology

    • figshare.com
    • data.4tu.nl
    zip
    Updated Jul 28, 2020
    Cite
    Dafna Levy (2020). Production Analysis with Process Mining Technology [Dataset]. http://doi.org/10.4121/uuid:68726926-5ac5-4fab-b873-ee76ea412399
    Available download formats: zip
    Dataset updated
    Jul 28, 2020
    Dataset provided by
    4TU.ResearchData
    Authors
    Dafna Levy
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    The comma-separated values (CSV) dataset contains process data from a production process, including fields for cases, activities, resources, timestamps, and more.
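    A minimal sketch of loading such a CSV log with pandas and summarizing cases; the file name and column names ("Case ID", "Activity", "Start Timestamp") are assumptions, so check the actual header first.

    ```python
    import pandas as pd

    # Load the production event log (file and column names are assumed, not confirmed).
    df = pd.read_csv("production_data.csv", parse_dates=["Start Timestamp"])
    df = df.sort_values(["Case ID", "Start Timestamp"])

    # Events per case and the ordered activity sequence (variant) of each case.
    events_per_case = df.groupby("Case ID").size()
    variants = df.groupby("Case ID")["Activity"].apply(list)

    print(events_per_case.describe())
    print(variants.head())
    ```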

  2. Process Models obtained from event logs with different information-preserving abstractions

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Sander J.J. Leemans (2020). Process Models obtained from event logs with different information-preserving abstractions [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3243987
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Sander J.J. Leemans
    Dirk Fahland
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the results of the experiment analyzing information preservation and recovery by different event log abstractions in process mining, described in: Sander J.J. Leemans and Dirk Fahland, "Information-Preserving Abstractions of Event Data in Process Mining", Knowledge and Information Systems, ISSN 0219-1377 (print), 0219-3116 (online), accepted May 2019.

    The experiment results were obtained with: https://doi.org/10.5281/zenodo.3243981

  3. Public benchmark dataset for Conformance Checking in Process Mining

    • figshare.unimelb.edu.au
    • melbourne.figshare.com
    xml
    Updated Jan 30, 2022
    Cite
    Daniel Reissner (2022). Public benchmark dataset for Conformance Checking in Process Mining [Dataset]. http://doi.org/10.26188/5cd91d0d3adaa
    Available download formats: xml
    Dataset updated
    Jan 30, 2022
    Dataset provided by
    The University of Melbourne
    Authors
    Daniel Reissner
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains a variety of publicly available real-life event logs. For each event log we derived two types of Petri nets with two state-of-the-art process miners: Inductive Miner (IM) and Split Miner (SM). Each event log-Petri net pair is intended for evaluating the scalability of existing conformance checking techniques. We used this dataset to evaluate the scalability of the S-Component approach for measuring fitness. The dataset contains tables of descriptive statistics of both the process models and the event logs. In addition, it includes time-performance results, measured in milliseconds, for several approaches in both multi-threaded and single-threaded executions. Last, the dataset contains a cost comparison of the different approaches and reports the degree of over-approximation of the S-Component approach. The compared conformance checking techniques are described here: https://arxiv.org/abs/1910.09767.

    Update: The dataset has been extended with the event logs of BPIC18 and BPIC19. BPIC19 is in fact a collection of four different processes and was therefore split into four event logs. For each of the additional five event logs, again, two process models were mined with Inductive Miner and Split Miner. We used the extended dataset to test the scalability of our tandem-repeats approach for measuring fitness. The dataset now contains updated tables of log and model statistics as well as tables of the conducted experiments measuring execution time and raw fitness cost of various fitness approaches. The compared conformance checking techniques are described here: https://arxiv.org/abs/2004.01781.

    Update: The dataset has also been used to measure the scalability of a new generalization measure based on concurrent and repetitive patterns. A concurrency oracle is used in tandem with partial orders to identify concurrent patterns in the log, which are tested against parallel blocks in the process model. Tandem repeats are used with various trace reductions and extensions to define repetitive patterns in the log, which are tested against loops in the process model. Each pattern is assigned a partial fulfillment; generalization is then the average of pattern fulfillments weighted by the trace counts for which the patterns were observed. The dataset now includes the time results and a breakdown of generalization values.
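    As a reading aid, here is a sketch of replaying one event log / Petri net pair from the benchmark with pm4py's generic alignment-based fitness; this is not the S-Component or tandem-repeats implementation evaluated in the papers, the file names are placeholders, and pm4py >= 2.x is assumed.

    ```python
    import pm4py

    # One log/model pair from the benchmark (placeholder file names).
    log = pm4py.read_xes("BPIC19_1.xes")               # event log in XES format
    net, im, fm = pm4py.read_pnml("BPIC19_1_IM.pnml")  # Petri net mined with Inductive Miner

    # Alignment-based fitness as a stand-in for the approaches compared in the papers.
    fitness = pm4py.fitness_alignments(log, net, im, fm)
    print(fitness)  # dictionary with average fitness and the share of fitting traces
    ```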

  4. Event log with data attributes

    • figshare.com
    xml
    Updated Aug 31, 2022
    Cite
    Dina Bayomie (2022). Event log with data attributes [Dataset]. http://doi.org/10.6084/m9.figshare.20736706.v1
    Available download formats: xml
    Dataset updated
    Aug 31, 2022
    Dataset provided by
    figshare
    Authors
    Dina Bayomie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These 60 event logs vary in the number of cases and in the density of overlapping cases. Each log has the following event attributes: event id, case id, activity, timestamp, loan type, amount, resource, and status. BPMN scenarios were used to simulate the process.
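    A sketch, assuming the logs are flattened to a table with the attributes listed above (column names are assumptions), that estimates how densely cases overlap by counting concurrently open cases over time.

    ```python
    import pandas as pd

    df = pd.read_csv("loan_log.csv", parse_dates=["timestamp"])  # placeholder file name
    spans = df.groupby("case id")["timestamp"].agg(start="min", end="max")

    # +1 when a case starts, -1 when it ends; the running sum is the number of open cases.
    changes = pd.concat([
        pd.Series(1, index=spans["start"]),
        pd.Series(-1, index=spans["end"]),
    ]).sort_index()
    open_cases = changes.cumsum()

    print("maximum number of concurrently open cases:", open_cases.max())
    ```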

  5. Data-driven Process Discovery - Artificial Event Log

    • figshare.com
    zip
    Updated Jun 2, 2023
    Cite
    Felix Mannhardt (2023). Data-driven Process Discovery - Artificial Event Log [Dataset]. http://doi.org/10.4121/uuid:32cad43f-8bb9-46af-8333-48aae2bea037
    Available download formats: zip
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Felix Mannhardt
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    A synthetic event log with 100,000 traces and 900,000 events that was generated by simulating a simple artificial process model. There are three data attributes in the event log: Priority, Nurse, and Type. Some paths in the model are recorded infrequently based on the value of these attributes. Noise is added by randomly adding one additional event to an increasing number of traces. CPN Tools (http://cpntools.org) was used to generate the event log and inject the noise.
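    The noise was injected with CPN Tools as part of the simulation; the snippet below is only an illustrative re-sketch of the stated idea (one extra event inserted at a random position in a growing share of traces) on a plain list-of-traces representation.

    ```python
    import random

    def add_noise(traces, fraction, noise_activity="NoiseEvent", seed=42):
        """Insert one extra event at a random position in `fraction` of the traces."""
        rng = random.Random(seed)
        noisy = [list(trace) for trace in traces]
        for idx in rng.sample(range(len(noisy)), k=int(fraction * len(noisy))):
            position = rng.randrange(len(noisy[idx]) + 1)
            noisy[idx].insert(position, noise_activity)
        return noisy

    clean = [["Register", "Check", "Decide", "Archive"]] * 10
    print(add_noise(clean, fraction=0.3)[:3])
    ```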

  6. CeLOE event log sample

    • dataverse.telkomuniversity.ac.id
    tsv
    Updated Apr 20, 2022
    Cite
    Telkom University Dataverse (2022). CeLOE event log sample [Dataset]. http://doi.org/10.34820/FK2/9FT77M
    Available download formats: tsv (10066), tsv (19847)
    Dataset updated
    Apr 20, 2022
    Dataset provided by
    Telkom University Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This study analyses an event log, automatically generated by the CeLOE LMS, that records student and lecturer activities during learning. The event log is mined to obtain a process model representing the learning behaviours of lecturers and students. The case study covers learning in study program 365 during the first semester of 2020/2021.
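    A sketch of loading one of the TSV samples and discovering a directly-follows graph with pm4py; the column names and the tab separator are assumptions, so inspect the files first.

    ```python
    import pandas as pd
    import pm4py

    df = pd.read_csv("celoe_event_log_sample.tsv", sep="\t")  # placeholder file name

    # Map the assumed columns onto the standard XES keys pm4py expects.
    df = pm4py.format_dataframe(df, case_id="case_id",
                                activity_key="activity",
                                timestamp_key="timestamp")

    dfg, start_activities, end_activities = pm4py.discover_dfg(df)
    print(dfg)
    ```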

  7. Differentially Private Event Logs for Process Mining: Supplementary Material

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 19, 2024
    Cite
    Gamal Elkoumy (2024). Differentially Private Event Logs for Process Mining: Supplementary Material [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4601138
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Marlon Dumas
    Alisa Pankova
    Gamal Elkoumy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this archive, we provide supplementary material for our paper entitled "Mine Me but Don’t Single Me Out: Differentially Private Event Logs for Process Mining". We list the selected event logs together with their characteristics and descriptive statistics. The archive also contains the anonymized event logs resulting from the experiments. The source code is available on GitHub.

  8. Validation of Precision Measures - Event Logs and Process Models

    • figshare.com
    • data.4tu.nl
    zip
    Updated Jun 8, 2023
    Cite
    Niek Tax (2023). Validation of Precision Measures - Event Logs and Process Models [Dataset]. http://doi.org/10.4121/uuid:991753f7-a240-4ba6-a8a8-67174a08c51b
    Available download formats: zip
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Niek Tax
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    This collection contains the event logs and process models described and used in the paper "The Imprecisions of Precision Measures in Process Mining".

  9. Dataset of mHealth event logs

    • figshare.com
    pdf
    Updated May 1, 2022
    Cite
    Raoul Nuijten; Pieter Van Gorp (2022). Dataset of mHealth event logs [Dataset]. http://doi.org/10.6084/m9.figshare.19688730.v2
    Available download formats: pdf
    Dataset updated
    May 1, 2022
    Dataset provided by
    figshare
    Authors
    Raoul Nuijten; Pieter Van Gorp
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    How does Facebook always seem to know what the next funny video should be to sustain your attention on the platform? Facebook has not asked you whether you like videos of cats doing something funny: it just seems to know. In fact, Facebook learns from your behavior on the platform (e.g., how long you have engaged with similar videos, which posts you have previously liked or commented on, etc.). As a result, Facebook is able to sustain its users' attention for a long time. The typical mHealth app, on the other hand, suffers from rapidly collapsing user engagement. To sustain engagement levels, mHealth apps nowadays employ all sorts of intervention strategies. Of course, it would be powerful to know, like Facebook knows, which strategy should be presented to which individual to sustain their engagement. A first step towards this could be to cluster similar users (and then derive intervention strategies from there). This dataset was collected through a single mHealth app over 8 different mHealth campaigns (i.e., scientific studies). Using this dataset, one could derive clusters from app user event data. One approach could be to differentiate between two phases: a process mining phase and a clustering phase. In the process mining phase, one derives from the dataset the processes (i.e., sequences of app actions) that users undertake. In the clustering phase, based on the processes different users engaged in, one clusters similar users (i.e., users that perform similar sequences of app actions).

    List of files

    • 0-list-of-variables.pdf includes an overview of different variables within the dataset.
    • 1-description-of-endpoints.pdf includes a description of the unique endpoints that appear in the dataset.
    • 2-requests.csv includes the dataset with actual app user event data.
    • 2-requests-by-session.csv includes the dataset with actual app user event data plus a session variable, to differentiate between user requests that were made in the same session.
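    One possible realization of the two-phase idea above: derive each user's sequence of app actions from 2-requests.csv and then cluster users on their action-frequency profiles. The column names ("user_id", "endpoint", "timestamp") are assumptions; consult 0-list-of-variables.pdf for the real ones.

    ```python
    import pandas as pd
    from sklearn.cluster import KMeans

    df = pd.read_csv("2-requests.csv", parse_dates=["timestamp"])
    df = df.sort_values(["user_id", "timestamp"])

    # Phase 1 (stand-in for process mining): per-user sequences of app actions.
    sequences = df.groupby("user_id")["endpoint"].apply(list)

    # Phase 2: cluster users on how often they perform each action.
    profiles = pd.crosstab(df["user_id"], df["endpoint"])
    labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(profiles)

    print(pd.Series(labels, index=profiles.index).value_counts())
    ```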

  10. Data from: An Empirical Evaluation of Unsupervised Event Log Abstraction Techniques in Process Mining

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Nov 27, 2023
    Cite
    Greg Van Houdt; Massimiliano de Leoni; Benoît Depaire; Niels Martin (2023). An Empirical Evaluation of Unsupervised Event Log Abstraction Techniques in Process Mining [Dataset]. http://doi.org/10.5281/zenodo.6793544
    Available download formats: bin
    Dataset updated
    Nov 27, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Greg Van Houdt; Massimiliano de Leoni; Benoît Depaire; Niels Martin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This upload contains the event logs, generated by L-Sim, on which the experiments of the related paper were performed.

    The related paper has been accepted in the journal Information Systems.

  11. A Collection of Event Logs of Blockchain-based Applications

    • zenodo.org
    • data.niaid.nih.gov
    Updated Apr 24, 2025
    Cite
    Fouzia Alzhrani (2025). A Collection of Event Logs of Blockchain-based Applications [Dataset]. http://doi.org/10.5281/zenodo.6637059
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Fouzia Alzhrani
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A set of event logs of 101 blockchain-based applications (DApps). For each DApp there are two event log files: a raw version, in which the data is encoded as recorded on the blockchain, and a decoded version, in which the data has been decoded into a human-readable format. If a DApp has multiple versions on different blockchain networks, there are two event log files (encoded and decoded) for each version. In addition, the event registry file includes a comprehensive list of event names and their corresponding signatures obtained from the contract ABIs of the 101 DApps.

  12. Simplified Event Logs for Sepsis Patient Trajectories

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 19, 2024
    Cite
    Ghahremani, Sona (2024). Simplified Event Logs for Sepsis Patient Trajectories [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3989589
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Sakizloglou, Lucas
    Ghahremani, Sona
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains a simplified excerpt from a real event log that tracks the trajectories of patients admitted to a hospital to be treated for sepsis, a life-threatening condition. The log was recorded by the hospital's Enterprise Resource Planning system. Additionally, the dataset contains three synthetic logs that increase the number of trajectories within the original log's timespan while maintaining its other statistical characteristics.

    In total, the dataset contains four files in .zip format and a companion document that describes the statistical method used to synthesize the logs as well as the dataset content in detail. The dataset can be used to test the performance of event-based process-mining and log (runtime) monitoring tools against an increasing load of events.

  13. Artificial datasets for multi-perspective Declare analysis

    • zenodo.org
    zip
    Updated Jan 24, 2020
    Cite
    Andrea Burattin (2020). Artificial datasets for multi-perspective Declare analysis [Dataset]. http://doi.org/10.5281/zenodo.20030
    Available download formats: zip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andrea Burattin
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This file contains the dataset we used for the evaluation of the multi-perspective Declare analysis.

    Logs

    In particular, it contains logs with different sizes and different trace lengths. We generated traces with 10, 20, 30, 40, and 50 events and, for each of these lengths, we generated logs with 25000, 50000, 75000, and 100000 traces. Therefore, in total, there are 20 logs.

    Declare models

    In addition, the dataset contains 10 Declare models. In particular, we prepared two models with 10 constraints, one only containing constraints on the control-flow (without conditions on data and time), and another one including real multi-perspective constraints (with conditions on time and data). We followed the same procedure to create models with 20, 30, 40, and 50 constraints.

  14. Data from: An IoT-Enriched Event Log for Process Mining in Smart Factories

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 10, 2024
    + more versions
    Cite
    Malburg, Lukas (2024). An IoT-Enriched Event Log for Process Mining in Smart Factories [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_7795547
    Dataset updated
    Jun 10, 2024
    Dataset provided by
    Malburg, Lukas
    Bergmann, Ralph
    Grüger, Joscha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    DEPRECATED - current version: https://figshare.com/articles/dataset/Dataset_An_IoT-Enriched_Event_Log_for_Process_Mining_in_Smart_Factories/20130794

    Modern technologies such as the Internet of Things (IoT) are becoming increasingly important in various domains, including Business Process Management (BPM) research. One main research area in BPM is process mining, which can be used to analyze event logs, e.g., for checking the conformance of running processes. However, only a few IoT-based event logs are available for research purposes. Some of them are artificially generated and do not always completely reflect the actual physical properties of smart environments. In this paper, we present an IoT-enriched XES event log generated by a physical smart factory. For this purpose, we created the DataStream XES extension for representing IoT data in event logs. Finally, we present some preliminary analyses and properties of the log.
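    A sketch of opening the log with pm4py and listing which attributes (standard XES keys plus the IoT/DataStream enrichments) the events carry; the file name is a placeholder, and recent pm4py versions return the log as a pandas DataFrame. Note the deprecation pointer above for the current version of the dataset.

    ```python
    import pm4py

    df = pm4py.read_xes("iot_enriched_log.xes")  # placeholder file name
    print(df.columns.tolist())                   # XES keys plus sensor-related attributes
    print(df.head())
    ```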

  15. Activities of daily living of several individuals

    • narcis.nl
    • data.4tu.nl
    • +2 more
    Updated Nov 3, 2015
    + more versions
    Cite
    Timo Sztyler; J. (Josep) Carmona (2015). Activities of daily living of several individuals [Dataset]. http://doi.org/10.4121/uuid:01eaba9f-d3ed-4e04-9945-b8b302764176
    Available download formats (media types): application/x-gzip, application/zip, text/plain
    Dataset updated
    Nov 3, 2015
    Dataset provided by
    University of Mannheim, Germany
    Authors
    Timo Sztyler; J. (Josep) Carmona
    Description

    This dataset comprises event logs (XES, Extensible Event Stream) of activities of daily living performed by several individuals. The event logs were derived from sensor data collected in different scenarios and cover activities such as sleeping, meal preparation, and washing. They show the different behaviour of people in their own homes as well as common patterns. The attached event logs were created with Fluxicon Disco (http://fluxicon.com/disco/).

  16. Statechart Workbench and Alignments Software Event Log

    • search.datacite.org
    Updated Aug 31, 2018
    Cite
    Maikel Leemans (2018). Statechart Workbench and Alignments Software Event Log [Dataset]. http://doi.org/10.4121/uuid:7f787965-da13-4bb8-a3fd-242f08aef9c4
    Dataset updated
    Aug 31, 2018
    Dataset provided by
    DataCite (https://www.datacite.org/)
    Eindhoven University of Technology
    Authors
    Maikel Leemans
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Extensible Event Stream (XES) software event log obtained by instrumenting the Statechart Workbench ProM plugin using the tool available at https://svn.win.tue.nl/repos/prom/XPort/. This event log contains method-call-level events describing a workbench run invoking the Alignments algorithm on the BPI Challenge 2012 event log, available and documented at https://doi.org/10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f. Note that the life-cycle information in this log corresponds to method call (start) and return (complete), and captures a method-call hierarchy.
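    Since the life-cycle information encodes method calls (start) and returns (complete), the call hierarchy can be rebuilt with a per-trace stack; the sketch below assumes the log has already been read into traces whose events expose the standard XES keys "concept:name" and "lifecycle:transition".

    ```python
    def call_tree(trace):
        """Return (method, children) tuples for the top-level calls of one trace."""
        root = ("<trace>", [])
        stack = [root]
        for event in trace:
            name = event["concept:name"]
            transition = event["lifecycle:transition"].lower()
            if transition == "start":
                node = (name, [])
                stack[-1][1].append(node)   # nest under the currently open call
                stack.append(node)
            elif transition == "complete" and len(stack) > 1:
                stack.pop()                 # the innermost open call returns
        return root[1]

    example = [
        {"concept:name": "run", "lifecycle:transition": "start"},
        {"concept:name": "align", "lifecycle:transition": "start"},
        {"concept:name": "align", "lifecycle:transition": "complete"},
        {"concept:name": "run", "lifecycle:transition": "complete"},
    ]
    print(call_tree(example))  # [('run', [('align', [])])]
    ```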

  17. Human-Computer Interaction Logs

    • indigo.uic.edu
    zip
    Updated May 30, 2023
    Cite
    Julian Theis; Houshang Darabi (2023). Human-Computer Interaction Logs [Dataset]. http://doi.org/10.25417/uic.11923386.v1
    Available download formats: zip
    Dataset updated
    May 30, 2023
    Dataset provided by
    University of Illinois Chicago
    Authors
    Julian Theis; Houshang Darabi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset comprises ten human-computer interaction logs of real participants who solved a given task in a Windows environment. The participants were allowed to use the standard notepad, calculator, and file explorer. All recordings are anonymized and do not contain any private information.

    Simple: Each of the five log files in the folder simple contains human-computer interaction recordings of a participant solving a simple task. Participants were provided with 30 raw text files, each containing data about the revenue and expenses of a single product for a given time period. In total, 15 summaries had to be created by summarizing the data of two files and calculating the combined revenue, expenses, and profit.

    Complex: Each of the five log files in the folder complex contains human-computer interaction recordings of a participant solving a more advanced task. In particular, participants were given a folder of text documents and were asked to create summary documents containing the total revenue and expenses of the quarter, the profit, and, where applicable, the profit improvement compared to the previous quarter and to the same quarter of the previous year. Each quarter's data comprised multiple text files.

    The logging application used is the one described in: Julian Theis and Houshang Darabi. 2019. Behavioral Petri Net Mining and Automated Analysis for Human-Computer Interaction Recommendations in Multi-Application Environments. Proc. ACM Hum.-Comput. Interact. 3, EICS, Article 13 (June 2019), 16 pages. DOI: https://doi.org/10.1145/3331155

    Please refer to Table 1 and Table 2 of this publication regarding the structure of the log files. The first column corresponds to the timestamp in milliseconds, the second column represents the event key, and the third column contains additional event-specific information.
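    A sketch of reading one interaction log with the three-column layout described above (timestamp in milliseconds, event key, event-specific payload); the file name and the assumption of tab-separated columns are placeholders.

    ```python
    import pandas as pd

    log = pd.read_csv(
        "simple/participant_01.log", sep="\t", header=None,
        names=["timestamp_ms", "event_key", "details"],
    )
    log["timestamp"] = pd.to_datetime(log["timestamp_ms"], unit="ms")

    print(log.head())
    print(log["event_key"].value_counts().head())
    ```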

  18. Environmental permit application process (‘WABO’), CoSeLoG project – Municipality 5

    • figshare.com
    • data.4tu.nl
    txt
    Updated Jun 9, 2023
    + more versions
    Cite
    J.C.A.M. Buijs (2023). Environmental permit application process (‘WABO’), CoSeLoG project – Municipality 5 [Dataset]. http://doi.org/10.4121/uuid:c399c768-d995-4086-adda-c0bc72ad02bc
    Available download formats: txt
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    J.C.A.M. Buijs
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    This data originates from the CoSeLoG project, executed under NWO project number 638.001.211. Within the CoSeLoG project, the (dis)similarities between several processes of different municipalities in the Netherlands have been investigated. This data is part of a collection of 5 event logs that record the execution of a building permit application process in five different anonymous municipalities. The recording of these processes is comparable, which means that activity labels in the different event logs refer to the same activities performed in the five municipalities. The event log contains additional data on the case/trace and event level. In total there are 16 trace-level attributes, which include the parts the permit consists of, the planned end date, and the responsible actor. The event attributes contain the activity label, the timestamp of execution, and the human resource involved. Furthermore, information regarding the planned date of execution and the due date is included. Additionally, events may contain further process data. Some attributes contain Dutch terms and phrases.

    Parent item: Environmental permit application process (‘WABO’), CoSeLoG project.

  19. BPM Synthetic UI Logs Collection

    • zenodo.org
    zip
    Updated Sep 9, 2024
    Cite
    Antonio Martínez-Rojas; Hajo A. Reijers; José González Enríquez (2024). BPM Synthetic UI Logs Collection [Dataset]. http://doi.org/10.5281/zenodo.8202749
    Available download formats: zip
    Dataset updated
    Sep 9, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Antonio Martínez-Rojas; Hajo A. Reijers; José González Enríquez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data package, described in the BPM Demos & Resources publication entitled "BPM Hub: An Open Collection of UI Logs", consists of synthetic UI logs along with corresponding screenshots. The UI logs closely resemble real-world use cases in the administrative domain. They exhibit varying levels of complexity, measured by the number of activities, process variants, and visual features that influence the outcome of decision points. For their generation, the BPM Log Generator tool has been used, which requires the following initial generation configuration:

    Initial Generation Configuration

    • Seed log: Includes a single instance for each process variant and their associated screenshots.
    • Variability configuration:
      • Case-level: Refers to variations in the content that can be introduced or modified by the user, such as variations in the text inputs, selectable options, checkboxes, etc.
      • Scenario-level: Refers to varying the GUI (Graphical User Interface) components related to the look and feel of the different applications appearing in the process screenshots.

    Data Package Contents

    The data package comprises three distinct processes, P1, P2, and P3, for which the initial configuration is provided, i.e., a tuple of

    P1. Client Creation

    • Activities: 5
    • Variants: 2
    • Decision point: Revolves around the presence of an attachment in the reception of an email.

    P2. Client Deletion. User's presence in the system

    • Activities: 7
    • Variants: 2
    • Decision point: Based on the result of the user's search in the Customer Management System (CRM), represented by a checkbox.

    P3. Client Deletion. Validation of customer payments

    • Activities: 7
    • Variants: 4
    • Decision: Involves two conditions:
      1. The presence of an attachment justifying the payment of the invoices in the email.
      2. The existence of pending invoices in the user CRM profile.

    These problems depict processes with a single decision point, without cycles, and executed sequentially to ensure a non-interleaved execution pattern. Particularly, P3 shows higher complexity as its decision point is determined by two visual characteristics.

    Generation of UI Logs

    For each problem, case-level variations have been applied to generate logs with different sizes in the range of {10, 25, 50, 100} events. In cases where the log exceeds the desired size, the last instance is removed to maintain completeness. Each log size has its associated balanced and unbalanced log. Balanced logs have an approximately equal distribution of instances across variants, while unbalanced logs have a frequency difference of more than 20% between the most frequent and least frequent variants.
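    Under one plausible reading of the 20% rule above (difference in relative frequency between the most and least frequent variant), the balance check looks like this; `variant_counts` is a hypothetical mapping from process variant to number of cases.

    ```python
    def is_balanced(variant_counts, threshold=0.20):
        """True if the most and least frequent variants differ by at most `threshold`."""
        total = sum(variant_counts.values())
        shares = [count / total for count in variant_counts.values()]
        return (max(shares) - min(shares)) <= threshold

    print(is_balanced({"V1": 13, "V2": 12}))  # True: roughly equal distribution
    print(is_balanced({"V1": 40, "V2": 10}))  # False: 80% vs. 20%
    ```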

    Scenarios

    To ensure the reliability of the obtained results, 30 scenarios are generated for each tuple

    Additional Artefacts

    In addition, each problem includes two more artefacts:

    • initial_generation_configuration folder: Holds the data needed for generating the problem data using the generator tool mentioned above.
    • decision.json file: Specifies the condition driving the decision made at the decision point.

    decision.json

    The decision.json acts as a testing oracle, serving as a label for validating mined data. It contains two main sections: "UICompos" and "decision". The "UICompos" section includes a key for each activity related to the decision, storing key-value pairs that represent the UI components involved, along with their bounding box coordinates. The "decision" section defines the condition for a case to match a specific variant based on the mentioned UI components.
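    A purely hypothetical illustration of how such a decision.json might be traversed; the actual key names inside "UICompos" and the encoding of the "decision" condition may differ, so treat this only as a reading aid.

    ```python
    import json

    # Hypothetical stand-in for a shipped file; in practice:
    # with open("P1/decision.json") as f: decision = json.load(f)
    decision = {
        "UICompos": {
            "activity_3": {"attachment_icon": {"bbox": [412, 188, 452, 220]}},
        },
        "decision": "attachment_icon present -> Variant_1, else Variant_2",
    }

    for activity, components in decision["UICompos"].items():
        for name, props in components.items():
            print(activity, name, props["bbox"])
    print("condition:", decision["decision"])
    ```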

  20. Data underlying the publication: A Ground Truth Approach for Assessing Process Mining Techniques

    • data.4tu.nl
    zip
    Updated Feb 4, 2025
    Cite
    Dominique Sommers (2025). Data underlying the publication: A Ground Truth Approach for Assessing Process Mining Techniques [Dataset]. http://doi.org/10.4121/bc43e334-74e1-44ff-abf1-ed32847250c9.v1
    Available download formats: zip
    Dataset updated
    Feb 4, 2025
    Dataset provided by
    4TU.ResearchData
    Authors
    Dominique Sommers
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This folder contains the synthetically generated dataset (process model and event logs) with process data of a synthetically designed package delivery process, as described in [1]. The event logs are simulations of a process model, each with one incorporated issue: either a behavioral deviation, i.e., the process is exhibited differently from the expected behavior described by the process model, or a recording error, i.e., the execution of the process is recorded differently from how it was exhibited. Each issue is added to the process model through a model transformation, providing ground truth for the discrepancies introduced in the simulated event log.


    The package delivery process starts with the choice of home or depot delivery, after which the package queues for a warehouse employee to pick and load it into a van. In case of home delivery, a courier drives off and rings a door after which he continues to either immediately hand over the package, or deliver it at the corresponding depot after registration, where it is left for collection. Alternatively, for depot delivery, "ringing" and therefore also "deliver at home" is omitted in the subprocess.

    models/delivery_base_model.json contains the specification of the process model that incorporates this "expected behavior", and is depicted in models/delivery_base_model.pdf.


    On top of this, six patterns of behavioral deviations (BI) and six patterns of recording errors (RI) are applied to the base model:

    BI5: Overtaking in the FIFO queue for picking packages;

    BI7: Switching roles from a courier to that of a warehouse employee;

    BI10: Batching is ignored, leaving with a delivery van before it was fully loaded;

    BI3: Skipping the activity of ringing, modeling behavior where e.g., the door was already opened upon arrival;

    BI9: Different resource memory where the package is delivered to a different depot than where it is registered;

    BI2: Multitasking of couriers during the delivery of multiple packages, modeling interruption of a delivery;

    RI1: Incorrect event, recording an order for depot delivery when it was intended for home delivery;

    RI2: Incorrect event, vice versa, i.e., recording an order for home delivery when it was intended for depot delivery;

    RI3: Missing event for the activity of loading a package in a truck;

    RI4: Missing object of the involved van for loading, e.g., due to a temporary connection failure of a recording device;

    RI5: Incorrect object of the involved courier when ringing, e.g., due to not logging out by the courier on the previous shift;

    RI6: Missing positions for the recording of the delivery and the collection at a depot, e.g., due to coarse timestamp logging.


    The behavior of each deviation pattern is added separately to the base model, resulting in twelve process models, accordingly named models/package_delivery_

    Each model is simulated resulting in twelve logs, accordingly named logs/package_delivery_


    All models and corresponding generated logs with the applied patterns are also available at gitlab.com/dominiquesommers/mira/-/tree/main/mira/simulation, which additionally includes scripts to load and process the data.


    We refer to [1] for more information on the dataset.


    [1] Dominique Sommers, Natalia Sidorova, Boudewijn F. van Dongen. A ground truth approach for assessing process mining techniques. arXiv preprint, https://doi.org/10.48550/arXiv.2501.14345, 2025.
