69 datasets found
  1. Public benchmark dataset for Conformance Checking in Process Mining

    • figshare.unimelb.edu.au
    • melbourne.figshare.com
    xml
    Updated Jan 30, 2022
    Cite
    Daniel Reissner (2022). Public benchmark dataset for Conformance Checking in Process Mining [Dataset]. http://doi.org/10.26188/5cd91d0d3adaa
    Explore at:
    Available download formats: xml
    Dataset updated
    Jan 30, 2022
    Dataset provided by
    The University of Melbourne
    Authors
    Daniel Reissner
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains a variety of publicly available real-life event logs. For each event log, we derived two types of Petri nets with two state-of-the-art process miners: Inductive Miner (IM) and Split Miner (SM). Each event log-Petri net pair is intended for evaluating the scalability of existing conformance checking techniques. We used this dataset to evaluate the scalability of the S-Component approach for measuring fitness. The dataset contains tables of descriptive statistics of both the process models and the event logs. In addition, it includes time-performance results, measured in milliseconds, for several approaches in both multi-threaded and single-threaded executions. Last, it contains a cost comparison of the different approaches and reports on the degree of over-approximation of the S-Component approach. The compared conformance checking techniques are described here: https://arxiv.org/abs/1910.09767.

    Update: The dataset has been extended with the BPIC18 and BPIC19 event logs. BPIC19 is actually a collection of four different processes and was therefore split into four event logs. For each of the additional five event logs, again, two process models have been mined with Inductive Miner and Split Miner. We used the extended dataset to test the scalability of our tandem-repeats approach for measuring fitness. The dataset now contains updated tables of log and model statistics as well as tables of the conducted experiments measuring execution time and raw fitness cost of various fitness approaches. The compared conformance checking techniques are described here: https://arxiv.org/abs/2004.01781.

    Update: The dataset has also been used to measure the scalability of a new Generalization measure based on concurrent and repetitive patterns. A concurrency oracle is used in tandem with partial orders to identify concurrent patterns in the log, which are tested against parallel blocks in the process model. Tandem repeats are used with various trace reductions and extensions to define repetitive patterns in the log, which are tested against loops in the process model. Each pattern is assigned a partial fulfillment. The generalization is then the average of the pattern fulfillments weighted by the counts of the traces in which the patterns were observed. The dataset now includes the time results and a breakdown of Generalization values for the dataset.
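The weighted average described above can be sketched in a few lines. This is an illustrative reconstruction from the description only, not the authors' implementation; the function name and input shape are assumptions.

```python
# Illustrative sketch (assumption, not the authors' code): the generalization
# value is the average of per-pattern fulfillments, weighted by the number of
# traces in which each pattern was observed.

def generalization(patterns):
    """patterns: list of (fulfillment, trace_count) pairs, fulfillment in [0, 1]."""
    total_weight = sum(count for _, count in patterns)
    if total_weight == 0:
        return 0.0
    return sum(f * count for f, count in patterns) / total_weight

# Example: two fully fulfilled patterns seen in 50 traces, one half-fulfilled
# pattern seen in 30 traces, one unfulfilled pattern seen in 20 traces.
print(generalization([(1.0, 50), (0.5, 30), (0.0, 20)]))  # 0.65
```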

  2. Process Models obtained from event logs with different information-preserving abstractions

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Dirk Fahland (2020). Process Models obtained from event logs with different information-preserving abstractions [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3243987
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Sander J.J. Leemans
    Dirk Fahland
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains results of the experiment to analyze information preservation and recovery by different event log abstractions in process mining described in: Sander J.J. Leemans, Dirk Fahland "Information-Preserving Abstractions of Event Data in Process Mining" Knowledge and Information Systems, ISSN: 0219-1377 (Print) 0219-3116 (Online), accepted May 2019

    The experiment results were obtained with: https://doi.org/10.5281/zenodo.3243981

  3. Event log with data attributes

    • figshare.com
    xml
    Updated Aug 31, 2022
    Cite
    Dina Bayomie (2022). Event log with data attributes [Dataset]. http://doi.org/10.6084/m9.figshare.20736706.v1
    Explore at:
    Available download formats: xml
    Dataset updated
    Aug 31, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Dina Bayomie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These 60 event logs vary in the number of cases and in the density of overlapping cases. Each log has the following event attributes: event id, case id, activity, timestamp, loan type, amount, resources, and status. BPMN scenarios were used to simulate the process.
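Given the attribute list above, reconstructing the traces of such a log is a matter of grouping events by case id and ordering by timestamp. The sketch below is a hedged illustration: the in-memory rows, activity names, and column labels are assumptions based on the attribute list, not the dataset's actual contents.

```python
# Illustrative sketch (column names assumed from the attribute list above):
# group events into one trace per case, ordered by timestamp.
from collections import defaultdict

def traces_by_case(rows):
    """Group event rows into traces keyed by case id, ordered by timestamp."""
    cases = defaultdict(list)
    for row in rows:
        cases[row["case id"]].append(row)
    for events in cases.values():
        events.sort(key=lambda e: e["timestamp"])
    return {cid: [e["activity"] for e in evs] for cid, evs in cases.items()}

# Hypothetical rows standing in for the CSV contents:
rows = [
    {"case id": "c1", "activity": "Apply", "timestamp": "2022-01-01T09:00"},
    {"case id": "c1", "activity": "Approve", "timestamp": "2022-01-01T10:00"},
    {"case id": "c2", "activity": "Apply", "timestamp": "2022-01-01T09:30"},
]
print(traces_by_case(rows))  # {'c1': ['Apply', 'Approve'], 'c2': ['Apply']}
```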

  4. Differentially Private Event Logs for Process Mining: Supplementary Material

    • zenodo.org
    • data.niaid.nih.gov
    csv, pdf, zip
    Updated Jul 19, 2024
    Cite
    Gamal Elkoumy; Alisa Pankova; Marlon Dumas (2024). Differentially Private Event Logs for Process Mining: Supplementary Material [Dataset]. http://doi.org/10.5281/zenodo.4601139
    Explore at:
    Available download formats: zip, pdf, csv
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gamal Elkoumy; Alisa Pankova; Marlon Dumas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this archive, we provide supplementary material for our paper entitled "Mine Me but Don't Single Me Out: Differentially Private Event Logs for Process Mining". We list the selected event logs and their characteristics and descriptive statistics. This archive also contains the anonymized event logs resulting from the experiments. The source code is available on GitHub.

  5. Dataset of mHealth event logs

    • figshare.com
    pdf
    Updated May 1, 2022
    Cite
    Raoul Nuijten; Pieter Van Gorp (2022). Dataset of mHealth event logs [Dataset]. http://doi.org/10.6084/m9.figshare.19688730.v2
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 1, 2022
    Dataset provided by
    figshare
    Authors
    Raoul Nuijten; Pieter Van Gorp
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    How does Facebook always seem to know what the next funny video should be to sustain your attention on the platform? Facebook has not asked you whether you like videos of cats doing something funny: it just seems to know. In fact, Facebook learns from your behavior on the platform (e.g., how long you have engaged with similar videos, which posts you have previously liked or commented on, etc.). As a result, Facebook is able to sustain its users' attention for a long time. Typical mHealth apps, on the other hand, suffer from rapidly collapsing user engagement levels. To sustain engagement, mHealth apps nowadays employ all sorts of intervention strategies. Of course, it would be powerful to know, as Facebook does, which strategy should be presented to which individual to sustain their engagement. A first step toward that could be to cluster similar users (and then derive intervention strategies from there). This dataset was collected through a single mHealth app over 8 different mHealth campaigns (i.e., scientific studies). Using this dataset, one could derive clusters from app-user event data. One approach could differentiate between two phases: a process mining phase and a clustering phase. In the process mining phase, one derives from the dataset the processes (i.e., sequences of app actions) that users undertake. In the clustering phase, one clusters similar users based on the processes they engaged in (i.e., users that perform similar sequences of app actions).

    List of files

    0-list-of-variables.pdf: overview of the different variables within the dataset.
    1-description-of-endpoints.pdf: description of the unique endpoints that appear in the dataset.
    2-requests.csv: the dataset with the actual app-user event data.
    2-requests-by-session.csv: the same event data with an added session variable, to differentiate between user requests made in the same session.
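The two-phase idea described above (derive per-user action sequences, then cluster users with similar sequences) could look roughly like the following. Everything here is an illustrative assumption, not the dataset authors' method: the action names are invented, users are represented by simple action-count profiles, and a greedy cosine-similarity grouping stands in for a proper clustering algorithm.

```python
# Sketch of the clustering phase (assumed approach): represent each user by
# the multiset of app actions they performed, then greedily group users whose
# action profiles are similar under cosine similarity.
from collections import Counter
from math import sqrt

def cosine(a, b):
    keys = set(a) | set(b)
    dot = sum(a[k] * b[k] for k in keys)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster_users(user_actions, threshold=0.8):
    """Assign each user to the first cluster whose representative profile is
    similar enough; otherwise start a new cluster."""
    profiles = {u: Counter(seq) for u, seq in user_actions.items()}
    clusters = []  # list of (representative profile, [users])
    for user, prof in profiles.items():
        for rep, members in clusters:
            if cosine(rep, prof) >= threshold:
                members.append(user)
                break
        else:
            clusters.append((prof, [user]))
    return [members for _, members in clusters]

# Hypothetical per-user action sequences:
users = {
    "u1": ["open", "log_meal", "view_stats"],
    "u2": ["open", "log_meal", "view_stats", "open"],
    "u3": ["open", "share", "share", "share"],
}
print(cluster_users(users))  # [['u1', 'u2'], ['u3']]
```

In a real pipeline the sequences would come from the process mining phase (e.g., discovered from 2-requests-by-session.csv), and the greedy grouping would be replaced by a standard clustering method.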

  6. Data from: An IoT-Enriched Event Log for Process Mining in Smart Factories

    • data.niaid.nih.gov
    Updated Jun 10, 2024
    + more versions
    Cite
    Grüger, Joscha (2024). An IoT-Enriched Event Log for Process Mining in Smart Factories [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_7795547
    Explore at:
    Dataset updated
    Jun 10, 2024
    Dataset provided by
    Bergmann, Ralph
    Malburg, Lukas
    Grüger, Joscha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    DEPRECATED - current version: https://figshare.com/articles/dataset/Dataset_An_IoT-Enriched_Event_Log_for_Process_Mining_in_Smart_Factories/20130794

    Modern technologies such as the Internet of Things (IoT) are becoming increasingly important in various domains, including Business Process Management (BPM) research. One main research area in BPM is process mining, which can be used to analyze event logs, e.g., for checking the conformance of running processes. However, only a few IoT-based event logs are available for research purposes. Some of them are artificially generated, and they do not always completely reflect the actual physical properties of smart environments. In this paper, we present an IoT-enriched XES event log generated by a physical smart factory. For this purpose, we created the DataStream XES extension for representing IoT data in event logs. Finally, we present some preliminary analysis and properties of the log.

  7. Data from: An Empirical Evaluation of Unsupervised Event Log Abstraction Techniques in Process Mining

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Nov 27, 2023
    Cite
    Greg Van Houdt; Massimiliano de Leoni; Benoît Depaire; Niels Martin (2023). An Empirical Evaluation of Unsupervised Event Log Abstraction Techniques in Process Mining [Dataset]. http://doi.org/10.5281/zenodo.6793544
    Explore at:
    Available download formats: bin
    Dataset updated
    Nov 27, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Greg Van Houdt; Massimiliano de Leoni; Benoît Depaire; Niels Martin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This upload contains the event logs, generated by L-Sim, on which the experiments of the related paper were performed.

    The related paper has been accepted in the journal Information Systems.

  8. Document Processing Event Logs

    • figshare.com
    • data.4tu.nl
    xml
    Updated Jun 11, 2023
    Cite
    Almir Djedović (2023). Document Processing Event Logs [Dataset]. http://doi.org/10.4121/uuid:6df27e59-6221-4ca2-9cc4-65c66588c6eb
    Explore at:
    Available download formats: xml
    Dataset updated
    Jun 11, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Almir Djedović
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset contains information about a document-processing process. The process comprises the following activities: Receiving a Document, Creating a new Case, Investing a Document into a new Case, and so on. For each event, the dataset records the event name, the event type, the time of the event's execution, and the participant associated with that execution. The data is formatted in MXML so that it can be used for process mining analysis with tools such as ProM.

  9. A Collection of Event Logs of Blockchain-based Applications

    • data.niaid.nih.gov
    Updated Jan 29, 2023
    Cite
    Alzhrani, Fouzia (2023). A Collection of Event Logs of Blockchain-based Applications [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6637058
    Explore at:
    Dataset updated
    Jan 29, 2023
    Dataset authored and provided by
    Alzhrani, Fouzia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A set of event logs of 101 blockchain-based applications (DApps). For each DApp, there are two event log files. The first is a raw version, with the data encoded as recorded on the blockchain. The second is a decoded version, with the data decoded into a human-readable format. If a DApp has multiple versions on different blockchain networks, there are two event log files (encoded and decoded) for each version. In addition, the event registry file includes a comprehensive list of event names and their corresponding signatures, obtained from the contract ABIs of the 101 DApps.

  10. Activities of daily living of several individuals

    • narcis.nl
    • figshare.com
    Updated Nov 3, 2015
    Cite
    Timo Sztyler; J. (Josep) Carmona (2015). Activities of daily living of several individuals [Dataset]. http://doi.org/10.4121/uuid:01eaba9f-d3ed-4e04-9945-b8b302764176
    Explore at:
    Available download formats (media types): application/x-gzip, application/zip, text/plain
    Dataset updated
    Nov 3, 2015
    Dataset provided by
    University of Mannheim, Germany
    Authors
    Timo Sztyler; J. (Josep) Carmona
    Description

    This dataset comprises event logs (XES = Extensible Event Stream) of activities of daily living performed by several individuals, e.g., sleeping, meal preparation, and washing. The event logs were derived from sensor data collected in different scenarios. They show the different behavior of people in their own homes, but also common patterns. The attached event logs were created with Fluxicon Disco (http://fluxicon.com/disco/).

  11. Statechart Workbench and Alignments Software Event Log

    • figshare.com
    • 4tu.edu.hpc.n-helix.com
    • +1more
    zip
    Updated May 31, 2023
    Cite
    Maikel Leemans (2023). Statechart Workbench and Alignments Software Event Log [Dataset]. http://doi.org/10.4121/uuid:7f787965-da13-4bb8-a3fd-242f08aef9c4
    Explore at:
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Maikel Leemans
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Extensible Event Stream (XES) software event log obtained by instrumenting the Statechart Workbench ProM plugin with the tool available at https://svn.win.tue.nl/repos/prom/XPort/. This event log contains method-call-level events describing a workbench run invoking the Alignments algorithm on the BPI Challenge 2012 event log, which is available and documented at https://doi.org/10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f. Note that the life-cycle information in this log corresponds to method call (start) and return (complete), and captures a method-call hierarchy.

  12. Benchmarking logs to test scalability of process discovery algorithms

    • figshare.com
    • data.4tu.nl
    zip
    Updated Jul 28, 2020
    Cite
    Wil van der Aalst (2020). Benchmarking logs to test scalability of process discovery algorithms [Dataset]. http://doi.org/10.4121/uuid:1cc41f8a-3557-499a-8b34-880c1251bd6e
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 28, 2020
    Dataset provided by
    4TU.ResearchData
    Authors
    Wil van der Aalst
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    The event logs included are aimed at supporting the evaluation of the performance of process discovery algorithms. The largest event logs in this dataset have millions of events. If you need even bigger datasets, you can generate them yourself using the included CPN Tools source files (*.cpn). Each file has two parameters: nofcases (the number of process instances) and nofdupl (the number of times the process is replicated with unique new names).

  13. Evaluation datasets and results of the paper "Efficient Online Computation of Business Process State From Trace Prefixes via N-Gram Indexing"

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Sep 1, 2024
    Cite
    David Chapela-Campa; Marlon Dumas (2024). Evaluation datasets and results of the paper "Efficient Online Computation of Business Process State From Trace Prefixes via N-Gram Indexing" [Dataset]. http://doi.org/10.5281/zenodo.13625880
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 1, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    David Chapela-Campa; Marlon Dumas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Event logs, process models, and results corresponding to the paper "Efficient Online Computation of Business Process State From Trace Prefixes via N-Gram Indexing".

    Inputs: preprocessed event logs and discovered process models (and their characteristics) used in the evaluation.

    • Real-life: preprocessed event logs (XES and CSV) corresponding to the real-life processes used in the evaluation; process models (BPMN and PNML) discovered with the Inductive Miner infrequent for thresholds of 10%, 20%, and 50%; characteristics (txt) of the event logs and process models; ongoing cases resulting from splitting each case in the preprocessed event logs (under the folder split).
    • Synthetic: simulated event logs (CSV) corresponding to the synthetic processes used in the evaluation; designed process models (BPMN and PNML); ongoing cases resulting from splitting each case in the preprocessed event logs (under the folder split); ongoing cases with injected noise as described in the publication (under the folders noise_1, noise_2, and noise_3).
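The n-gram indexing idea named in the paper's title can be sketched roughly as follows. This is a deliberately simplified assumption, not the paper's actual algorithm: here the "state" reached by an ongoing case is approximated by the set of activities that may follow its last n events, and the traces are invented.

```python
# Simplified sketch (assumption, not the paper's algorithm): index every
# n-gram of the known traces with the activities that may follow it, so the
# state of an ongoing case can be looked up from the last n events of its prefix.
from collections import defaultdict

def build_index(traces, n=2):
    """Map each n-gram of activities to the set of possible next activities."""
    index = defaultdict(set)
    for trace in traces:
        for i in range(len(trace) - n):
            gram = tuple(trace[i:i + n])
            index[gram].add(trace[i + n])
    return index

def state_of(prefix, index, n=2):
    """Estimate the state of an ongoing case from the last n events of its prefix."""
    if len(prefix) < n:
        return None
    return index.get(tuple(prefix[-n:]))

# Hypothetical complete traces of a process:
traces = [["A", "B", "C"], ["A", "B", "D"]]
idx = build_index(traces)
print(sorted(state_of(["A", "B"], idx)))  # ['C', 'D']
```

The appeal of the approach is that the lookup is a single hash-table access per ongoing case, independent of the model's size.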
  14. BPI Challenge 2014: Activity log for incidents

    • data.4tu.nl
    Updated Apr 23, 2014
    + more versions
    Cite
    Rabobank Nederland (2014). BPI Challenge 2014: Activity log for incidents [Dataset]. https://data.4tu.nl/articles/dataset/BPI_Challenge_2014_Activity_log_for_incidents/12706424/1
    Explore at:
    Dataset updated
    Apr 23, 2014
    Dataset provided by
    Rabobank Nederland
    Description

    BPI Challenge 2014. This particular file contains the activity log for the incidents. Parent item: BPI Challenge 2014.

    Similar to other ICT companies, Rabobank Group ICT has to implement an increasing number of software releases while the time to market is decreasing. Rabobank Group ICT has implemented the ITIL processes and therefore uses the Change process for implementing these so-called planned changes. Rabobank Group ICT is looking for fact-based insight into sub-questions concerning the impact of past changes, in order to predict the workload at the Service Desk and/or IT Operations after future changes. The challenge is to design a (draft) predictive model that can be implemented in a BI environment. The purpose of this predictive model is to support Business Change Management in implementing software releases with less impact on the Service Desk and/or IT Operations.

    We have prepared several case files with anonymous information from Rabobank Netherlands Group ICT for this challenge. The files contain record details from an ITIL Service Management tool called HP Service Manager. We provide extracts in CSV with the Interaction, Incident, or Change number as case ID. Next to these case files, we provide an activity log related to the Incident cases. There is also a document detailing the data in the CSV files and providing background on the Service Management tool.

  15. Sepsis Cases - Event Log

    • data.4tu.nl
    • figshare.com
    zip
    Updated Dec 7, 2016
    Cite
    Felix Mannhardt (2016). Sepsis Cases - Event Log [Dataset]. http://doi.org/10.4121/uuid:915d2bfb-7e84-49ad-a286-dc35f063a460
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 7, 2016
    Dataset provided by
    Eindhoven University of Technology
    Authors
    Felix Mannhardt
    License

    https://doi.org/10.4121/resource:terms_of_use

    Time period covered
    Nov 7, 2013 - Jun 5, 2015
    Description

    This real-life event log contains events of sepsis cases from a hospital. Sepsis is a life-threatening condition typically caused by an infection. One case represents a patient's pathway through the hospital. The events were recorded by the hospital's ERP (Enterprise Resource Planning) system. There are about 1,000 cases with roughly 15,000 events in total, recorded for 16 different activities. Moreover, 39 data attributes are recorded, e.g., the group responsible for the activity, the results of tests, and information from checklists. Events and attribute values have been anonymized. The timestamps of events have been randomized, but the time between events within a trace has not been altered.

  16. Receipt phase of an environmental permit application process (WABO), CoSeLoG project

    • figshare.com
    • data.4tu.nl
    bin
    Updated May 30, 2023
    + more versions
    Cite
    Joos Buijs (2023). Receipt phase of an environmental permit application process (WABO), CoSeLoG project [Dataset]. http://doi.org/10.4121/12709127.v2
    Explore at:
    Available download formats: bin
    Dataset updated
    May 30, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Joos Buijs
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    This data originates from the CoSeLoG project, executed under NWO project number 638.001.211. Within the CoSeLoG project, the (dis)similarities between several processes of different municipalities in the Netherlands have been investigated. This event log contains the records of the execution of the receipt phase of the building permit application process in an anonymous municipality.

  17. Synthetic event logs for multi-perspective trace clustering

    • b2find.dkrz.de
    Updated Nov 4, 2023
    + more versions
    Cite
    (2023). Synthetic event logs for multi-perspective trace clustering - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/15295ba6-b2a8-524e-8f31-c0b89b73976d
    Explore at:
    Dataset updated
    Nov 4, 2023
    Description

    The dataset contains a set of event logs for evaluating multi-perspective trace-clustering approaches in process mining. The event logs were randomly generated from 5 process models of different complexity levels. The attribute "cluster" refers to the ground-truth label. Clusters can only be correctly identified when considering both the data and the control-flow perspectives (attributes and trace).

  18. Artificial dataset for "Automatic Determination of Parameters Values for Heuristics Miner++"

    • zenodo.org
    • data.niaid.nih.gov
    bz2
    Updated Jan 21, 2020
    Cite
    Andrea Burattin (2020). Artificial dataset for "Automatic Determination of Parameters Values for Heuristics Miner++" [Dataset]. http://doi.org/10.5281/zenodo.19227
    Explore at:
    Available download formats: bz2
    Dataset updated
    Jan 21, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andrea Burattin
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This set of processes, built for test purposes [1], is composed of 125 process models. The processes were created using PLG [2, 3]. The generation of the random processes is based on basic "process patterns", such as AND-split/join, XOR-split/join, the sequence of two activities, and so on.

    For each of the 125 process models, two logs were generated: one with 250 traces and one with 500 traces. In these logs, 75% of the activities are expressed as time intervals (the others are instantaneous) and 5% of the traces are noise. In this context, "noise" is either a swap between two activities or the removal of an activity.
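The two noise operations just described can be sketched as follows. This is an illustrative reconstruction from the description, not PLG's actual implementation; the trace contents and the fifty-fifty choice between the two operations are assumptions.

```python
# Illustrative sketch (assumption, not PLG's code): a trace marked as noisy
# is altered either by swapping two of its activities or by removing one.
import random

def add_noise(trace, rng):
    """Return a noisy copy of a trace: swap two activities or drop one."""
    trace = list(trace)
    if len(trace) < 2:
        return trace
    if rng.random() < 0.5:  # swap two activities (assumed equal odds)
        i, j = rng.sample(range(len(trace)), 2)
        trace[i], trace[j] = trace[j], trace[i]
    else:  # remove a random activity
        del trace[rng.randrange(len(trace))]
    return trace

def noisy_log(traces, noise_ratio=0.05, seed=42):
    """Mark ~noise_ratio of the traces as noisy and alter them."""
    rng = random.Random(seed)
    return [add_noise(t, rng) if rng.random() < noise_ratio else list(t)
            for t in traces]

log = [["A", "B", "C", "D"]] * 100
noisy = noisy_log(log)
print(sum(t != ["A", "B", "C", "D"] for t in noisy))  # number of altered traces, ~5% in expectation
```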

    References

    1. A. Burattin, A. Sperduti. "Automatic Determination of Parameters Values for Heuristics Miner++". In Proceedings of the IEEE Congress on Evolutionary Computation (IEEE WCCI CEC 2010), 2010. doi:10.1109/CEC.2010.5586208
    2. A. Burattin. "PLG2: Multiperspective Processes Randomization and Simulation for Online and Offline Settings". In CoRR abs/1506.08415, Jun. 2015.
    3. A. Burattin, A. Sperduti. "PLG: a Framework for the Generation of Business Process Models and their Execution Logs". In Proc. of the 6th Int. Workshop on Business Process Intelligence (BPI 2010), 2010. doi:10.1007/978-3-642-20511-8_20
  19. Data from: Dataset for Collective Intelligence Architecture for IoT Using Federated Process Mining

    • produccioncientifica.uca.es
    Updated 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rosa-Bilbao, Jesús; Reina Quintero, Antonia M.; Varela-Vaca, Angel Jesus; Gómez-López, María Teresa (2025). Dataset for Collective Intelligence Architecture for IoT Using Federated Process Mining [Dataset]. https://produccioncientifica.uca.es/documentos/67bc32b6478fbf5d29390c94
    Explore at:
    Dataset updated
    2025
    Authors
    Rosa-Bilbao, Jesús; Reina Quintero, Antonia M.; Varela-Vaca, Angel Jesus; Gómez-López, María Teresa
    Description

    This dataset contains the key elements used in the paper Collective Intelligence Architecture for IoT Using Federated Process Mining, ranging from complex event processing to process mining applied over multiple datasets. The information is organized into the following sections:

    1. CEPApp.siddhi: the rules and configurations used for pattern detection and real-time event processing.

    2. ProcessStorage.sol: the smart contract code used in the case study, implemented in Solidity on the Polygon blockchain platform.

    3. Datasets used ({adlinterweave_dataset, adlmr_dataset, twor_dataset}.zip): the three datasets used in the study, each with events processed using the CEP engine. The datasets are divided according to the rooms of the house:

    _room.csv: CSV file with the data related to the interactions of the room stay.

    _bathroom.csv: CSV file with the data related to the interactions of the bathroom stay.

    _other.csv: CSV file with the data related to the interactions of the rest of the rooms.

    4. CEP engine processing results ({cepresult_adlinterweave, cepresult_adlmr, cepresult_twor}.json): the output generated by the Siddhi CEP engine, stored in JSON format. The data is split into files based on the type of detected activity:

    _room.json: events related to the stay in the room.

    _bathroom.json: events related to the bathroom stay.

    _other.json: events related to the rest of the rooms.

    5. Federated event logs ({xesresult_adlinterweave, xesresult_adlmr, xesresult_twor}.xes): federated event logs in XES format, the standard in process mining. They contain the event traces obtained after the execution of the Event Log Integrator.

    6. Process mining results: models generated from the processed event logs:

    Process trees ({procestree_adlinterweave, procestree_adlmr, procestree_twor}.svg): structured representations of the detected workflows.

    Petri nets ({petrinet_adlinterweave, petrinet_adlmr, petrinet_twor}.svg): mathematical models of the discovered processes, useful for compliance analysis and simulations.

    Disco results ({disco_adlinterweave, disco_adlmr, disco_twor}.pdf): process models discovered with the Disco tool.

    ProM results ({prom_adlinterweave, prom_adlmr, prom_twor}.pdf): models generated with the ProM tool.

  20. Generated Petri Net Markup Language (PNML) models and log traces (in XES)

    • narcis.nl
    • data.4tu.nl
    • +1more
    Updated Apr 9, 2018
    Cite
    Vincent Bloemen (2018). Generated Petri Net Markup Language (PNML) models and log traces (in XES) [Dataset]. http://doi.org/10.4121/uuid:a6709ee4-2aa3-49a3-92db-247e8b5bf340
    Explore at:
    Available download formats (media types): application/msword, application/zip, text/plain, text/xml
    Dataset updated
    Apr 9, 2018
    Dataset provided by
    University of Twente
    Authors
    Vincent Bloemen
    Description

    A generated set of 4,320 Petri net models, each combined with a single log trace; the models exhibit various Petri net characteristics. The models were generated using PTandLogGenerator. Used in the paper "Symbolically Aligning Observed and Modelled Behaviour" (ACSD 2018).
