100+ datasets found
  1. Event logs for process mining

    • kaggle.com
    zip
    Updated Apr 11, 2023
    Cite
    Alberto (2023). Event logs for process mining [Dataset]. https://www.kaggle.com/datasets/carlosalvite/car-insurance-claims-event-log-for-process-mining
    Explore at:
    Available download formats: zip (4892593 bytes)
    Dataset updated
    Apr 11, 2023
    Authors
    Alberto
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This event log has been artificially generated and curated to provide a comprehensive view of car insurance claims, allowing users to discover and identify bottlenecks, automation opportunities, conformance issues, reworks, and potential fraudulent cases using any process mining software.

    You can find more event logs here: https://processminingdata.com/JfVPOR

    Standard Process flow: “First Notification of Loss (FNOL)” -> “Assign Claim” -> “Claim Decision” -> “Set Reserve” -> “Payment Sent” -> “Close Claim”

    Attributes: case ID, activity name, timestamp, claimant name, agent name, adjuster name, claim amount, claimant age, type of policy, car make, car model, car year, date and time of the accident, type of accident, user type

    Total number of claims: 30,000

    Dates: Claims belong to years 2020, 2021, and 2022.

    Disclaimer: Personal names are fake.
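
    The standard process flow listed above lends itself to a simple conformance check. The following is a minimal sketch, not part of the dataset: it assumes each event is a (case ID, activity name, timestamp) row with ISO-formatted timestamps, and the sample rows are hypothetical. The real file's column names and timestamp format may differ.

```python
from collections import defaultdict
from datetime import datetime

# Standard flow as listed in the dataset description.
STANDARD_FLOW = [
    "First Notification of Loss (FNOL)",
    "Assign Claim",
    "Claim Decision",
    "Set Reserve",
    "Payment Sent",
    "Close Claim",
]


def traces_by_case(events):
    """Group (case_id, activity, timestamp) rows into time-ordered activity lists."""
    grouped = defaultdict(list)
    for case_id, activity, timestamp in events:
        grouped[case_id].append((datetime.fromisoformat(timestamp), activity))
    return {case: [a for _, a in sorted(rows)] for case, rows in grouped.items()}


def deviating_cases(events):
    """Return the case IDs whose trace differs from the standard flow."""
    return sorted(c for c, t in traces_by_case(events).items() if t != STANDARD_FLOW)


# Hypothetical sample rows: C1 follows the standard flow, C2 does not.
sample = [("C1", act, f"2021-03-01T0{i}:00:00") for i, act in enumerate(STANDARD_FLOW)]
sample += [
    ("C2", "First Notification of Loss (FNOL)", "2021-03-02T09:00:00"),
    ("C2", "Assign Claim", "2021-03-02T10:00:00"),
    ("C2", "Assign Claim", "2021-03-02T11:00:00"),  # rework: repeated assignment
    ("C2", "Close Claim", "2021-03-02T12:00:00"),
]

print(deviating_cases(sample))
```

    Exact sequence comparison is the strictest possible check; a process mining tool would typically report how far a trace deviates rather than a yes/no answer.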

  2. Process Mining Event Log - Incident Management

    • kaggle.com
    zip
    Updated Apr 20, 2025
    Cite
    Alberto P (2025). Process Mining Event Log - Incident Management [Dataset]. https://www.kaggle.com/datasets/albertopmd/process-mining-event-log-incident-management
    Explore at:
    Available download formats: zip (2301112 bytes)
    Dataset updated
    Apr 20, 2025
    Authors
    Alberto P
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This realistic incident management event log simulates a common IT service process and includes key inefficiencies found in real-world operations. You'll uncover SLA violations, multiple reassignments, bottlenecks, and conformance issues—making it an ideal dataset for hands-on process mining, root cause analysis, and performance optimization exercises.

    You can find more event logs + use case handbooks to guide your analysis here: https://processminingdata.com/

    Standard Process Flow: Ticket Created -> Ticket Assigned to Level 1 Support -> WIP - Level 1 Support -> Level 1 Escalates to Level 2 Support -> WIP - Level 2 Support -> Ticket Solved by Level 2 Support -> Customer Feedback Received -> Ticket Closed

    Total Number of Incident Tickets: 31,000+

    Process Variants: 13

    Number of Events: 242,000+

    Year: 2023

    File Format: CSV

    File Size: 65MB
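
    Figures such as the number of process variants and events can be recomputed from the log once it is grouped into one activity sequence per ticket. The sketch below assumes that grouping has already been done; the ticket IDs and the shortened traces are hypothetical, and only the activity names come from the description above.

```python
from collections import Counter


def variant_counts(traces):
    """Count how many tickets share each distinct activity sequence (process variant)."""
    return Counter(tuple(trace) for trace in traces.values())


# Hypothetical, shortened traces keyed by ticket ID.
traces = {
    "T1": ["Ticket Created", "Ticket Assigned to Level 1 Support", "Ticket Closed"],
    "T2": ["Ticket Created", "Ticket Assigned to Level 1 Support", "Ticket Closed"],
    "T3": [
        "Ticket Created",
        "Ticket Assigned to Level 1 Support",
        "Level 1 Escalates to Level 2 Support",
        "Ticket Closed",
    ],
}

variants = variant_counts(traces)
num_events = sum(len(trace) for trace in traces.values())
print(len(variants), "variants,", num_events, "events")
```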

  3. Event Log Datasets

    • figshare.com
    csv
    Updated Jul 15, 2025
    Cite
    Qi Mo (2025). Event Log Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.29568722.v1
    Explore at:
    Available download formats: csv
    Dataset updated
    Jul 15, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Qi Mo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Twelve public event log datasets for Collaborative Business Processes (CBPs) are presented here. Among them, the event logs Log-01~Log-04 are collected from some CBPs for the treatment of certain diseases such as gastric ulcer and diabetes in the hospital, the event logs Log-05~Log-07 are collected from some CBPs for designing and manufacturing products such as automobiles, the event logs Log-08~Log-09 are collected from some CBPs for financial services such as bank loans, and the event logs Log-10~Log-12 are collected from some CBPs in e-commerce such as return processing in online stores.

  4. Public benchmark dataset for Conformance Checking in Process Mining

    • figshare.unimelb.edu.au
    • melbourne.figshare.com
    xml
    Updated Jan 30, 2022
    Cite
    Daniel Reissner (2022). Public benchmark dataset for Conformance Checking in Process Mining [Dataset]. http://doi.org/10.26188/5cd91d0d3adaa
    Explore at:
    Available download formats: xml
    Dataset updated
    Jan 30, 2022
    Dataset provided by
    The University of Melbourne
    Authors
    Daniel Reissner
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains a variety of publicly available real-life event logs. We derived two types of Petri nets for each event log with two state-of-the-art process miners: Inductive Miner (IM) and Split Miner (SM). Each event log-Petri net pair is intended for evaluating the scalability of existing conformance checking techniques. We used this dataset to evaluate the scalability of the S-Component approach for measuring fitness. The dataset contains tables of descriptive statistics of both process models and event logs. In addition, this dataset includes the results in terms of time performance measured in milliseconds for several approaches for both multi-threaded and single-threaded executions. Last, the dataset contains a cost comparison of different approaches and reports on the degree of over-approximation of the S-Components approach. The description of the compared conformance checking techniques can be found here: https://arxiv.org/abs/1910.09767.

    Update: The dataset has been extended with the BPIC18 and BPIC19 event logs. BPIC19 is actually a collection of four different processes and thus was split into four event logs. For each of the additional five event logs, again, two process models have been mined with Inductive Miner and Split Miner. We used the extended dataset to test the scalability of our tandem repeats approach for measuring fitness. The dataset now contains updated tables of log and model statistics as well as tables of the conducted experiments measuring execution time and raw fitness cost of various fitness approaches. The description of the compared conformance checking techniques can be found here: https://arxiv.org/abs/2004.01781.

    Update: The dataset has also been used to measure the scalability of a new Generalization measure based on concurrent and repetitive patterns. A concurrency oracle is used in tandem with partial orders to identify concurrent patterns in the log that are tested against parallel blocks in the process model. Tandem repeats are used with various trace reductions and extensions to define repetitive patterns in the log that are tested against loops in the process model. Each pattern is assigned a partial fulfillment. The generalization is then the average of pattern fulfillments weighted by the trace counts for which the patterns have been observed. The dataset now includes the time results and a breakdown of Generalization values for the dataset.
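
    Read literally, the last step of that description is a weighted average. The sketch below implements only that sentence: the pair layout (fulfillment, trace count) and the example numbers are assumptions for illustration, and the referenced papers may define the measure with further details not shown here.

```python
def generalization(patterns):
    """Average of pattern fulfillments, weighted by the trace counts of each pattern.

    `patterns` is a list of (fulfillment, trace_count) pairs, with fulfillment
    in [0, 1] and trace_count the number of traces in which the pattern was seen.
    """
    total = sum(count for _, count in patterns)
    if total == 0:
        raise ValueError("no observed traces")
    return sum(fulfillment * count for fulfillment, count in patterns) / total


# Hypothetical example: one fully fulfilled pattern seen in 30 traces and
# one half-fulfilled pattern seen in 10 traces.
example = [(1.0, 30), (0.5, 10)]
print(generalization(example))  # (1.0 * 30 + 0.5 * 10) / 40
```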

  5. Data-driven Process Discovery - Artificial Event Log

    • data.4tu.nl
    zip
    Updated Dec 8, 2016
    Cite
    Felix Mannhardt (2016). Data-driven Process Discovery - Artificial Event Log [Dataset]. http://doi.org/10.4121/uuid:32cad43f-8bb9-46af-8333-48aae2bea037
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 8, 2016
    Dataset provided by
    Eindhoven University of Technology
    Authors
    Felix Mannhardt
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    A synthetic event log with 100,000 traces and 900,000 events that was generated by simulating a simple artificial process model. There are three data attributes in the event log: Priority, Nurse, and Type. Some paths in the model are recorded infrequently based on the value of these attributes. Noise is added by randomly adding one additional event to an increasing number of traces. CPN Tools (http://cpntools.org) was used to generate the event log and inject the noise.
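
    The noise step described above (one additional event added at random to a share of the traces) can be approximated in a few lines. This is a plain-Python sketch under stated assumptions, not the CPN Tools model used to generate the log: the function name, the `fraction` parameter, and the activity labels are all hypothetical.

```python
import random


def inject_noise(traces, fraction, activities, seed=0):
    """Return a copy of `traces` where a `fraction` of them gets one extra random event."""
    rng = random.Random(seed)
    noisy = [list(trace) for trace in traces]
    k = round(fraction * len(noisy))
    for idx in rng.sample(range(len(noisy)), k):
        position = rng.randint(0, len(noisy[idx]))  # any slot, including the end
        noisy[idx].insert(position, rng.choice(activities))
    return noisy


# Hypothetical log: ten identical traces over placeholder activity labels.
clean = [["A", "B", "C"] for _ in range(10)]
noisy = inject_noise(clean, fraction=0.3, activities=["A", "B", "C"])
print(sum(len(trace) == 4 for trace in noisy), "traces received an extra event")
```

    Calling the function with growing values of `fraction` gives a series of logs with an increasing number of noisy traces, which mirrors the setup the description mentions.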

  6. Production Analysis with Process Mining Technology

    • data.4tu.nl
    • figshare.com
    zip
    Updated Jan 28, 2014
    Cite
    Dafna Levy (2014). Production Analysis with Process Mining Technology [Dataset]. http://doi.org/10.4121/uuid:68726926-5ac5-4fab-b873-ee76ea412399
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 28, 2014
    Dataset provided by
    NooL - Integrating People & Solutions
    Authors
    Dafna Levy
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    The comma-separated values (CSV) dataset contains process data from a production process, including data on cases, activities, resources, timestamps, and other data fields.

  7. Dataset: An IoT-Enriched Event Log for Process Mining in Smart Factories

    • figshare.com
    txt
    Updated Jun 5, 2024
    Cite
    Lukas Malburg; Joscha Grüger; Ralph Bergmann (2024). Dataset: An IoT-Enriched Event Log for Process Mining in Smart Factories [Dataset]. http://doi.org/10.6084/m9.figshare.20130794.v6
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 5, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Lukas Malburg; Joscha Grüger; Ralph Bergmann
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Modern technologies such as the Internet of Things (IoT) are becoming increasingly important in various domains, including Business Process Management (BPM) research. One main research area in BPM is process mining, which can be used to analyze event logs, e.g., for checking the conformance of running processes. However, there are only a few IoT-based event logs available for research purposes. Some of them are artificially generated and therefore do not always fully reflect the actual physical properties of smart environments. In this paper, we present an IoT-enriched XES event log that is generated by a physical smart factory. For this purpose, we create the DataStream/SensorStream XES extension for representing IoT data in event logs. Finally, we present some preliminary analysis and properties of the log.

  8. Process Models obtained from event logs with with different...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jan 24, 2020
    Cite
    Sander J.J. Leemans; Dirk Fahland (2020). Process Models obtained from event logs with with different information-preserving abstractions [Dataset]. http://doi.org/10.5281/zenodo.3243988
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sander J.J. Leemans; Dirk Fahland
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains results of the experiment to analyze information preservation and recovery by different event log abstractions in process mining described in: Sander J.J. Leemans, Dirk Fahland "Information-Preserving Abstractions of Event Data in Process Mining"
    Knowledge and Information Systems, ISSN: 0219-1377 (Print) 0219-3116 (Online), accepted May 2019

    The experiment results were obtained with: https://doi.org/10.5281/zenodo.3243981

  9. Differentially Private Event Logs for Process Mining: Supplementary Material...

    • nde-dev.biothings.io
    • data.niaid.nih.gov
    Updated Jul 19, 2024
    Cite
    Gamal Elkoumy (2024). Differentially Private Event Logs for Process Mining: Supplementary Material [Dataset]. https://nde-dev.biothings.io/resources?id=zenodo_4601138
    Explore at:
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Marlon Dumas
    Gamal Elkoumy
    Alisa Pankova
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this archive, we provide supplementary material for our paper entitled "Mine Me but Don’t Single Me Out: Differentially Private Event Logs for Process Mining". We list the selected event logs together with their characteristics and descriptive statistics. The archive also contains the anonymized event logs produced by the experiments. The source code is available on GitHub.

  10. Data from: An Empirical Evaluation of Unsupervised Event Log Abstraction...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Nov 27, 2023
    Cite
    Greg Van Houdt; Massimiliano de Leoni; Benoît Depaire; Niels Martin (2023). An Empirical Evaluation of Unsupervised Event Log Abstraction Techniques in Process Mining [Dataset]. http://doi.org/10.5281/zenodo.6793544
    Explore at:
    Available download formats: bin
    Dataset updated
    Nov 27, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Greg Van Houdt; Massimiliano de Leoni; Benoît Depaire; Niels Martin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This upload contains the event logs, generated by L-Sim, on which the experiments of the related paper were performed.

    The related paper is accepted in the journal Information Systems.

  11. Simplified Event Logs for Sepsis Patient Trajectories

    • data.niaid.nih.gov
    Updated Jul 19, 2024
    Cite
    Sakizloglou, Lucas; Ghahremani, Sona (2024). Simplified Event Logs for Sepsis Patient Trajectories [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3989589
    Explore at:
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Hasso Plattner Institute
    Authors
    Sakizloglou, Lucas; Ghahremani, Sona
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains a simplified excerpt from a real event log that tracks the trajectories of patients admitted to a hospital to be treated for sepsis, a life-threatening condition. The log was recorded by the Enterprise Resource Planning (ERP) system of the hospital. Additionally, the dataset contains three synthetic logs that increase the number of trajectories within the original log timespan while maintaining other statistical characteristics.

    In total, the dataset contains four files in .zip format and a companion that describes the statistical method used to synthesize the logs as well as the dataset content in detail. The dataset can be used in testing the performance of event-based process-mining and log (runtime) monitoring tools against an increasing load of events.

  12. (Un)Fair Process Mining Event Logs (Converted to OCEL)

    • zenodo.org
    bin, xml
    Updated Nov 6, 2024
    Cite
    Alessandro Berti (2024). (Un)Fair Process Mining Event Logs (Converted to OCEL) [Dataset]. http://doi.org/10.5281/zenodo.14043725
    Explore at:
    Available download formats: bin, xml
    Dataset updated
    Nov 6, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alessandro Berti
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Converted to OCEL 1.0 (JSON-OCEL) and OCEL 2.0 (XML) from the traditional event logs available at: Zenodo - Record 8059489.

    Object Types: { Person }

    Person-level Attributes:

    • (int) overallProtected: An attribute (0/1) indicating whether the person has experienced discrimination. (Note: If you're developing a fairness assessment algorithm, only use this attribute in the testing phase!)
    • (int) sumBoolDiscrFactors: Counts the number of possible discrimination factors that apply to the person.
    • (int) reworkedActivities: The total amount of rework involved in the person’s processing.
    • (float) throughputTime: The total processing time for a person.
    • (int) numOcc_ACTIVITY: Counts the number of times an activity occurs in the person’s lifecycle.

    Event-level Attributes:

    • resource: The resource involved in processing a given person.
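
    Several of the person-level attributes above can be derived from a person's events. The sketch below shows one possible derivation and is not the conversion code behind this dataset: it assumes that rework means every occurrence of an activity beyond its first, that throughput time is measured in seconds between the first and last event, and that timestamps are ISO-formatted. The activity names in the sample are hypothetical.

```python
from collections import Counter
from datetime import datetime


def person_attributes(events):
    """Derive per-person attributes from a list of (activity, ISO timestamp) events."""
    occurrences = Counter(activity for activity, _ in events)
    times = [datetime.fromisoformat(timestamp) for _, timestamp in events]
    return {
        # One counter per activity, analogous to the numOcc_ACTIVITY attributes.
        "numOcc": dict(occurrences),
        # Assumption: rework = occurrences of an activity beyond the first one.
        "reworkedActivities": sum(n - 1 for n in occurrences.values()),
        # Assumption: seconds elapsed between the first and the last event.
        "throughputTime": (max(times) - min(times)).total_seconds(),
    }


# Hypothetical events for a single person.
events = [
    ("Register", "2023-01-01T08:00:00"),
    ("Examine", "2023-01-01T09:00:00"),
    ("Examine", "2023-01-01T10:30:00"),
    ("Treat", "2023-01-01T11:00:00"),
]
attributes = person_attributes(events)
print(attributes)
```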

    * Hiring

    The data describes a multifaceted recruitment process with diverse application pathways ranging from minimal processing to extensive multi-step procedures. The variability of these routes, largely dependent on numerous determinants, yields a spectrum of outcomes from instant rejection to successful job offers.

    The logs include attributes such as age, citizenship, German proficiency, gender, religion, and years of education. While these attributes may inform candidate profiles, their misuse could engender discrimination. Variables like age and education may signify experience and skills, citizenship and German language may address job logistics, but these should not unjustly eliminate applicants. Gender and religion, unrelated to job performance, must not sway hiring. Therefore, the use of these attributes must uphold fairness, avoiding any potential bias.

    * Hospital

    The data depicts a hospital treatment process that commences with registration at an Emergency Room or Family Department and advances through stages of examination, diagnosis, and treatment. Notably, unsuccessful treatments often entail repetitive diagnostic and treatment cycles, underscoring the iterative nature of healthcare provision.

    The logs incorporate patient attributes such as age, underlying condition, citizenship, German language proficiency, gender, and private insurance. These attributes, influencing the treatment process, may unveil potential discrimination. Factors like age and condition might affect case complexity and treatment path, while citizenship may highlight healthcare access disparities. German proficiency can impact provider-patient communication, thus affecting care quality. Gender could spotlight potential health disparities, while insurance status might indicate socio-economic influences on care quality or timeliness. Therefore, a comprehensive examination of these attributes vis-a-vis the treatment process could shed light on potential biases or disparities, fostering fairness in healthcare delivery.

    * Lending

    This data illustrates the steps within a loan application process. From an initial appointment request, the process navigates various stages, including information verification and underwriting, culminating in loan approval or denial. Additional steps may be required, such as co-signer enlistment or collateral assessment. Some cases experience outright appointment denial, indicating the process's variability, reflecting applicants' differing credit situations.

    The logs' attributes can aid in identifying influences on outcomes and detecting discrimination. Personal characteristics ('age', 'citizen', 'German speaking', and 'gender') and socio-economic indicators ('YearsOfEducation' and 'CreditScore') can impact the process. While 'yearsOfEducation' and 'CreditScore' can validly inform creditworthiness, 'age', 'citizen', 'language ability', and 'gender' should not bias loan decisions. Ensuring these attributes are used responsibly fosters equitable loan processes.

    * Renting

    The data represents a rental process. It begins with a prospective tenant applying to view a property. Subsequent steps include an initial screening phase, viewing, decision-making, and a potential extensive screening. The process ends with the acceptance or rejection of the prospective tenant. In some cases, a tenant may apply for viewing but be rejected without the viewing occurring.

    The logs contain attributes that can shed light on potential biases in the process. 'Age', 'citizen', 'German speaking', 'gender', 'religious affiliation', and 'yearsOfEducation' might influence the rental process, leading to potential discrimination. While some attributes may provide useful insights into a potential tenant's reliability, misuse could result in discrimination. Thus, fairness must be observed in utilizing these attributes to avoid potential biases and ensure equitable treatment.

  13. Validation of Precision Measures - Event Logs and Process Models

    • data.4tu.nl
    • figshare.com
    zip
    Updated Jul 28, 2020
    Cite
    Niek Tax (2020). Validation of Precision Measures - Event Logs and Process Models [Dataset]. http://doi.org/10.4121/uuid:991753f7-a240-4ba6-a8a8-67174a08c51b
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 28, 2020
    Dataset provided by
    4TU.ResearchData
    Authors
    Niek Tax
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    This collection contains the event logs and process models described and used in the paper "The Imprecisions of Precision Measures in Process Mining"

  14. Sepsis Cases - Event Log

    • figshare.com
    • data.4tu.nl
    txt
    Updated Jun 7, 2023
    Cite
    Felix Mannhardt (2023). Sepsis Cases - Event Log [Dataset]. http://doi.org/10.4121/uuid:915d2bfb-7e84-49ad-a286-dc35f063a460
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Felix Mannhardt
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    This real-life event log contains events of sepsis cases from a hospital. Sepsis is a life-threatening condition typically caused by an infection. One case represents the pathway through the hospital. The events were recorded by the ERP (Enterprise Resource Planning) system of the hospital. There are about 1000 cases with a total of 15,000 events that were recorded for 16 different activities. Moreover, 39 data attributes are recorded, e.g., the group responsible for the activity, the results of tests, and information from checklists. Events and attribute values have been anonymized. The time stamps of events have been randomized, but the time between events within a trace has not been altered.
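
    The description does not say how the time stamps were randomized. One scheme that has the stated property (gaps between events within a trace stay unchanged) is to shift every event of a trace by the same random offset. The sketch below illustrates that scheme only; the function name, the `max_shift_days` parameter, and the activity labels are hypothetical.

```python
import random
from datetime import datetime, timedelta


def randomize_trace_start(trace, max_shift_days=365, seed=None):
    """Shift all timestamps of one trace by the same random offset."""
    rng = random.Random(seed)
    offset = timedelta(seconds=rng.randint(0, max_shift_days * 86400))
    return [(activity, timestamp + offset) for activity, timestamp in trace]


def gaps(trace):
    """Time differences between consecutive events of a trace."""
    return [later[1] - earlier[1] for earlier, later in zip(trace, trace[1:])]


# Hypothetical trace with placeholder activity labels.
trace = [
    ("Registration", datetime(2014, 1, 1, 8, 0)),
    ("Triage", datetime(2014, 1, 1, 8, 20)),
    ("Lab Test", datetime(2014, 1, 1, 9, 45)),
]
shifted = randomize_trace_start(trace, seed=42)
print(gaps(shifted) == gaps(trace))
```

    Because only relative times survive such a transformation, analyses of durations and waiting times remain meaningful while calendar-based analyses (weekday, season) do not.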

  15. Event log with data attributes

    • figshare.com
    xml
    Updated Aug 31, 2022
    Cite
    Dina Bayomie (2022). Event log with data attributes [Dataset]. http://doi.org/10.6084/m9.figshare.20736706.v1
    Explore at:
    Available download formats: xml
    Dataset updated
    Aug 31, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Dina Bayomie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These 60 event logs vary in the number of cases and in the density of overlapping cases. Each log has the following event attributes: event id, case id, activity, timestamp, loan type, amount, resources, and status. BPMN scenarios were used to simulate the process.

  16. Hospital Billing - Event Log

    • figshare.com
    • data.4tu.nl
    txt
    Updated May 31, 2023
    Cite
    Felix Mannhardt (2023). Hospital Billing - Event Log [Dataset]. http://doi.org/10.4121/uuid:76c46b83-c930-4798-a1c9-4be94dfeb741
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Felix Mannhardt
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    The 'Hospital Billing' event log was obtained from the financial modules of the ERP system of a regional hospital. The event log contains events that are related to the billing of medical services that have been provided by the hospital. Each trace of the event log records the activities executed to bill a package of medical services that were bundled together. The event log does not contain information about the actual medical services provided by the hospital.

    The 100,000 traces in the event log are a random sample of process instances that were recorded over three years. Several attributes such as the 'state' of the process, the 'caseType', the underlying 'diagnosis' etc. are included in the event log. Events and attribute values have been anonymized. The time stamps of events have been randomized for this purpose, but the time between events within a trace has not been altered.

    More information about the event log can be found in the related publications.

  17. Dataset of mHealth event logs

    • figshare.com
    pdf
    Updated May 1, 2022
    Cite
    Raoul Nuijten; Pieter Van Gorp (2022). Dataset of mHealth event logs [Dataset]. http://doi.org/10.6084/m9.figshare.19688730.v2
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 1, 2022
    Dataset provided by
    figshare
    Authors
    Raoul Nuijten; Pieter Van Gorp
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    How does Facebook always seem to know what the next funny video should be to hold your attention on the platform? Facebook has not asked you whether you like videos of cats doing something funny; it just seems to know. In fact, Facebook learns from your behavior on the platform (e.g., how long you have engaged with similar videos, which posts you have previously liked or commented on, etc.). As a result, Facebook is able to sustain the attention of its users for a long time. Typical mHealth apps, on the other hand, suffer from rapidly collapsing user engagement levels. To sustain engagement, mHealth apps nowadays employ all sorts of intervention strategies. Of course, it would be powerful to know, as Facebook does, which strategy should be presented to which individual to sustain their engagement. A first step toward that could be to cluster similar users (and then derive intervention strategies from there).

    This dataset was collected through a single mHealth app over 8 different mHealth campaigns (i.e., scientific studies). Using this dataset, one could derive clusters from app user event data. One approach could be to differentiate between two phases: a process mining phase and a clustering phase. In the process mining phase, one may derive from the dataset the processes (i.e., sequences of app actions) that users undertake. In the clustering phase, based on the processes that different users engaged in, one may cluster similar users (i.e., users that perform similar sequences of app actions).

    List of files

    • 0-list-of-variables.pdf includes an overview of different variables within the dataset.
    • 1-description-of-endpoints.pdf includes a description of the unique endpoints that appear in the dataset.
    • 2-requests.csv includes the dataset with actual app user event data.
    • 2-requests-by-session.csv includes the dataset with actual app user event data with a session variable, to differentiate between user requests that were made in the same session.
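
    The two-phase idea from the description (derive each user's sequence of app actions, then group users with similar sequences) can be sketched as follows. This is an illustration only: the row layout (user, timestamp, endpoint), the endpoint names, and the choice to treat only identical sequences as similar are all assumptions, and the actual variables are documented in 0-list-of-variables.pdf.

```python
from collections import defaultdict


def user_sequences(requests):
    """Process mining phase: build each user's time-ordered sequence of endpoints."""
    per_user = defaultdict(list)
    for user, _, endpoint in sorted(requests, key=lambda row: (row[0], row[1])):
        per_user[user].append(endpoint)
    return dict(per_user)


def cluster_by_sequence(sequences):
    """Clustering phase: group users whose endpoint sequences are identical."""
    clusters = defaultdict(list)
    for user, sequence in sequences.items():
        clusters[tuple(sequence)].append(user)
    return sorted(sorted(users) for users in clusters.values())


# Hypothetical request rows: (user, timestamp, endpoint).
requests = [
    ("u1", 1, "/login"), ("u1", 2, "/challenges"), ("u1", 3, "/leaderboard"),
    ("u2", 1, "/login"), ("u2", 2, "/profile"),
    ("u3", 5, "/login"), ("u3", 6, "/challenges"), ("u3", 7, "/leaderboard"),
]
clusters = cluster_by_sequence(user_sequences(requests))
print(clusters)
```

    Exact-match grouping is the crudest form of clustering; a distance over sequences (for example edit distance) would let near-identical users fall into the same cluster.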

  18. Process Mining-Based Goal Recognition System Evaluation Dataset

    • figshare.unimelb.edu.au
    application/bzip2
    Updated Aug 11, 2023
    Cite
    Zihang Su (2023). Process Mining-Based Goal Recognition System Evaluation Dataset [Dataset]. http://doi.org/10.26188/21749570.v4
    Explore at:
    Available download formats: application/bzip2
    Dataset updated
    Aug 11, 2023
    Dataset provided by
    The University of Melbourne
    Authors
    Zihang Su
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    These datasets are used for evaluating the process mining-based goal recognition system proposed in the paper "Fast and Accurate Data-Driven Goal Recognition Using Process Mining Techniques." The datasets include a running example, an evaluation dataset for synthetic domains, and real-world business logs.

    running_example.tar.bz contains the traces shown in figure 2 of the paper for learning six skill models toward six goal candidates and the three walks shown in figure 1.a.

    synthetic_domains.tar.bz2 is the dataset for evaluating the GR system in synthetic domains (IPC domains). There are two types of traces used for learning skill models: those generated by the top-k planner and those generated by the diverse planner. Please extract the archived domains located in topk/ and diverse/. In each domain, the sub-folder problems/ contains the dataset for learning skill models, and the sub-folder test/ contains the traces (plans) for testing the GR performance. There are five levels of observations: 10%, 30%, 50%, 70%, and 100%. For each level of observation, there are multiple problem instances; the instance ID starts from 0. A problem instance contains the synthetic domain model (PDDL files), training traces (in train/), and an observation for testing (obs.dat). The top-k and diverse planners for generating traces can be accessed here. The original PDDL models of the problem instances for the 15 IPC domains mentioned in the paper are available here.

    business_logs.tar.bz is the dataset for evaluating the GR system in real-world domains. There are two types of problem instances: one with only two goal candidates (yes or no), referred to as "binary," and the other containing multiple goal candidates, termed "multiple." Please extract the archived files located in the directories binary/ and multiple/. The traces for learning the skill models can be found in XES files, and the traces (plans) for testing can be found in the directory goal*/.

  19. Statechart Workbench and Alignments Software Event Log

    • search.datacite.org
    • data.4tu.nl
    • +1 more
    Updated Aug 31, 2018
    Maikel Leemans (2018). Statechart Workbench and Alignments Software Event Log [Dataset]. http://doi.org/10.4121/uuid:7f787965-da13-4bb8-a3fd-242f08aef9c4
    Dataset updated
    Aug 31, 2018
    Dataset provided by
    DataCite (https://www.datacite.org/)
    Eindhoven University of Technology
    Authors
    Maikel Leemans
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Extensible Event Stream (XES) software event log obtained through instrumenting the Statechart Workbench ProM plugin using the tool available at https://svn.win.tue.nl/repos/prom/XPort/. This event log contains method-call level events describing a workbench run invoking the Alignments algorithm using the BPI Challenge 2012 event log, available and documented at https://doi.org/10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f. Note that the life-cycle information in this log corresponds to method call (start) and return (complete), and captures a method-call hierarchy.
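
    The start/complete life-cycle pairing described above can be turned back into a call hierarchy with a simple stack. The following is only a minimal sketch, not the tooling used to produce the log: it assumes the events of one trace have already been read into (method name, transition) pairs in log order, and the method names in the sample are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """One method invocation and the calls made while it was active."""
    name: str
    children: list = field(default_factory=list)


def build_call_tree(events):
    """Rebuild a call hierarchy from (method_name, transition) pairs.

    `transition` is assumed to be "start" (method call) or "complete"
    (method return); any other value is ignored.
    """
    root = Call("<root>")
    stack = [root]
    for name, transition in events:
        if transition == "start":
            call = Call(name)
            stack[-1].children.append(call)  # nested under the active call
            stack.append(call)
        elif transition == "complete":
            if len(stack) == 1 or stack[-1].name != name:
                raise ValueError(f"unmatched complete event for {name!r}")
            stack.pop()
    if len(stack) != 1:
        raise ValueError("trace ended while calls were still open")
    return root


def render(call, depth=0):
    """Return the tree as indented lines, two spaces per nesting level."""
    lines = ["  " * depth + call.name]
    for child in call.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical trace: main() calls load(), then align(), which calls cost().
sample_events = [
    ("main", "start"),
    ("load", "start"), ("load", "complete"),
    ("align", "start"),
    ("cost", "start"), ("cost", "complete"),
    ("align", "complete"),
    ("main", "complete"),
]
tree = build_call_tree(sample_events)
print("\n".join(render(tree)))
```

    A stack is sufficient here only because the sketch assumes a single thread whose calls nest properly; a log with interleaved threads would need one stack per thread.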

  20. Correlation Data for Species-Coverage-based Log Representativeness and TLRA

    • figshare.unimelb.edu.au
    xlsx
    Updated Aug 4, 2025
    Anandi Karunaratne; Artem Polyvyanyy (2025). Correlation Data for Species-Coverage-based Log Representativeness and TLRA [Dataset]. http://doi.org/10.26188/26410747.v1
    Available download formats: xlsx
    Dataset updated
    Aug 4, 2025
    Dataset provided by
    The University of Melbourne
    Authors
    Anandi Karunaratne; Artem Polyvyanyy
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset contains the correlation data for the species-coverage-based log representativeness measure and the Trace-based Log Representativeness Approximation (TLRA) across event logs of 60 generative systems, with varying log sizes and noise levels.

Alberto (2023). Event logs for process mining [Dataset]. https://www.kaggle.com/datasets/carlosalvite/car-insurance-claims-event-log-for-process-mining

Event logs for process mining

Gain hands-on experience with a real car insurance claims use case.

Available download formats: zip (4892593 bytes)
Dataset updated
Apr 11, 2023
Authors
Alberto
License

http://opendatacommons.org/licenses/dbcl/1.0/

Description

This event log has been artificially generated and curated to provide a comprehensive view of car insurance claims, allowing users to discover and identify bottlenecks, automation opportunities, conformance issues, reworks, and potential fraudulent cases using any process mining software.

You can find more event logs here: https://processminingdata.com/JfVPOR

Standard Process flow: “First Notification of Loss (FNOL)” -> “Assign Claim” -> “Claim Decision” -> “Set Reserve” -> “Payment Sent” -> “Close Claim”

Attributes:
- case ID
- activity name
- timestamp
- claimant name
- agent name
- adjuster name
- claim amount
- claimant age
- type of policy
- car make
- car model
- car year
- date and time of the accident
- type of accident
- user type

Total number of claims: 30,000

Dates: Claims belong to years 2020, 2021, and 2022.

Disclaimer: Personal names are fake.
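
As a rough illustration of the conformance and rework checks mentioned in the description, the sketch below groups events by case ID, orders them by timestamp, and flags cases whose activity sequence differs from the standard process flow. It is a sketch under stated assumptions, not part of the dataset: the (case ID, activity name, timestamp) tuple layout, the ISO timestamp format, and the sample rows are hypothetical, and the actual file's column names and formats should be checked first.

```python
from collections import defaultdict
from datetime import datetime

# Standard process flow as listed in the dataset description.
STANDARD_FLOW = [
    "First Notification of Loss (FNOL)",
    "Assign Claim",
    "Claim Decision",
    "Set Reserve",
    "Payment Sent",
    "Close Claim",
]


def traces_by_case(events):
    """Group (case_id, activity, iso_timestamp) rows into ordered traces."""
    grouped = defaultdict(list)
    for case_id, activity, timestamp in events:
        grouped[case_id].append((datetime.fromisoformat(timestamp), activity))
    return {
        case_id: [activity for _, activity in sorted(rows, key=lambda r: r[0])]
        for case_id, rows in grouped.items()
    }


def deviating_cases(events):
    """Return the sorted IDs of cases that do not follow STANDARD_FLOW exactly."""
    traces = traces_by_case(events)
    return sorted(c for c, trace in traces.items() if trace != STANDARD_FLOW)


def _make_events(case_id, activities, day):
    # Hypothetical helper: one event per hour, starting at 09:00 on the given day.
    return [
        (case_id, activity, f"2021-03-{day:02d}T{9 + i:02d}:00:00")
        for i, activity in enumerate(activities)
    ]


# Hypothetical sample: C1 follows the standard flow, C2 contains a rework loop.
rework_trace = STANDARD_FLOW[:3] + ["Assign Claim", "Claim Decision"] + STANDARD_FLOW[3:]
sample_events = _make_events("C1", STANDARD_FLOW, 1) + _make_events("C2", rework_trace, 2)

print(deviating_cases(sample_events))
```

Exact sequence comparison is the strictest possible check; dedicated process mining tools typically report how and where a trace deviates rather than a yes/no answer.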
