100+ datasets found

Table_1_Data Mining Techniques in Analyzing Process Data: A Didactic.pdf
frontiersin.figshare.com
pdf
Updated Jun 7, 2023
Cite
Xin Qiao; Hong Jiao (2023). Table_1_Data Mining Techniques in Analyzing Process Data: A Didactic.pdf [Dataset]. http://doi.org/10.3389/fpsyg.2018.02231.s001
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/fpsyg.2018.02231.s001
Dataset updated
Jun 7, 2023
Dataset provided by
Frontiers
Authors
Xin Qiao; Hong Jiao
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Due to the increasing use of technology-enhanced educational assessment, data mining methods have been explored to analyse process data in log files from such assessments. However, most studies were limited to one data mining technique under one specific scenario. The current study demonstrates the usage of four frequently used supervised techniques, namely Classification and Regression Trees (CART), gradient boosting, random forest, and support vector machine (SVM), and two unsupervised methods, Self-organizing Map (SOM) and k-means, fitted to a single assessment dataset. The USA sample (N = 426) from the 2012 Program for International Student Assessment (PISA) responding to problem-solving items is extracted to demonstrate the methods. After concrete feature generation and feature selection, classifier development procedures are implemented using the illustrated techniques. Results show satisfactory classification accuracy for all the techniques. Suggestions for the selection of classifiers are presented based on the research questions, the interpretability, and the simplicity of the classifiers. Interpretations for the results from both supervised and unsupervised learning methods are provided.
Data Mining in Systems Health Management
catalog.data.gov
data.nasa.gov
+2 more
Updated Apr 10, 2025
Cite
Dashlink (2025). Data Mining in Systems Health Management [Dataset]. https://catalog.data.gov/dataset/data-mining-in-systems-health-management
Dataset updated
Apr 10, 2025
Dataset provided by
Dashlink
Description
This chapter presents theoretical and practical aspects associated with the implementation of a combined model-based/data-driven approach for failure prognostics based on particle filtering algorithms, in which the current estimate of the state PDF is used to determine the operating condition of the system and predict the progression of a fault indicator, given a dynamic state model and a set of process measurements. In this approach, the task of estimating the current value of the fault indicator, as well as other important changing parameters in the environment, involves two basic steps: the prediction step, based on the process model, and an update step, which incorporates the new measurement into the a priori state estimate. This framework allows the probability of failure at future time instants (RUL PDF) to be estimated in real time, providing information about time-to-failure (TTF) expectations, statistical confidence intervals, and long-term predictions, using for this purpose empirical knowledge about critical conditions for the system (also referred to as the hazard zones). This information is of paramount significance for the improvement of the system reliability and cost-effective operation of critical assets, as has been shown in a case study where feedback correction strategies (based on uncertainty measures) have been implemented to lengthen the RUL of a rotorcraft transmission system with propagating fatigue cracks on a critical component. Although the feedback loop is implemented using simple linear relationships, it is helpful to provide a quick insight into the manner that the system reacts to changes on its input signals, in terms of its predicted RUL. The method is able to manage non-Gaussian PDFs since it includes concepts such as nonlinear state estimation and confidence intervals in its formulation.
Real data from a fault seeded test showed that the proposed framework was able to anticipate modifications on the system input to lengthen its RUL. Results of this test indicate that the method was able to successfully suggest the correction that the system required. In this sense, future work will be focused on the development and testing of similar strategies using different input-output uncertainty metrics.
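The prediction and update steps described in this abstract can be illustrated with a minimal bootstrap particle filter. This is only a sketch under stated assumptions, not the chapter's implementation: the linear drift model, the growth rate, the noise levels, and the function names `predict`, `update`, and `failure_probability` are all hypothetical, chosen to show how a set of particles approximates the state PDF and how the fraction of particles crossing a hazard threshold approximates the probability of failure at future time instants.

```python
import math
import random


def predict(particles, growth=0.05, process_std=0.02, rng=random):
    """Prediction step: push each particle through an assumed drift model."""
    return [x + growth + rng.gauss(0.0, process_std) for x in particles]


def update(particles, measurement, meas_std=0.1, rng=random):
    """Update step: weight particles by a Gaussian likelihood, then resample."""
    weights = [
        math.exp(-0.5 * ((measurement - x) / meas_std) ** 2) for x in particles
    ]
    if sum(weights) == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0] * len(particles)
    return rng.choices(particles, weights=weights, k=len(particles))


def failure_probability(particles, steps, threshold,
                        growth=0.05, process_std=0.02, rng=random):
    """Fraction of particles that reach the hazard threshold within `steps`."""
    current = list(particles)
    failed = [x >= threshold for x in current]
    for _ in range(steps):
        current = predict(current, growth, process_std, rng)
        failed = [f or x >= threshold for f, x in zip(failed, current)]
    return sum(failed) / len(failed)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical initial belief about the fault indicator.
    particles = [rng.gauss(0.0, 0.05) for _ in range(500)]
    # Hypothetical measurements of the fault indicator.
    for z in [0.05, 0.11, 0.16, 0.22]:
        particles = update(predict(particles, rng=rng), z, rng=rng)
    print("state estimate:", sum(particles) / len(particles))
    print("P(failure within 20 steps):",
          failure_probability(particles, steps=20, threshold=1.0, rng=rng))
```

Because the particles are propagated individually, the predicted distribution does not have to be Gaussian, which is the property the abstract highlights.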
Phylogenetics Workflows
figshare.com
zip
Updated Nov 6, 2019
Cite
Ahmed Halioui (2019). Phylogenetics Workflows [Dataset]. http://doi.org/10.6084/m9.figshare.10246952.v3
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.10246952.v3
Dataset updated
Nov 6, 2019
Dataset provided by
figshare
Authors
Ahmed Halioui
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset presents input/output data of the TGOWLeR framework. TGOWLeR abstracts general patterns from workflow sequences previously extracted from texts. It comprises two modules, a workflow extractor and a pattern miner, both relying on a specific domain ontology. The input of the first module is an RDF/ZIPPED ontology (tgowler_resource_ontologies_PHAGE_1.0.rdf.zip) and a set of un-annotated articles (e.g., datastore_2018_2019.zip). The proposed pipeline (implemented in GATE 8.1) produces annotated texts (e.g., annotated_2018_2019.zip) and their corresponding workflows (e.g., WFMiner_WF_2018_2019.xml). The input of the second module is the serialized ontology ... Publications: Halioui, Ahmed, et al. "Ontology-based workflow pattern mining: Application to bioinformatics expertise acquisition." Proceedings of the Symposium on Applied Computing. ACM, 2017. Halioui, Ahmed, Petko Valtchev, and Abdoulaye Baniré Diallo. "Bioinformatic Workflow Extraction from Scientific Texts based on Word Sense Disambiguation." IEEE/ACM Transactions on Computational Biology and Bioinformatics 15.6 (2018): 1979-1990.
Discovering Anomalous Aviation Safety Events Using Scalable Data Mining...
catalog.data.gov
datadiscoverystudio.org
+5 more
Updated Apr 10, 2025
Cite
Dashlink (2025). Discovering Anomalous Aviation Safety Events Using Scalable Data Mining Algorithms [Dataset]. https://catalog.data.gov/dataset/discovering-anomalous-aviation-safety-events-using-scalable-data-mining-algorithms
Dataset updated
Apr 10, 2025
Dataset provided by
Dashlink
Description
The worldwide civilian aviation system is one of the most complex dynamical systems created. Most modern commercial aircraft have onboard flight data recorders that record several hundred discrete and continuous parameters at approximately 1Hz for the entire duration of the flight. These data contain information about the flight control systems, actuators, engines, landing gear, avionics, and pilot commands. In this paper, recent advances in the development of a novel knowledge discovery process consisting of a suite of data mining techniques for identifying precursors to aviation safety incidents are discussed. The data mining techniques include scalable multiple-kernel learning for large-scale distributed anomaly detection. A novel multivariate time-series search algorithm is used to search for signatures of discovered anomalies on massive datasets. The process can identify operationally significant events due to environmental, mechanical, and human factors issues in the high-dimensional flight operations quality assurance data. All discovered anomalies are validated by a team of independent domain experts. This novel automated knowledge discovery process is aimed at complementing the state-of-the-art human-generated exceedance-based analysis that fails to discover previously unknown aviation safety incidents. In this paper, the discovery pipeline, the methods used, and some of the significant anomalies detected on real-world commercial aviation data are discussed.
Application of data analytics and mining across procurement process globally...
statista.com
Updated Jul 7, 2023
Cite
Statista (2023). Application of data analytics and mining across procurement process globally 2017 [Dataset]. https://www.statista.com/statistics/728137/worldwide-application-of-data-analytics-and-mining-across-procurement-process/
Dataset updated
Jul 7, 2023
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2017
Area covered
Worldwide
Description
This statistic displays the various applications of data analytics and mining across procurement processes, according to chief procurement officers (CPOs) worldwide, as of 2017. Fifty-seven percent of the CPOs asked agreed that data analytics and mining had been applied to intelligent and advanced analytics for negotiations, and 40 percent of them indicated data analytics and mining had been applied to supplier portfolio optimization processes.
Results obtained in a data mining process applied to a database containing...
scielo.figshare.com
jpeg
Updated Jun 4, 2023
Cite
E.M. Ruiz Lobaina; C. P. Romero Suárez (2023). Results obtained in a data mining process applied to a database containing bibliographic information concerning four segments of science. [Dataset]. http://doi.org/10.6084/m9.figshare.20011798.v1
Available download formats: jpeg
Unique identifier
https://doi.org/10.6084/m9.figshare.20011798.v1
Dataset updated
Jun 4, 2023
Dataset provided by
SciELO journals
Authors
E.M. Ruiz Lobaina; C. P. Romero Suárez
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract: The objective of this work is to improve the quality of the information in the CubaCiencia database of the Institute of Scientific and Technological Information. This database holds bibliographic information on four segments of science and is the main database of the Library Management System. The methodology applied data mining techniques for studying large volumes of information, including decision trees, the correlation matrix, and the 3D scatter plot. The results not only made it possible to improve the information in the database, but also provided truly useful patterns in the solution of the proposed objectives.
Video-to-Model Data Set
figshare.com
commons.datacite.org
xml
Updated Mar 24, 2020
Cite
Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz (2020). Video-to-Model Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.12026850.v1
Available download formats: xml
Unique identifier
https://doi.org/10.6084/m9.figshare.12026850.v1
Dataset updated
Mar 24, 2020
Dataset provided by
Figshare (http://figshare.com/)
Authors
Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data set belongs to the paper "Video-to-Model: Unsupervised Trace Extraction from Videos for Process Discovery and Conformance Checking in Manual Assembly", submitted on March 24, 2020, to the 18th International Conference on Business Process Management (BPM). Abstract: Manual activities are often hidden deep down in discrete manufacturing processes. For the elicitation and optimization of process behavior, complete information about the execution of manual activities is required. Thus, an approach is presented on how execution-level information can be extracted from videos in manual assembly. The goal is the generation of a log that can be used in state-of-the-art process mining tools. The test bed for the system was lightweight and scalable, consisting of an assembly workstation equipped with a single RGB camera recording only the hand movements of the worker from the top. A neural-network-based real-time object classifier was trained to detect the worker’s hands. The hand detector delivers the input for an algorithm, which generates trajectories reflecting the movement paths of the hands. Those trajectories are automatically assigned to work steps using the position of material boxes on the assembly shelf as reference points and hierarchical clustering of similar behaviors with dynamic time warping. The system has been evaluated in a task-based study with ten participants in a laboratory, but under realistic conditions. The generated logs have been loaded into the process mining toolkit ProM to discover the underlying process model and to detect deviations from both instructions and ground truth using conformance checking. The results show that process mining delivers insights about the assembly process and the system’s precision. The data set contains the generated and the annotated logs based on the video material gathered during the user study. In addition, the Petri nets from the process discovery and conformance checking conducted with ProM (http://www.promtools.org) and the reference nets modeled with Yasper (http://www.yasper.org/) are provided.
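The abstract mentions comparing hand trajectories with dynamic time warping (DTW) before clustering them. As a rough illustration of that distance measure only, the following sketch computes the classic DTW cost between two sequences of 2-D points. The function name `dtw_distance` and the sample coordinates are hypothetical, and the paper's actual trajectory representation, step pattern, and clustering setup are not specified in this description.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of 2-D points.

    Uses the standard dynamic program: each cell holds the cheapest
    cumulative cost of aligning the first i points of `a` with the
    first j points of `b`.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean point distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    # Hypothetical hand paths: the second one lingers at the start.
    path = [(0, 0), (1, 1), (2, 2)]
    slow_path = [(0, 0), (0, 0), (1, 1), (2, 2)]
    print(dtw_distance(path, slow_path))
```

DTW tolerates differences in speed between two executions of the same movement, which is why it suits trajectories recorded from different workers; the resulting pairwise distances could then feed a hierarchical clustering routine.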
Data Mining Software Market Report | Global Forecast From 2025 To 2033
dataintelo.com
csv, pdf, pptx
Updated Jan 7, 2025
Cite
Dataintelo (2025). Data Mining Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/data-mining-software-market
Available download formats: pdf, pptx, csv
Dataset updated
Jan 7, 2025
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Data Mining Software Market Outlook

The global data mining software market size was valued at USD 7.2 billion in 2023 and is projected to reach USD 15.5 billion by 2032, growing at a compound annual growth rate (CAGR) of 8.7% during the forecast period. This growth is driven primarily by the increasing adoption of big data analytics and the rising demand for business intelligence across various industries. As businesses increasingly recognize the value of data-driven decision-making, the market is expected to witness substantial growth.

One of the significant growth factors for the data mining software market is the exponential increase in data generation. With the proliferation of internet-enabled devices and the rapid advancement of technologies such as the Internet of Things (IoT), there is a massive influx of data. Organizations are now more focused than ever on harnessing this data to gain insights, improve operations, and create a competitive advantage. This has led to a surge in demand for advanced data mining tools that can process and analyze large datasets efficiently.

Another driving force is the growing need for personalized customer experiences. In industries such as retail, healthcare, and BFSI, understanding customer behavior and preferences is crucial. Data mining software enables organizations to analyze customer data, segment their audience, and deliver personalized offerings, ultimately enhancing customer satisfaction and loyalty. This drive towards personalization is further fueling the adoption of data mining solutions, contributing significantly to market growth.

The integration of artificial intelligence (AI) and machine learning (ML) technologies with data mining software is also a key growth factor. These advanced technologies enhance the capabilities of data mining tools by enabling them to learn from data patterns and make more accurate predictions. The convergence of AI and data mining is opening new avenues for businesses, allowing them to automate complex tasks, predict market trends, and make informed decisions more swiftly. The continuous advancements in AI and ML are expected to propel the data mining software market over the forecast period.

Regionally, North America holds a significant share of the data mining software market, driven by the presence of major technology companies and the early adoption of advanced analytics solutions. The Asia Pacific region is also expected to witness substantial growth due to the rapid digital transformation across various industries and the increasing investments in data infrastructure. Additionally, the growing awareness and implementation of data-driven strategies in emerging economies are contributing to the market expansion in this region.

Text Mining Software is becoming an integral part of the data mining landscape, offering unique capabilities to analyze unstructured data. As organizations generate vast amounts of textual data from various sources such as social media, emails, and customer feedback, the need for specialized tools to extract meaningful insights is growing. Text Mining Software enables businesses to process and analyze this data, uncovering patterns and trends that were previously hidden. This capability is particularly valuable in industries like marketing, customer service, and research, where understanding the nuances of language can lead to more informed decision-making. The integration of text mining with traditional data mining processes is enhancing the overall analytical capabilities of organizations, allowing them to derive comprehensive insights from both structured and unstructured data.

Component Analysis

The data mining software market is segmented by components, which primarily include software and services. The software segment encompasses various types of data mining tools that are used for analyzing and extracting valuable insights from raw data. These tools are designed to handle large volumes of data and provide advanced functionalities such as predictive analytics, data visualization, and pattern recognition. The increasing demand for sophisticated data analysis tools is driving the growth of the software segment. Enterprises are investing in these tools to enhance their data processing capabilities and derive actionable insights.

Within the software segment, the emergence of cloud-based data mining solutions is a notable trend. Cloud-based solutions offer several advantages, including s
Process mining application areas in companies in Russia 2021
statista.com
Updated Feb 15, 2022
Cite
Statista (2022). Process mining application areas in companies in Russia 2021 [Dataset]. https://www.statista.com/statistics/1289110/process-mining-application-areas-russia/
Dataset updated
Feb 15, 2022
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Sep 2021 - Oct 2021
Area covered
Russia
Description
In 2021, nearly two thirds of surveyed top managers of large companies operating in Russia viewed process mining as useful for purchasing. Furthermore, over 60 percent of respondents saw the technology's potential in improving the customer journey map and IT processes.
Data from: An IoT-Enriched Event Log for Process Mining in Smart Factories
zenodo.org
data.niaid.nih.gov
bin, txt, zip
Updated Jun 10, 2024
Cite
Lukas Malburg; Joscha Grüger; Ralph Bergmann (2024). An IoT-Enriched Event Log for Process Mining in Smart Factories [Dataset]. http://doi.org/10.6084/m9.figshare.20130794
Available download formats: txt, zip, bin
Unique identifier
https://doi.org/10.6084/m9.figshare.20130794
Dataset updated
Jun 10, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Lukas Malburg; Joscha Grüger; Ralph Bergmann
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
DEPRECATED - current version: https://figshare.com/articles/dataset/Dataset_An_IoT-Enriched_Event_Log_for_Process_Mining_in_Smart_Factories/20130794

Modern technologies such as the Internet of Things (IoT) are becoming increasingly important in various domains, including Business Process Management (BPM) research. One main research area in BPM is process mining, which can be used to analyze event logs, e.g., for checking the conformance of running processes. However, there are only a few IoT-based event logs available for research purposes. Some of them are artificially generated, and the problem occurs that they do not always completely reflect the actual physical properties of smart environments. In this paper, we present an IoT-enriched XES event log that is generated by a physical smart factory. For this purpose, we created the DataStream XES extension for representing IoT-data in event logs. Finally, we present some preliminary analysis and properties of the log.
Data Mining Tools Market Report | Global Forecast From 2025 To 2033
dataintelo.com
csv, pdf, pptx
Updated Apr 1, 2024
Cite
Dataintelo (2024). Data Mining Tools Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-data-mining-tools-market
Available download formats: pdf, pptx, csv
Dataset updated
Apr 1, 2024
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Data Mining Tools Market Outlook 2032

The global data mining tools market size was USD 932 Million in 2023 and is projected to reach USD 2,584.7 Million by 2032, expanding at a CAGR of 12% during 2024–2032. The market is fueled by the rising demand for big data analytics across various industries and the increasing need for AI-integrated data mining tools for insightful decision-making.
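The relationship between the start value, end value, and CAGR quoted above can be checked with a short calculation. This is a sketch under one assumption that the listing does not state: that growth compounds over the nine years from the 2023 base to 2032. The function names `cagr` and `project` are illustrative only.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Value reached after compounding `rate` annually for `years` years."""
    return start_value * (1.0 + rate) ** years


if __name__ == "__main__":
    # Figures from the listing, in USD million; the 9-year span is an assumption.
    implied = cagr(932.0, 2584.7, 9)
    print(f"implied CAGR: {implied:.2%}")
    print(f"932 compounded at 12% for 9 years: {project(932.0, 0.12, 9):.1f}")
```

Under that nine-year assumption the reported figures are consistent with a rate of roughly 12%; a different compounding window would give a different implied rate.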

Increasing adoption of cloud-based platforms in data mining tools fuels the market. This enhances scalability, flexibility, and cost-efficiency in data handling processes. Major tech companies are launching cloud-based data mining solutions, enabling businesses to analyze vast datasets effectively. This trend reflects the shift toward agile and scalable data analysis methods, meeting the dynamic needs of modern enterprises.

In July 2023, Microsoft launched Power Automate Process Mining. This tool, powered by advanced AI, allows companies to gain deep insights into their operations, streamline processes, and foster ongoing improvement through automation and low-code applications, marking a new era in business efficiency and process optimization.

Rising focus on predictive analytics propels the development of advanced data mining tools capable of forecasting future trends and behaviors. Industries such as finance, healthcare, and retail invest significantly in predictive analytics to gain a competitive edge, driving demand for sophisticated data mining technologies. This trend underscores the strategic importance of foresight in decision-making processes.

Visual data mining tools are gaining traction in the market, offering intuitive data exploration and interpretation capabilities. These tools enable users to uncover patterns and insights through graphical representations, making data analysis accessible to a broader audience. The launch of user-friendly visual data mining applications marks a significant step toward democratizing data analytics.

Impact of Artificial Intelligence (
Helpdesk
data.mendeley.com
Updated Dec 1, 2016
Cite
Ilya Verenich (2016). Helpdesk [Dataset]. http://doi.org/10.17632/39bp3vv62t.1
Unique identifier
https://doi.org/10.17632/39bp3vv62t.1
Dataset updated
Dec 1, 2016
Authors
Ilya Verenich
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset contains events from a ticketing management process of the help desk of an Italian software company. The process consists of 9 activities, and all cases start with the insertion of a new ticket into the ticketing management system. Each case ends when the issue is resolved and the ticket is closed. This log contains 3804 process instances (a.k.a. "cases") and 13710 events.
Process Mining Software Market By Enterprise Size (Large Enterprises And...
zionmarketresearch.com
pdf
Updated Jun 20, 2025
Cite
Zion Market Research (2025). Process Mining Software Market By Enterprise Size (Large Enterprises And Small & Medium Enterprises), By Type (Enhancement, Conformance, And Discovery), By Component (Services And Software), By Application (Hidden Problems, Ongoing Monitoring & Optimization, Business Processes, And Critical Process Intersections), and By Region: Global Industry Analysis, Size, Share, Growth, Trends, and Forecast, 2024-2032- [Dataset]. https://www.zionmarketresearch.com/report/process-mining-software-market
Available download formats: pdf
Dataset updated
Jun 20, 2025
Dataset provided by
Authors
Zion Market Research
License
https://www.zionmarketresearch.com/privacy-policy
Time period covered
2022 - 2030
Area covered
Global
Description
The global process mining software market is expected to reach revenue of around USD 41.74 billion by 2032, growing at a CAGR of around 42.86% between 2024 and 2032.
Educational Process Mining (EPM): A Learning Analytics Data Set Data Set
academictorrents.com
bittorrent
Updated Feb 11, 2016
Cite
Mehrnoosh Vahdat and Luca Oneto and Davide Anguita and Mathias Funk and Matthias Rauterberg (2016). Educational Process Mining (EPM): A Learning Analytics Data Set Data Set [Dataset]. https://academictorrents.com/details/e24e083cc337695bb84a2b68707695579c0ab4d8
Available download formats: bittorrent (4934446)
Dataset updated
Feb 11, 2016
Dataset authored and provided by
Mehrnoosh Vahdat and Luca Oneto and Davide Anguita and Mathias Funk and Matthias Rauterberg
License
https://academictorrents.com/nolicensespecified
Description
Data Set Information: The experiments have been carried out with a group of 115 first-year undergraduate Engineering students at the University of Genoa. We carried out this study over a simulation environment named Deeds (Digital Electronics Education and Design Suite), which is used for e-learning in digital electronics. The environment provides learning materials through specialized browsers for the students, and asks them to solve various problems with different levels of difficulty. For more information about the Deeds simulator used for this course look at: [Web Link] and to know more about the exercises contents of each session see exercises_info.txt. Our data set contains the students' time series of activities during six laboratory sessions of the digital electronics course. There are 6 folders containing the students' data per session. Each Session folder contains up to 99 CSV files each dedicated to a specific student log during that ses
Improving Editorial Workflow and Publication Retrievability at Springer...
ordo.open.ac.uk
xlsx
Updated Aug 21, 2020
Cite
Angelo Salatino; Francesco Osborne (2020). Improving Editorial Workflow and Publication Retrievability at Springer Nature [Dataset]. http://doi.org/10.21954/ou.rd.7951496.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.21954/ou.rd.7951496.v1
Dataset updated
Aug 21, 2020
Dataset provided by
The Open University
Authors
Angelo Salatino; Francesco Osborne
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Result of the survey for the paper: "Improving Editorial Workflow and Publication Retrievability at Springer Nature"
Data-driven Process Discovery - Artificial Event Log
figshare.com
zip
Updated Jun 2, 2023
Cite
Felix Mannhardt (2023). Data-driven Process Discovery - Artificial Event Log [Dataset]. http://doi.org/10.4121/uuid:32cad43f-8bb9-46af-8333-48aae2bea037
Available download formats: zip
Unique identifier
https://doi.org/10.4121/uuid:32cad43f-8bb9-46af-8333-48aae2bea037
Dataset updated
Jun 2, 2023
Dataset provided by
4TU.ResearchData
Authors
Felix Mannhardt
License
https://doi.org/10.4121/resource:terms_of_use
Description
A synthetic event log with 100,000 traces and 900,000 events that was generated by simulating a simple artificial process model. There are three data attributes in the event log: Priority, Nurse, and Type. Some paths in the model are recorded infrequently based on the value of these attributes. Noise is added by randomly adding one additional event to an increasing number of traces. CPN Tools (http://cpntools.org) was used to generate the event log and inject the noise.
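The noise-injection idea described above (adding one extra event to a chosen share of traces) can be sketched in a few lines. The dataset itself was produced with CPN Tools, so the following is not the author's procedure; the function name `inject_noise`, the activity labels, and the uniform choice of insert position are hypothetical choices made for illustration.

```python
import random


def inject_noise(traces, fraction, activities, rng=None):
    """Return a copy of `traces` in which a `fraction` of them gain one event.

    Each selected trace receives exactly one extra activity, drawn from
    `activities`, at a uniformly random position. The input is not modified.
    """
    rng = rng or random.Random()
    noisy = [list(trace) for trace in traces]
    count = round(fraction * len(noisy))
    for idx in rng.sample(range(len(noisy)), count):
        position = rng.randint(0, len(noisy[idx]))  # inclusive upper bound
        noisy[idx].insert(position, rng.choice(activities))
    return noisy


if __name__ == "__main__":
    # Hypothetical clean log: ten identical three-step traces.
    clean = [["Register", "Triage", "Discharge"] for _ in range(10)]
    noisy = inject_noise(clean, 0.3, ["X-Ray", "Blood Test"], random.Random(1))
    for trace in noisy:
        print(trace)
```

Calling the function repeatedly with a growing `fraction` would yield a series of logs with increasing noise levels, similar in spirit to the "increasing number of traces" mentioned in the description.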
Process mining integration plans in companies in Russia 2021
statista.com
Updated Feb 15, 2022
Cite
Statista (2022). Process mining integration plans in companies in Russia 2021 [Dataset]. https://www.statista.com/statistics/1289097/process-mining-integration-plans-russia/
Dataset updated
Feb 15, 2022
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Sep 2021 - Oct 2021
Area covered
Russia
Description
One fifth of surveyed top managers of large companies operating in Russia stated that their businesses either already had integrated process mining or were in the process of implementing it in 2021. Furthermore, nearly 30 percent revealed plans to integrate it in the following three years.
Data Mining Software in Australia - Market Research Report (2015-2030)
ibisworld.com
Updated Oct 27, 2022
Cite
IBISWorld (2022). Data Mining Software in Australia - Market Research Report (2015-2030) [Dataset]. https://www.ibisworld.com/australia/employment/data-mining-software/5598/
Dataset updated
Oct 27, 2022
Dataset authored and provided by
IBISWorld
License
https://www.ibisworld.com/about/termsofuse/
Time period covered
2015 - 2030
Area covered
Australia
Description
Companies in this industry develop software for data mining. Data mining is the process of extracting patterns from large data sets.
Data from: How are software repositories mined? A systematic literature...
zenodo.org
data.niaid.nih.gov
bin
Updated Sep 2, 2021
Cite
Anonymized for Review (2021). How are software repositories mined? A systematic literature review of workflows, methodologies, reproducibility, and tools [Dataset]. http://doi.org/10.5281/zenodo.5274208
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.5274208
Dataset updated
Sep 2, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Anonymized for Review
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the Excel spreadsheet dataset containing our analysis of papers performing mining software repositories (MSR) research from the conferences ICSE, ESEC/FSE, and MSR from the years 2018 - 2020. The data is broken into columns, which can be explained at a high level as follows:

Column    Content
1         The paper being analyzed
2         Does the paper state the data they analyzed is available
3         Does the paper perform some sort of data analysis or sampling using data others have compiled in the past
4         Does the paper state a timestamp for when they begin their work
5         Does the paper state the use of systems pre-built to help with MSR work
6 - 18    Forms of sampling researchers may have employed to select their data
19        What datasets (if any) were used in the analysis
20        What tools (if any) were used in the analysis
21        How they performed their data sampling workflow
22        How they performed their data filtering workflow
23        How they performed their data retrieval workflow
24        Did they create any scripts in each of these workflows
25 - 33   Did they publish a replication package and what is contained within
34        Is the paper describing a tool for research or not
35        Short description of the paper read
36        A high-level category of the work performed in each paper
Supplementary Material: Predictive model using Cross Industry Standard...
explore.openaire.eu
data.niaid.nih.gov
+1 more
Updated Apr 22, 2022
Cite
(2022). Supplementary Material: Predictive model using Cross Industry Standard Process for Data Mining [Dataset]. http://doi.org/10.5281/zenodo.6478177
Unique identifier
https://doi.org/10.5281/zenodo.6478177
Dataset updated
Apr 22, 2022
Description
The Supplementary Material of the paper "Supplementary Material: Predictive model using Cross Industry Standard Process for Data Mining" includes: 1) Appendix 1: SQL statements for data extraction; Appendix 2: interview for operating staff. 2) The dataset of normalized data used to define the predictive model.


