100+ datasets found
  1. Table_1_Data Mining Techniques in Analyzing Process Data: A Didactic.pdf

    • frontiersin.figshare.com
    pdf
    Updated Jun 7, 2023
    Cite
    Xin Qiao; Hong Jiao (2023). Table_1_Data Mining Techniques in Analyzing Process Data: A Didactic.pdf [Dataset]. http://doi.org/10.3389/fpsyg.2018.02231.s001
    Available download formats: pdf
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    Frontiers
    Authors
    Xin Qiao; Hong Jiao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Due to increasing use of technology-enhanced educational assessment, data mining methods have been explored to analyse process data in log files from such assessment. However, most studies were limited to one data mining technique under one specific scenario. The current study demonstrates the usage of four frequently used supervised techniques, namely Classification and Regression Trees (CART), gradient boosting, random forest, and support vector machine (SVM), and two unsupervised methods, Self-organizing Map (SOM) and k-means, fitted to data from one assessment. The USA sample (N = 426) from the 2012 Program for International Student Assessment (PISA) responding to problem-solving items is extracted to demonstrate the methods. After concrete feature generation and feature selection, classifier development procedures are implemented using the illustrated techniques. Results show satisfactory classification accuracy for all the techniques. Suggestions for the selection of classifiers are presented based on the research questions, the interpretability and the simplicity of the classifiers. Interpretations for the results from both supervised and unsupervised learning methods are provided.

  2. Data Mining in Systems Health Management

    • catalog.data.gov
    • data.nasa.gov
    • +2 more
    Updated Apr 10, 2025
    Cite
    Dashlink (2025). Data Mining in Systems Health Management [Dataset]. https://catalog.data.gov/dataset/data-mining-in-systems-health-management
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    This chapter presents theoretical and practical aspects associated with the implementation of a combined model-based/data-driven approach for failure prognostics based on particle filtering algorithms, in which the current estimate of the state PDF is used to determine the operating condition of the system and predict the progression of a fault indicator, given a dynamic state model and a set of process measurements. In this approach, the task of estimating the current value of the fault indicator, as well as other important changing parameters in the environment, involves two basic steps: the prediction step, based on the process model, and an update step, which incorporates the new measurement into the a priori state estimate. This framework allows the probability of failure at future time instants (RUL PDF) to be estimated in real time, providing information about time-to-failure (TTF) expectations, statistical confidence intervals, and long-term predictions, using for this purpose empirical knowledge about critical conditions for the system (also referred to as the hazard zones). This information is of paramount significance for the improvement of system reliability and the cost-effective operation of critical assets, as has been shown in a case study where feedback correction strategies (based on uncertainty measures) were implemented to lengthen the RUL of a rotorcraft transmission system with propagating fatigue cracks on a critical component. Although the feedback loop is implemented using simple linear relationships, it provides quick insight into the manner in which the system reacts to changes in its input signals, in terms of its predicted RUL. The method is able to manage non-Gaussian PDFs since it includes concepts such as nonlinear state estimation and confidence intervals in its formulation.
    Real data from a fault-seeded test showed that the proposed framework was able to anticipate modifications to the system input to lengthen its RUL. Results of this test indicate that the method was able to successfully suggest the correction that the system required. In this sense, future work will be focused on the development and testing of similar strategies using different input-output uncertainty metrics.
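
    The prediction and update steps described in this abstract can be sketched roughly as follows. This is a minimal illustration only, assuming a scalar fault indicator, a hypothetical constant-growth process model, and Gaussian noise; the function names, parameter values, and the plain bootstrap resampling are assumptions made for the example and are not taken from the chapter.

```python
import math
import random


def predict(particles, growth=0.05, process_std=0.02, rng=random):
    """Prediction step: push each particle through the (assumed) process model."""
    return [x + growth + rng.gauss(0.0, process_std) for x in particles]


def update(particles, measurement, meas_std=0.1, rng=random):
    """Update step: weight particles by measurement likelihood, then resample."""
    weights = [math.exp(-0.5 * ((measurement - x) / meas_std) ** 2) for x in particles]
    if sum(weights) == 0.0:  # every likelihood underflowed; keep the prior particles
        return list(particles)
    return rng.choices(particles, weights=weights, k=len(particles))


def steps_to_threshold(particles, threshold, growth=0.05, process_std=0.02,
                       max_steps=200, rng=random):
    """Propagate each particle without measurements until it enters the hazard zone."""
    result = []
    for x in particles:
        steps = 0
        while x < threshold and steps < max_steps:
            x += growth + rng.gauss(0.0, process_std)
            steps += 1
        result.append(steps)
    return result


# Hypothetical run: the true indicator grows by 0.05 per step and is observed with noise.
rng = random.Random(0)
particles = [rng.gauss(0.0, 0.1) for _ in range(500)]
for k in range(1, 11):
    measurement = 0.05 * k + rng.gauss(0.0, 0.1)
    particles = predict(particles, rng=rng)
    particles = update(particles, measurement, rng=rng)

estimate = sum(particles) / len(particles)                             # current state estimate
rul_samples = steps_to_threshold(particles, threshold=1.0, rng=rng)   # empirical RUL distribution
```

    The list `rul_samples` plays the role of the RUL PDF in this toy setting: its spread gives a rough confidence interval for the time to failure under the assumed threshold of 1.0.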

  3. Phylogenetics Workflows

    • figshare.com
    zip
    Updated Nov 6, 2019
    Cite
    Ahmed Halioui (2019). Phylogenetics Workflows [Dataset]. http://doi.org/10.6084/m9.figshare.10246952.v3
    Available download formats: zip
    Dataset updated
    Nov 6, 2019
    Dataset provided by
    figshare
    Authors
    Ahmed Halioui
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset presents input/output data of the TGOWLeR framework. TGOWLeR abstracts general patterns from workflow sequences previously extracted from texts. It comprises two modules (a workflow extractor and a pattern miner), both relying on a specific domain ontology. The input of the first module is an RDF/ZIPPED ontology (tgowler_resource_ontologies_PHAGE_1.0.rdf.zip) and a set of un-annotated articles (e.g., datastore_2018_2019.zip). The proposed pipeline (implemented in GATE 8.1) produces annotated texts (e.g., annotated_2018_2019.zip) and their corresponding workflows (e.g., WFMiner_WF_2018_2019.xml). The input of the second module is the serialized ontology ...
    Publications:
    Halioui, Ahmed, et al. "Ontology-based workflow pattern mining: Application to bioinformatics expertise acquisition." Proceedings of the Symposium on Applied Computing. ACM, 2017.
    Halioui, Ahmed, Petko Valtchev, and Abdoulaye Baniré Diallo. "Bioinformatic Workflow Extraction from Scientific Texts based on Word Sense Disambiguation." IEEE/ACM Transactions on Computational Biology and Bioinformatics 15.6 (2018): 1979-1990.

  4. Discovering Anomalous Aviation Safety Events Using Scalable Data Mining...

    • catalog.data.gov
    • datadiscoverystudio.org
    • +5 more
    Updated Apr 10, 2025
    + more versions
    Cite
    Dashlink (2025). Discovering Anomalous Aviation Safety Events Using Scalable Data Mining Algorithms [Dataset]. https://catalog.data.gov/dataset/discovering-anomalous-aviation-safety-events-using-scalable-data-mining-algorithms
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    The worldwide civilian aviation system is one of the most complex dynamical systems created. Most modern commercial aircraft have onboard flight data recorders that record several hundred discrete and continuous parameters at approximately 1Hz for the entire duration of the flight. These data contain information about the flight control systems, actuators, engines, landing gear, avionics, and pilot commands. In this paper, recent advances in the development of a novel knowledge discovery process consisting of a suite of data mining techniques for identifying precursors to aviation safety incidents are discussed. The data mining techniques include scalable multiple-kernel learning for large-scale distributed anomaly detection. A novel multivariate time-series search algorithm is used to search for signatures of discovered anomalies on massive datasets. The process can identify operationally significant events due to environmental, mechanical, and human factors issues in the high-dimensional flight operations quality assurance data. All discovered anomalies are validated by a team of independent domain experts. This novel automated knowledge discovery process is aimed at complementing the state-of-the-art human-generated exceedance-based analysis that fails to discover previously unknown aviation safety incidents. In this paper, the discovery pipeline, the methods used, and some of the significant anomalies detected on real-world commercial aviation data are discussed.

  5. Application of data analytics and mining across procurement process globally...

    • statista.com
    Updated Jul 7, 2023
    Cite
    Statista (2023). Application of data analytics and mining across procurement process globally 2017 [Dataset]. https://www.statista.com/statistics/728137/worldwide-application-of-data-analytics-and-mining-across-procurement-process/
    Dataset updated
    Jul 7, 2023
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2017
    Area covered
    Worldwide
    Description

    This statistic displays the various applications of data analytics and mining across procurement processes, according to chief procurement officers (CPOs) worldwide, as of 2017. Fifty-seven percent of the CPOs asked agreed that data analytics and mining had been applied to intelligent and advanced analytics for negotiations, and 40 percent of them indicated data analytics and mining had been applied to supplier portfolio optimization processes.

  6. Results obtained in a data mining process applied to a database containing...

    • scielo.figshare.com
    jpeg
    Updated Jun 4, 2023
    Cite
    E.M. Ruiz Lobaina; C. P. Romero Suárez (2023). Results obtained in a data mining process applied to a database containing bibliographic information concerning four segments of science. [Dataset]. http://doi.org/10.6084/m9.figshare.20011798.v1
    Available download formats: jpeg
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    SciELO journals
    Authors
    E.M. Ruiz Lobaina; C. P. Romero Suárez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: The objective of this work is to improve the quality of the information in the CubaCiencia database of the Institute of Scientific and Technological Information. This database holds bibliographic information on four segments of science and is the main database of the Library Management System. The applied methodology was based on Decision Trees, the Correlation Matrix, the 3D Scatter Plot, etc., which are techniques used in data mining for the study of large volumes of information. The results achieved not only made it possible to improve the information in the database, but also provided truly useful patterns in the solution of the proposed objectives.

  7. Video-to-Model Data Set

    • figshare.com
    • commons.datacite.org
    xml
    Updated Mar 24, 2020
    Cite
    Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz (2020). Video-to-Model Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.12026850.v1
    Available download formats: xml
    Dataset updated
    Mar 24, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set belongs to the paper "Video-to-Model: Unsupervised Trace Extraction from Videos for Process Discovery and Conformance Checking in Manual Assembly", submitted on March 24, 2020, to the 18th International Conference on Business Process Management (BPM).
    Abstract: Manual activities are often hidden deep down in discrete manufacturing processes. For the elicitation and optimization of process behavior, complete information about the execution of manual activities is required. Thus, an approach is presented on how execution-level information can be extracted from videos in manual assembly. The goal is the generation of a log that can be used in state-of-the-art process mining tools. The test bed for the system was lightweight and scalable, consisting of an assembly workstation equipped with a single RGB camera recording only the hand movements of the worker from above. A neural-network-based real-time object classifier was trained to detect the worker’s hands. The hand detector delivers the input for an algorithm that generates trajectories reflecting the movement paths of the hands. Those trajectories are automatically assigned to work steps using the position of material boxes on the assembly shelf as reference points and hierarchical clustering of similar behaviors with dynamic time warping. The system has been evaluated in a task-based study with ten participants in a laboratory, but under realistic conditions. The generated logs have been loaded into the process mining toolkit ProM to discover the underlying process model and to detect deviations from both instructions and ground truth, using conformance checking. The results show that process mining delivers insights about the assembly process and the system’s precision.
    The data set contains the generated and the annotated logs based on the video material gathered during the user study. In addition, the Petri nets from the process discovery and conformance checking conducted with ProM (http://www.promtools.org) and the reference nets modeled with Yasper (http://www.yasper.org/) are provided.

  8. Data Mining Software Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Data Mining Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/data-mining-software-market
    Available download formats: pdf, pptx, csv
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Mining Software Market Outlook

    The global data mining software market size was valued at USD 7.2 billion in 2023 and is projected to reach USD 15.5 billion by 2032, growing at a compound annual growth rate (CAGR) of 8.7% during the forecast period. This growth is driven primarily by the increasing adoption of big data analytics and the rising demand for business intelligence across various industries. As businesses increasingly recognize the value of data-driven decision-making, the market is expected to witness substantial growth.

    One of the significant growth factors for the data mining software market is the exponential increase in data generation. With the proliferation of internet-enabled devices and the rapid advancement of technologies such as the Internet of Things (IoT), there is a massive influx of data. Organizations are now more focused than ever on harnessing this data to gain insights, improve operations, and create a competitive advantage. This has led to a surge in demand for advanced data mining tools that can process and analyze large datasets efficiently.

    Another driving force is the growing need for personalized customer experiences. In industries such as retail, healthcare, and BFSI, understanding customer behavior and preferences is crucial. Data mining software enables organizations to analyze customer data, segment their audience, and deliver personalized offerings, ultimately enhancing customer satisfaction and loyalty. This drive towards personalization is further fueling the adoption of data mining solutions, contributing significantly to market growth.

    The integration of artificial intelligence (AI) and machine learning (ML) technologies with data mining software is also a key growth factor. These advanced technologies enhance the capabilities of data mining tools by enabling them to learn from data patterns and make more accurate predictions. The convergence of AI and data mining is opening new avenues for businesses, allowing them to automate complex tasks, predict market trends, and make informed decisions more swiftly. The continuous advancements in AI and ML are expected to propel the data mining software market over the forecast period.

    Regionally, North America holds a significant share of the data mining software market, driven by the presence of major technology companies and the early adoption of advanced analytics solutions. The Asia Pacific region is also expected to witness substantial growth due to the rapid digital transformation across various industries and the increasing investments in data infrastructure. Additionally, the growing awareness and implementation of data-driven strategies in emerging economies are contributing to the market expansion in this region.

    Text Mining Software is becoming an integral part of the data mining landscape, offering unique capabilities to analyze unstructured data. As organizations generate vast amounts of textual data from various sources such as social media, emails, and customer feedback, the need for specialized tools to extract meaningful insights is growing. Text Mining Software enables businesses to process and analyze this data, uncovering patterns and trends that were previously hidden. This capability is particularly valuable in industries like marketing, customer service, and research, where understanding the nuances of language can lead to more informed decision-making. The integration of text mining with traditional data mining processes is enhancing the overall analytical capabilities of organizations, allowing them to derive comprehensive insights from both structured and unstructured data.

    Component Analysis

    The data mining software market is segmented by components, which primarily include software and services. The software segment encompasses various types of data mining tools that are used for analyzing and extracting valuable insights from raw data. These tools are designed to handle large volumes of data and provide advanced functionalities such as predictive analytics, data visualization, and pattern recognition. The increasing demand for sophisticated data analysis tools is driving the growth of the software segment. Enterprises are investing in these tools to enhance their data processing capabilities and derive actionable insights.

    Within the software segment, the emergence of cloud-based data mining solutions is a notable trend. Cloud-based solutions offer several advantages, including s

  9. Process mining application areas in companies in Russia 2021

    • statista.com
    Updated Feb 15, 2022
    Cite
    Statista (2022). Process mining application areas in companies in Russia 2021 [Dataset]. https://www.statista.com/statistics/1289110/process-mining-application-areas-russia/
    Dataset updated
    Feb 15, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Sep 2021 - Oct 2021
    Area covered
    Russia
    Description

    Nearly two thirds of surveyed top managers of large companies operating in Russia viewed process mining as useful for purchasing, in 2021. Furthermore, over 60 percent of respondents saw the technology's potential in improving the customer journey map and IT processes.

  10. Data from: An IoT-Enriched Event Log for Process Mining in Smart Factories

    • zenodo.org
    • data.niaid.nih.gov
    bin, txt, zip
    Updated Jun 10, 2024
    + more versions
    Cite
    Lukas Malburg; Joscha Grüger; Ralph Bergmann (2024). An IoT-Enriched Event Log for Process Mining in Smart Factories [Dataset]. http://doi.org/10.6084/m9.figshare.20130794
    Available download formats: txt, zip, bin
    Dataset updated
    Jun 10, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Lukas Malburg; Joscha Grüger; Ralph Bergmann
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    DEPRECATED - current version: https://figshare.com/articles/dataset/Dataset_An_IoT-Enriched_Event_Log_for_Process_Mining_in_Smart_Factories/20130794

    Modern technologies such as the Internet of Things (IoT) are becoming increasingly important in various domains, including Business Process Management (BPM) research. One main research area in BPM is process mining, which can be used to analyze event logs, e.g., for checking the conformance of running processes. However, there are only a few IoT-based event logs available for research purposes. Some of them are artificially generated, and the problem occurs that they do not always completely reflect the actual physical properties of smart environments. In this paper, we present an IoT-enriched XES event log that is generated by a physical smart factory. For this purpose, we created the DataStream XES extension for representing IoT-data in event logs. Finally, we present some preliminary analysis and properties of the log.

  11. Data Mining Tools Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Apr 1, 2024
    Cite
    Dataintelo (2024). Data Mining Tools Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-data-mining-tools-market
    Available download formats: pdf, pptx, csv
    Dataset updated
    Apr 1, 2024
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Mining Tools Market Outlook 2032

    The global data mining tools market size was USD 932 Million in 2023 and is projected to reach USD 2,584.7 Million by 2032, expanding at a CAGR of 12% during 2024–2032. The market is fueled by the rising demand for big data analytics across various industries and the increasing need for AI-integrated data mining tools for insightful decision-making.
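
    As a quick consistency check on the figures quoted above, the implied compound annual growth rate can be recomputed from the two market sizes. The sketch below assumes nine compounding years (2023 base year to 2032); the function name is hypothetical and the report itself does not state how its CAGR was derived.

```python
def implied_cagr(start_value, end_value, years):
    """Return the constant annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the listing, in USD million; the year count is an assumption.
rate = implied_cagr(932.0, 2584.7, 2032 - 2023)
print(f"Implied CAGR: {rate:.1%}")
```

    Under that assumption the result comes out at roughly 12 percent, in line with the rate stated in the description.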

    Increasing adoption of cloud-based platforms in data mining tools fuels the market. This enhances scalability, flexibility, and cost-efficiency in data handling processes. Major tech companies are launching cloud-based data mining solutions, enabling businesses to analyze vast datasets effectively. This trend reflects the shift toward agile and scalable data analysis methods, meeting the dynamic needs of modern enterprises.

    • In July 2023, Microsoft launched Power Automate Process Mining. This tool, powered by advanced AI, allows companies to gain deep insights into their operations, streamline processes, and foster ongoing improvement through automation and low-code applications, marking a new era in business efficiency and process optimization.

    Rising focus on predictive analytics propels the development of advanced data mining tools capable of forecasting future trends and behaviors. Industries such as finance, healthcare, and retail invest significantly in predictive analytics to gain a competitive edge, driving demand for sophisticated data mining technologies. This trend underscores the strategic importance of foresight in decision-making processes.

    Visual data mining tools are gaining traction in the market, offering intuitive data exploration and interpretation capabilities. These tools enable users to uncover patterns and insights through graphical representations, making data analysis accessible to a broader audience. The launch of user-friendly visual data mining applications marks a significant step toward democratizing data analytics.

    Impact of Artificial Intelligence (

  12. Helpdesk

    • data.mendeley.com
    Updated Dec 1, 2016
    Cite
    Ilya Verenich (2016). Helpdesk [Dataset]. http://doi.org/10.17632/39bp3vv62t.1
    Dataset updated
    Dec 1, 2016
    Authors
    Ilya Verenich
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset contains events from a ticketing management process of the help desk of an Italian software company. The process consists of 9 activities, and all cases start with the insertion of a new ticket into the ticketing management system. Each case ends when the issue is resolved and the ticket is closed. This log contains 3804 process instances (a.k.a. "cases") and 13710 events.

  13. Process Mining Software Market By Enterprise Size (Large Enterprises And...

    • zionmarketresearch.com
    pdf
    Updated Jun 20, 2025
    Cite
    Zion Market Research (2025). Process Mining Software Market By Enterprise Size (Large Enterprises And Small & Medium Enterprises), By Type (Enhancement, Conformance, And Discovery), By Component (Services And Software), By Application (Hidden Problems, Ongoing Monitoring & Optimization, Business Processes, And Critical Process Intersections), and By Region: Global Industry Analysis, Size, Share, Growth, Trends, and Forecast, 2024-2032 [Dataset]. https://www.zionmarketresearch.com/report/process-mining-software-market
    Available download formats: pdf
    Dataset updated
    Jun 20, 2025
    Dataset provided by
    Authors
    Zion Market Research
    License

    https://www.zionmarketresearch.com/privacy-policy

    Time period covered
    2022 - 2030
    Area covered
    Global
    Description

    The global process mining software market is expected to reach revenue of around USD 41.74 billion by 2032, growing at a CAGR of around 42.86% between 2024 and 2032.

  14. Educational Process Mining (EPM): A Learning Analytics Data Set Data Set

    • academictorrents.com
    bittorrent
    Updated Feb 11, 2016
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mehrnoosh Vahdat and Luca Oneto and Davide Anguita and Mathias Funk and Matthias Rauterberg (2016). Educational Process Mining (EPM): A Learning Analytics Data Set Data Set [Dataset]. https://academictorrents.com/details/e24e083cc337695bb84a2b68707695579c0ab4d8
    Available download formats: bittorrent (4934446)
    Dataset updated
    Feb 11, 2016
    Dataset authored and provided by
    Mehrnoosh Vahdat and Luca Oneto and Davide Anguita and Mathias Funk and Matthias Rauterberg
    License

    https://academictorrents.com/nolicensespecified

    Description

    Data Set Information: The experiments were carried out with a group of 115 first-year undergraduate Engineering students at the University of Genoa. We carried out this study using a simulation environment named Deeds (Digital Electronics Education and Design Suite), which is used for e-learning in digital electronics. The environment provides learning materials to the students through specialized browsers and asks them to solve various problems with different levels of difficulty. For more information about the Deeds simulator used for this course, look at: [Web Link]; to learn more about the exercise contents of each session, see exercises_info.txt. Our data set contains the students' time series of activities during six laboratory sessions of the digital electronics course. There are 6 folders containing the students’ data per session. Each Session folder contains up to 99 CSV files each dedicated to a specific student log during that ses

  15. Improving Editorial Workflow and Publication Retrievability at Springer...

    • ordo.open.ac.uk
    xlsx
    Updated Aug 21, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Angelo Salatino; Francesco Osborne (2020). Improving Editorial Workflow and Publication Retrievability at Springer Nature [Dataset]. http://doi.org/10.21954/ou.rd.7951496.v1
    Available download formats: xlsx
    Dataset updated
    Aug 21, 2020
    Dataset provided by
    The Open University
    Authors
    Angelo Salatino; Francesco Osborne
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Result of the survey for the paper: "Improving Editorial Workflow and Publication Retrievability at Springer Nature"

  16. Data-driven Process Discovery - Artificial Event Log

    • figshare.com
    zip
    Updated Jun 2, 2023
    Cite
    Felix Mannhardt (2023). Data-driven Process Discovery - Artificial Event Log [Dataset]. http://doi.org/10.4121/uuid:32cad43f-8bb9-46af-8333-48aae2bea037
    Available download formats: zip
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Felix Mannhardt
    License

    https://doi.org/10.4121/resource:terms_of_use

    Description

    A synthetic event log with 100,000 traces and 900,000 events that was generated by simulating a simple artificial process model. There are three data attributes in the event log: Priority, Nurse, and Type. Some paths in the model are recorded infrequently based on the value of these attributes. Noise is added by randomly adding one additional event to an increasing number of traces. CPN Tools (http://cpntools.org) was used to generate the event log and inject the noise.
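
    The noise scheme described above (one additional event added to a chosen share of traces) can be illustrated with a small sketch. It assumes traces are plain lists of activity labels and that the extra event is drawn uniformly from a given activity set; the function and variable names are hypothetical, and the actual log was produced with CPN Tools, not with this code.

```python
import random


def inject_noise(traces, fraction, activities, rng):
    """Return a copy of traces where a given fraction has one random extra event."""
    noisy = [list(trace) for trace in traces]           # leave the input untouched
    n_noisy = round(fraction * len(noisy))
    for idx in rng.sample(range(len(noisy)), n_noisy):  # distinct traces to perturb
        position = rng.randrange(len(noisy[idx]) + 1)   # any slot, including the end
        noisy[idx].insert(position, rng.choice(activities))
    return noisy


# Toy log: 100 identical traces, 20% of which receive one additional event.
clean_log = [["A", "B", "C"] for _ in range(100)]
noisy_log = inject_noise(clean_log, 0.2, ["A", "B", "C", "D"], random.Random(1))
```

    Calling the function with increasing `fraction` values would yield a family of logs with growing noise levels, which is one plausible reading of "an increasing number of traces".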

  17. Process mining integration plans in companies in Russia 2021

    • statista.com
    Updated Feb 15, 2022
    Cite
    Statista (2022). Process mining integration plans in companies in Russia 2021 [Dataset]. https://www.statista.com/statistics/1289097/process-mining-integration-plans-russia/
    Dataset updated
    Feb 15, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Sep 2021 - Oct 2021
    Area covered
    Russia
    Description

    One fifth of surveyed top managers of large companies operating in Russia stated that their businesses either already had integrated process mining or were in the process of implementing it in 2021. Furthermore, nearly 30 percent revealed plans to integrate it in the following three years.

  18. Data Mining Software in Australia - Market Research Report (2015-2030)

    • ibisworld.com
    Updated Oct 27, 2022
    Cite
    IBISWorld (2022). Data Mining Software in Australia - Market Research Report (2015-2030) [Dataset]. https://www.ibisworld.com/australia/employment/data-mining-software/5598/
    Dataset updated
    Oct 27, 2022
    Dataset authored and provided by
    IBISWorld
    License

    https://www.ibisworld.com/about/termsofuse/

    Time period covered
    2015 - 2030
    Area covered
    Australia
    Description

    Companies in this industry develop software for data mining. Data mining is the process of extracting patterns from large data sets.

  19. Data from: How are software repositories mined? A systematic literature...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Sep 2, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Anonymized for Review; Anonymized for Review (2021). How are software repositories mined? A systematic literature review of workflows, methodologies, reproducibility, and tools [Dataset]. http://doi.org/10.5281/zenodo.5274208
    Available download formats: bin
    Dataset updated
    Sep 2, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anonymized for Review; Anonymized for Review
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This Excel spreadsheet contains our analysis of papers that perform mining software repositories research, drawn from the conferences ICSE, ESEC/FSE, and MSR for the years 2018 to 2020. The data is organized into columns, summarized at a high level as follows:

    Column Content

    1 The paper being analyzed

    2 Does the paper state the data they analyzed is available

    3 Does the paper perform some sort of data analysis or sampling using data others have compiled in the past

    4 Does the paper state a timestamp for when they begin their work

    5 Does the paper state the use of systems pre-built to help with MSR work

    6 - 18 Forms of sampling researchers may have employed to select their data

    19 What datasets (if any) were used in the analysis

    20 What tools (if any) were used in the analysis

    21 How they performed their data sampling workflow

    22 How they performed their data filtering workflow

    23 How they performed their data retrieval workflow

    24 Did they create any scripts in each of these workflows

    25 - 33 Did they publish a replication package and what is contained within

    34 Is the paper describing a tool for research or not

    35 Short description of the paper read

    36 A high-level category of the work performed in each paper

  20. Supplementary Material: Predictive model using Cross Industry Standard...

    • explore.openaire.eu
    • data.niaid.nih.gov
    Updated Apr 22, 2022
    Cite
    (2022). Supplementary Material: Predictive model using Cross Industry Standard Process for Data Mining [Dataset]. http://doi.org/10.5281/zenodo.6478177
    Dataset updated
    Apr 22, 2022
    Description

    The supplementary material for the paper "Supplementary Material: Predictive model using Cross Industry Standard Process for Data Mining" includes: 1) Appendix 1, SQL statements for data extraction, and Appendix 2, the interview for operating staff; 2) the dataset of normalized data used to define the predictive model.
