100+ datasets found
  1. d

    Data Mining in Systems Health Management

    • catalog.data.gov
    • s.cnmilf.com
    • +1more
    Updated Apr 10, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Data Mining in Systems Health Management [Dataset]. https://catalog.data.gov/dataset/data-mining-in-systems-health-management
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    This chapter presents theoretical and practical aspects associated to the implementation of a combined model-based/data-driven approach for failure prognostics based on particle filtering algorithms, in which the current esti- mate of the state PDF is used to determine the operating condition of the system and predict the progression of a fault indicator, given a dynamic state model and a set of process measurements. In this approach, the task of es- timating the current value of the fault indicator, as well as other important changing parameters in the environment, involves two basic steps: the predic- tion step, based on the process model, and an update step, which incorporates the new measurement into the a priori state estimate. This framework allows to estimate of the probability of failure at future time instants (RUL PDF) in real-time, providing information about time-to- failure (TTF) expectations, statistical confidence intervals, long-term predic- tions; using for this purpose empirical knowledge about critical conditions for the system (also referred to as the hazard zones). This information is of paramount significance for the improvement of the system reliability and cost-effective operation of critical assets, as it has been shown in a case study where feedback correction strategies (based on uncertainty measures) have been implemented to lengthen the RUL of a rotorcraft transmission system with propagating fatigue cracks on a critical component. Although the feed- back loop is implemented using simple linear relationships, it is helpful to provide a quick insight into the manner that the system reacts to changes on its input signals, in terms of its predicted RUL. The method is able to manage non-Gaussian pdf’s since it includes concepts such as nonlinear state estimation and confidence intervals in its formulation. Real data from a fault seeded test showed that the proposed framework was able to anticipate modifications on the system input to lengthen its RUL. Results of this test indicate that the method was able to successfully suggest the correction that the system required. In this sense, future work will be focused on the development and testing of similar strategies using different input-output uncertainty metrics.

  2. Data Mining HW 3 Task 1 dataset

    • kaggle.com
    zip
    Updated Nov 17, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Nikunj Phutela (2023). Data Mining HW 3 Task 1 dataset [Dataset]. https://www.kaggle.com/datasets/nikunjphutela/data-mining-hw-3-task-1-dataset
    Explore at:
    zip(2276319 bytes)Available download formats
    Dataset updated
    Nov 17, 2023
    Authors
    Nikunj Phutela
    Description

    Dataset

    This dataset was created by Nikunj Phutela

    Contents

  3. g

    Data from: Linked Data Mining Challenge RM Set

    • search.gesis.org
    • da-ra.de
    Updated Nov 5, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Schaible, Johann (2025). Linked Data Mining Challenge RM Set [Dataset]. https://search.gesis.org/research_data/SDN-10.7802-78
    Explore at:
    Dataset updated
    Nov 5, 2025
    Dataset provided by
    GESIS, Köln
    GESIS search
    Authors
    Schaible, Johann
    License

    https://www.gesis.org/en/institute/data-usage-termshttps://www.gesis.org/en/institute/data-usage-terms

    Description

    Rapid Miner Process files and XML test set including the predicted labels for the Linked Data Mining Challenge 2015.

  4. d

    Data from: Mining Distance-Based Outliers in Near Linear Time

    • catalog.data.gov
    • datasets.ai
    Updated Apr 11, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Mining Distance-Based Outliers in Near Linear Time [Dataset]. https://catalog.data.gov/dataset/mining-distance-based-outliers-in-near-linear-time
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    Full title: Mining Distance-Based Outliers in Near Linear Time with Randomization and a Simple Pruning Rule Abstract: Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal of finding fast algorithms for this task. We show that a simple nested loop algorithm that in the worst case is quadratic can give near linear time performance when the data is in random order and a simple pruning rule is used. We test our algorithm on real high-dimensional data sets with millions of examples and show that the near linear scaling holds over several orders of magnitude. Our average case analysis suggests that much of the efficiency is because the time to process non-outliers, which are the majority of examples, does not depend on the size of the data set.

  5. A Local Distributed Peer-to-Peer Algorithm Using Multi-Party Optimization...

    • data.nasa.gov
    Updated Mar 31, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    nasa.gov (2025). A Local Distributed Peer-to-Peer Algorithm Using Multi-Party Optimization Based Privacy Preservation for Data Mining Primitive Computation - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/a-local-distributed-peer-to-peer-algorithm-using-multi-party-optimization-based-privacy-pr
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASAhttp://nasa.gov/
    Description

    This paper proposes a scalable, local privacy-preserving algorithm for distributed peer-to-peer (P2P) data aggregation useful for many advanced data mining/analysis tasks such as average/sum computation, decision tree induction, feature selection, and more. Unlike most multi-party privacy-preserving data mining algorithms, this approach works in an asynchronous manner through local interactions and therefore, is highly scalable. It particularly deals with the distributed computation of the sum of a set of numbers stored at different peers in a P2P network in the context of a P2P web mining application. The proposed optimization-based privacy-preserving technique for computing the sum allows different peers to specify different privacy requirements without having to adhere to a global set of parameters for the chosen privacy model. Since distributed sum computation is a frequently used primitive, the proposed approach is likely to have significant impact on many data mining tasks such as multi-party privacypreserving clustering, frequent itemset mining, and statistical aggregate computation.

  6. G

    Data Mining Tools Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 4, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Growth Market Reports (2025). Data Mining Tools Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-mining-tools-market
    Explore at:
    pdf, csv, pptxAvailable download formats
    Dataset updated
    Aug 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Mining Tools Market Outlook




    According to our latest research, the global Data Mining Tools market size reached USD 1.93 billion in 2024, reflecting robust industry momentum. The market is expected to grow at a CAGR of 12.7% from 2025 to 2033, reaching a projected value of USD 5.69 billion by 2033. This growth is primarily driven by the increasing adoption of advanced analytics across diverse industries, rapid digital transformation, and the necessity for actionable insights from massive data volumes.




    One of the pivotal growth factors propelling the Data Mining Tools market is the exponential rise in data generation, particularly through digital channels, IoT devices, and enterprise applications. Organizations across sectors are leveraging data mining tools to extract meaningful patterns, trends, and correlations from structured and unstructured data. The need for improved decision-making, operational efficiency, and competitive advantage has made data mining an essential component of modern business strategies. Furthermore, advancements in artificial intelligence and machine learning are enhancing the capabilities of these tools, enabling predictive analytics, anomaly detection, and automation of complex analytical tasks, which further fuels market expansion.




    Another significant driver is the growing demand for customer-centric solutions in industries such as retail, BFSI, and healthcare. Data mining tools are increasingly being used for customer relationship management, targeted marketing, fraud detection, and risk management. By analyzing customer behavior and preferences, organizations can personalize their offerings, optimize marketing campaigns, and mitigate risks. The integration of data mining tools with cloud platforms and big data technologies has also simplified deployment and scalability, making these solutions accessible to small and medium-sized enterprises (SMEs) as well as large organizations. This democratization of advanced analytics is creating new growth avenues for vendors and service providers.




    The regulatory landscape and the increasing emphasis on data privacy and security are also shaping the development and adoption of Data Mining Tools. Compliance with frameworks such as GDPR, HIPAA, and CCPA necessitates robust data governance and transparent analytics processes. Vendors are responding by incorporating features like data masking, encryption, and audit trails into their solutions, thereby enhancing trust and adoption among regulated industries. Additionally, the emergence of industry-specific data mining applications, such as fraud detection in BFSI and predictive diagnostics in healthcare, is expanding the addressable market and fostering innovation.




    From a regional perspective, North America currently dominates the Data Mining Tools market owing to the early adoption of advanced analytics, strong presence of leading technology vendors, and high investments in digital transformation. However, the Asia Pacific region is emerging as a lucrative market, driven by rapid industrialization, expansion of IT infrastructure, and growing awareness of data-driven decision-making in countries like China, India, and Japan. Europe, with its focus on data privacy and digital innovation, also represents a significant market share, while Latin America and the Middle East & Africa are witnessing steady growth as organizations in these regions modernize their operations and adopt cloud-based analytics solutions.





    Component Analysis




    The Component segment of the Data Mining Tools market is bifurcated into Software and Services. Software remains the dominant segment, accounting for the majority of the market share in 2024. This dominance is attributed to the continuous evolution of data mining algorithms, the proliferation of user-friendly graphical interfaces, and the integration of advanced analytics capabilities such as machine learning, artificial intelligence, and natural language pro

  7. b

    Data from: Ontology of Core Data Mining Entities

    • bioregistry.io
    Updated Jul 5, 2014
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2014). Ontology of Core Data Mining Entities [Dataset]. https://bioregistry.io/ontodm
    Explore at:
    Dataset updated
    Jul 5, 2014
    Description

    OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising of a specification, an implementation and an application layer. It provides a representational framework for the description of mining structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications/use cases, such as semantic annotation of data mining algorithms, datasets and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. (from abstract)

  8. Data from: Mining significant crisp-fuzzy spatial association rules

    • tandf.figshare.com
    pdf
    Updated May 30, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Wenzhong Shi; Anshu Zhang; Geoffrey I. Webb (2023). Mining significant crisp-fuzzy spatial association rules [Dataset]. http://doi.org/10.6084/m9.figshare.5873139.v1
    Explore at:
    pdfAvailable download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francishttps://taylorandfrancis.com/
    Authors
    Wenzhong Shi; Anshu Zhang; Geoffrey I. Webb
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Spatial association rule mining (SARM) is an important data mining task for understanding implicit and sophisticated interactions in spatial data. The usefulness of SARM results, represented as sets of rules, depends on their reliability: the abundance of rules, control over the risk of spurious rules, and accuracy of rule interestingness measure (RIM) values. This study presents crisp-fuzzy SARM, a novel SARM method that can enhance the reliability of resultant rules. The method firstly prunes dubious rules using statistically sound tests and crisp supports for the patterns involved, and then evaluates RIMs of accepted rules using fuzzy supports. For the RIM evaluation stage, the study also proposes a Gaussian-curve-based fuzzy data discretization model for SARM with improved design for spatial semantics. The proposed techniques were evaluated by both synthetic and real-world data. The synthetic data was generated with predesigned rules and RIM values, thus the reliability of SARM results could be confidently and quantitatively evaluated. The proposed techniques showed high efficacy in enhancing the reliability of SARM results in all three aspects. The abundance of resultant rules was improved by 50% or more compared with using conventional fuzzy SARM. Minimal risk of spurious rules was guaranteed by statistically sound tests. The probability that the entire result contained any spurious rules was below 1%. The RIM values also avoided large positive errors committed by crisp SARM, which typically exceeded 50% for representative RIMs. The real-world case study on New York City points of interest reconfirms the improved reliability of crisp-fuzzy SARM results, and demonstrates that such improvement is critical for practical spatial data analytics and decision support.

  9. R

    Data Mining Kel 11 Dataset

    • universe.roboflow.com
    zip
    Updated Oct 29, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Data Mining (2025). Data Mining Kel 11 Dataset [Dataset]. https://universe.roboflow.com/data-mining-mtwls/data-mining-kel-11-zp4xe
    Explore at:
    zipAvailable download formats
    Dataset updated
    Oct 29, 2025
    Dataset authored and provided by
    Data Mining
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Beras
    Description

    Data Mining Kel 11

    ## Overview
    
    Data Mining Kel 11 is a dataset for classification tasks - it contains Beras annotations for 59,785 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
    
  10. BIL3013 - Data Mining - Assignment 2 - Data

    • kaggle.com
    zip
    Updated Nov 12, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    ozgurd5 (2024). BIL3013 - Data Mining - Assignment 2 - Data [Dataset]. https://www.kaggle.com/datasets/ozgurd5/bil3013-data-mining-assignment-2-data
    Explore at:
    zip(66264 bytes)Available download formats
    Dataset updated
    Nov 12, 2024
    Authors
    ozgurd5
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by ozgurd5

    Released under Apache 2.0

    Contents

  11. R

    Data Mining Test Dataset

    • universe.roboflow.com
    zip
    Updated Oct 20, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    ons (2025). Data Mining Test Dataset [Dataset]. https://universe.roboflow.com/ons-eykpy/data-mining-test-fjlw4/dataset/1
    Explore at:
    zipAvailable download formats
    Dataset updated
    Oct 20, 2025
    Dataset authored and provided by
    ons
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Cars Damage Cars Bounding Boxes
    Description

    Data Mining Test

    ## Overview
    
    Data Mining Test is a dataset for object detection tasks - it contains Cars Damage Cars annotations for 382 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
    
  12. R

    Data Mining Dataset

    • universe.roboflow.com
    zip
    Updated Aug 4, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    ilham project (2023). Data Mining Dataset [Dataset]. https://universe.roboflow.com/ilham-project/data-mining-n52lu/model/1
    Explore at:
    zipAvailable download formats
    Dataset updated
    Aug 4, 2023
    Dataset authored and provided by
    ilham project
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Uangrupiah Bounding Boxes
    Description

    Data Mining

    ## Overview
    
    Data Mining is a dataset for object detection tasks - it contains Uangrupiah annotations for 692 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [Public Domain license](https://creativecommons.org/licenses/Public Domain).
    
  13. f

    Credit Requirement Event Logs

    • figshare.com
    xml
    Updated Jun 11, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Almir Djedović (2023). Credit Requirement Event Logs [Dataset]. http://doi.org/10.4121/uuid:453e8ad1-4df0-4511-a916-93f46a37a1b5
    Explore at:
    xmlAvailable download formats
    Dataset updated
    Jun 11, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Almir Djedović
    License

    https://doi.org/10.4121/resource:terms_of_usehttps://doi.org/10.4121/resource:terms_of_use

    Description

    This dataset contains information about a credit requirement process in a bank. It contains data about events, time execution etc.

  14. f

    Credit Requirement Event Logs with Random Attributes and Errors

    • figshare.com
    • data.4tu.nl
    txt
    Updated Jun 16, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Almir Djedović (2023). Credit Requirement Event Logs with Random Attributes and Errors [Dataset]. http://doi.org/10.4121/uuid:35e6200e-ff3c-4019-af1e-c1ee8df19977
    Explore at:
    txtAvailable download formats
    Dataset updated
    Jun 16, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Almir Djedović
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This MXML events logs contain random attributes and errors and it is used for data preprocessing.

  15. Pump it Up: Data Mining the Water Table

    • kaggle.com
    zip
    Updated Feb 8, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dylan (2018). Pump it Up: Data Mining the Water Table [Dataset]. https://www.kaggle.com/dylanli/pump-it-up-data-mining-the-water-table
    Explore at:
    zip(5482324 bytes)Available download formats
    Dataset updated
    Feb 8, 2018
    Authors
    Dylan
    Description
  16. Product data mining: entity classification&linking

    • kaggle.com
    zip
    Updated Jul 13, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    zzhang (2020). Product data mining: entity classification&linking [Dataset]. https://www.kaggle.com/ziqizhang/product-data-miningentity-classificationlinking
    Explore at:
    zip(10933 bytes)Available download formats
    Dataset updated
    Jul 13, 2020
    Authors
    zzhang
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    IMPORTANT: Round 1 results are now released, check our website for the leaderboard. We now open Round 2 submissions!

    1. Overview

    We release two datasets that are part of the the Semantic Web Challenge on Mining the Web of HTML-embedded Product Data is co-located with the 19th International Semantic Web Conference (https://iswc2020.semanticweb.org/, 2-6 Nov 2020 at Athens, Greece). The datasets belong to two shared tasks related to product data mining on the Web: (1) product matching (linking) and (2) product classification. This event is organised by The University of Sheffield, The University of Mannheim and Amazon, and is open to anyone. Systems successfully beating the baseline of the respective task, will be invited to write a paper describing their method and system and present the method as a poster (and potentially also a short talk) at the ISWC2020 conference. Winners of each task will be awarded 500 euro as prize (partly sponsored by Peak Indicators, https://www.peakindicators.com/).

    2. Task and dataset brief

    The challenge organises two tasks, product matching and product categorisation.

    i) Product Matching deals with identifying product offers on different websites that refer to the same real-world product (e.g., the same iPhone X model offered using different names/offer titles as well as different descriptions on various websites). A multi-million product offer corpus (16M) containing product offer clusters is released for the generation of training data. A validation set containing 1.1K offer pairs and a test set of 600 offer pairs will also be released. The goal of this task is to classify if the offer pairs in these datasets are match (i.e., referring to the same product) or non-match.

    ii) Product classification deals with assigning predefined product category labels (which can be multiple levels) to product instances (e.g., iPhone X is a ‘SmartPhone’, and also ‘Electronics’). A training dataset containing 10K product offers, a validation set of 3K product offers and a test set of 3K product offers will be released. Each dataset contains product offers with their metadata (e.g., name, description, URL) and three classification labels each corresponding to a level in the GS1 Global Product Classification taxonomy. The goal is to classify these product offers into the pre-defined category labels.

    All datasets are built based on structured data that was extracted from the Common Crawl (https://commoncrawl.org/) by the Web Data Commons project (http://webdatacommons.org/). Datasets can be found at: https://ir-ischool-uos.github.io/mwpd/

    3. Resources and tools

    The challenge will also release utility code (in Python) for processing the above datasets and scoring the system outputs. In addition, the following language resources for product-related data mining tasks: A text corpus of 150 million product offer descriptions Word embeddings trained on the above corpus

    4. Challenge website

    For details of the challenge please visit https://ir-ischool-uos.github.io/mwpd/

    5. Organizing committee

    Dr Ziqi Zhang (Information School, The University of Sheffield) Prof. Christian Bizer (Institute of Computer Science and Business Informatics, The Mannheim University) Dr Haiping Lu (Department of Computer Science, The University of Sheffield) Dr Jun Ma (Amazon Inc. Seattle, US) Prof. Paul Clough (Information School, The University of Sheffield & Peak Indicators) Ms Anna Primpeli (Institute of Computer Science and Business Informatics, The Mannheim University) Mr Ralph Peeters (Institute of Computer Science and Business Informatics, The Mannheim University) Mr. Abdulkareem Alqusair (Information School, The University of Sheffield)

    6. Contact

    To contact the organising committee please use the Google discussion group https://groups.google.com/forum/#!forum/mwpd2020

  17. Modified Big SalesData

    • kaggle.com
    zip
    Updated Feb 26, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Shubhra Rana (2025). Modified Big SalesData [Dataset]. https://www.kaggle.com/datasets/shubhrarana/modified-big-salesdata
    Explore at:
    zip(2360624 bytes)Available download formats
    Dataset updated
    Feb 26, 2025
    Authors
    Shubhra Rana
    License

    http://www.gnu.org/licenses/fdl-1.3.htmlhttp://www.gnu.org/licenses/fdl-1.3.html

    Description

    Modified and Cleaned data set from https://www.kaggle.com/datasets/pigment/big-sales-data.

    This can be used for EDA, Data Analytics, Data Mining and Visualizations.

    Will be uploading two more versions shortly.

    1. Denormalized Version of the data to optimize the storage
    2. Consolidated Data from all months together to perform data mining tasks like ARM.
  18. T

    Task Mining Tool Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 10, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Market Report Analytics (2025). Task Mining Tool Report [Dataset]. https://www.marketreportanalytics.com/reports/task-mining-tool-75890
    Explore at:
    pdf, doc, pptAvailable download formats
    Dataset updated
    Apr 10, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policyhttps://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Task Mining Tool market is experiencing robust growth, driven by the increasing need for process optimization and automation across diverse sectors. The market, currently valued at approximately $2 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 25% from 2025 to 2033, reaching an estimated $10 billion by 2033. This significant expansion is fueled by several key factors. Firstly, the rising adoption of digital transformation initiatives across industries like manufacturing, retail, and financial services is creating a surge in demand for tools that can efficiently analyze and improve complex business processes. Secondly, the escalating costs associated with inefficient processes are prompting organizations to seek cost-effective solutions, making task mining a compelling investment. Thirdly, advancements in artificial intelligence (AI) and machine learning (ML) are enhancing the capabilities of task mining tools, improving accuracy and expanding their applications. Finally, the growing availability of cloud-based solutions is further contributing to market expansion by offering flexible deployment options and reducing upfront infrastructure investments. However, certain challenges exist. The complexity of implementing and integrating task mining tools within existing IT infrastructures can present a barrier to adoption, particularly for smaller organizations with limited resources. Furthermore, concerns regarding data privacy and security need to be addressed to ensure widespread acceptance and avoid hindering market growth. Despite these restraints, the long-term outlook for the task mining tool market remains highly positive, with continued technological advancements and a growing awareness of its benefits expected to drive further expansion across all segments, including cloud-based and on-premises solutions, and across all major geographic regions. The strong presence of established players like UiPath and Celonis, coupled with the emergence of innovative startups, indicates a dynamic and competitive market landscape that will further stimulate innovation and adoption.

  19. Data Mining in Systems Health Management - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Mar 31, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    nasa.gov (2025). Data Mining in Systems Health Management - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/data-mining-in-systems-health-management
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASAhttp://nasa.gov/
    Description

    This chapter presents theoretical and practical aspects associated to the implementation of a combined model-based/data-driven approach for failure prognostics based on particle filtering algorithms, in which the current esti- mate of the state PDF is used to determine the operating condition of the system and predict the progression of a fault indicator, given a dynamic state model and a set of process measurements. In this approach, the task of es- timating the current value of the fault indicator, as well as other important changing parameters in the environment, involves two basic steps: the predic- tion step, based on the process model, and an update step, which incorporates the new measurement into the a priori state estimate. This framework allows to estimate of the probability of failure at future time instants (RUL PDF) in real-time, providing information about time-to- failure (TTF) expectations, statistical confidence intervals, long-term predic- tions; using for this purpose empirical knowledge about critical conditions for the system (also referred to as the hazard zones). This information is of paramount significance for the improvement of the system reliability and cost-effective operation of critical assets, as it has been shown in a case study where feedback correction strategies (based on uncertainty measures) have been implemented to lengthen the RUL of a rotorcraft transmission system with propagating fatigue cracks on a critical component. Although the feed- back loop is implemented using simple linear relationships, it is helpful to provide a quick insight into the manner that the system reacts to changes on its input signals, in terms of its predicted RUL. The method is able to manage non-Gaussian pdf’s since it includes concepts such as nonlinear state estimation and confidence intervals in its formulation. Real data from a fault seeded test showed that the proposed framework was able to anticipate modifications on the system input to lengthen its RUL. Results of this test indicate that the method was able to successfully suggest the correction that the system required. In this sense, future work will be focused on the development and testing of similar strategies using different input-output uncertainty metrics.

  20. f

    Sepsis Cases - Event Log

    • figshare.com
    • data.4tu.nl
    txt
    Updated Jun 7, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Felix Mannhardt (2023). Sepsis Cases - Event Log [Dataset]. http://doi.org/10.4121/uuid:915d2bfb-7e84-49ad-a286-dc35f063a460
    Explore at:
    txtAvailable download formats
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Felix Mannhardt
    License

    https://doi.org/10.4121/resource:terms_of_usehttps://doi.org/10.4121/resource:terms_of_use

    Description

    This real-life event log contains events of sepsis cases from a hospital. Sepsis is a life threatening condition typically caused by an infection. One case represents the pathway through the hospital. The events were recorded by the ERP (Enterprise Resource Planning) system of the hospital. There are about 1000 cases with in total 15,000 events that were recorded for 16 different activities. Moreover, 39 data attributes are recorded, e.g., the group responsible for the activity, the results of tests and information from checklists. Events and attribute values have been anonymized. The time stamps of events have been randomized, but the time between events within a trace has not been altered.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Dashlink (2025). Data Mining in Systems Health Management [Dataset]. https://catalog.data.gov/dataset/data-mining-in-systems-health-management

Data Mining in Systems Health Management

Explore at:
12 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Apr 10, 2025
Dataset provided by
Dashlink
Description

This chapter presents theoretical and practical aspects associated to the implementation of a combined model-based/data-driven approach for failure prognostics based on particle filtering algorithms, in which the current esti- mate of the state PDF is used to determine the operating condition of the system and predict the progression of a fault indicator, given a dynamic state model and a set of process measurements. In this approach, the task of es- timating the current value of the fault indicator, as well as other important changing parameters in the environment, involves two basic steps: the predic- tion step, based on the process model, and an update step, which incorporates the new measurement into the a priori state estimate. This framework allows to estimate of the probability of failure at future time instants (RUL PDF) in real-time, providing information about time-to- failure (TTF) expectations, statistical confidence intervals, long-term predic- tions; using for this purpose empirical knowledge about critical conditions for the system (also referred to as the hazard zones). This information is of paramount significance for the improvement of the system reliability and cost-effective operation of critical assets, as it has been shown in a case study where feedback correction strategies (based on uncertainty measures) have been implemented to lengthen the RUL of a rotorcraft transmission system with propagating fatigue cracks on a critical component. Although the feed- back loop is implemented using simple linear relationships, it is helpful to provide a quick insight into the manner that the system reacts to changes on its input signals, in terms of its predicted RUL. The method is able to manage non-Gaussian pdf’s since it includes concepts such as nonlinear state estimation and confidence intervals in its formulation. Real data from a fault seeded test showed that the proposed framework was able to anticipate modifications on the system input to lengthen its RUL. Results of this test indicate that the method was able to successfully suggest the correction that the system required. In this sense, future work will be focused on the development and testing of similar strategies using different input-output uncertainty metrics.

Search
Clear search
Close search
Google apps
Main menu