100+ datasets found
  1. Table_1_Data Mining Techniques in Analyzing Process Data: A Didactic.pdf

    • frontiersin.figshare.com
    pdf
    Updated Jun 7, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Xin Qiao; Hong Jiao (2023). Table_1_Data Mining Techniques in Analyzing Process Data: A Didactic.pdf [Dataset]. http://doi.org/10.3389/fpsyg.2018.02231.s001
    Explore at:
    pdfAvailable download formats
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    Frontiers Mediahttp://www.frontiersin.org/
    Authors
    Xin Qiao; Hong Jiao
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Due to increasing use of technology-enhanced educational assessment, data mining methods have been explored to analyse process data in log files from such assessment. However, most studies were limited to one data mining technique under one specific scenario. The current study demonstrates the usage of four frequently used supervised techniques, including Classification and Regression Trees (CART), gradient boosting, random forest, support vector machine (SVM), and two unsupervised methods, Self-organizing Map (SOM) and k-means, fitted to one assessment data. The USA sample (N = 426) from the 2012 Program for International Student Assessment (PISA) responding to problem-solving items is extracted to demonstrate the methods. After concrete feature generation and feature selection, classifier development procedures are implemented using the illustrated techniques. Results show satisfactory classification accuracy for all the techniques. Suggestions for the selection of classifiers are presented based on the research questions, the interpretability and the simplicity of the classifiers. Interpretations for the results from both supervised and unsupervised learning methods are provided.

  2. G

    Data Mining Tools Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 4, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Growth Market Reports (2025). Data Mining Tools Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-mining-tools-market
    Explore at:
    pdf, csv, pptxAvailable download formats
    Dataset updated
    Aug 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Mining Tools Market Outlook




    According to our latest research, the global Data Mining Tools market size reached USD 1.93 billion in 2024, reflecting robust industry momentum. The market is expected to grow at a CAGR of 12.7% from 2025 to 2033, reaching a projected value of USD 5.69 billion by 2033. This growth is primarily driven by the increasing adoption of advanced analytics across diverse industries, rapid digital transformation, and the necessity for actionable insights from massive data volumes.




    One of the pivotal growth factors propelling the Data Mining Tools market is the exponential rise in data generation, particularly through digital channels, IoT devices, and enterprise applications. Organizations across sectors are leveraging data mining tools to extract meaningful patterns, trends, and correlations from structured and unstructured data. The need for improved decision-making, operational efficiency, and competitive advantage has made data mining an essential component of modern business strategies. Furthermore, advancements in artificial intelligence and machine learning are enhancing the capabilities of these tools, enabling predictive analytics, anomaly detection, and automation of complex analytical tasks, which further fuels market expansion.




    Another significant driver is the growing demand for customer-centric solutions in industries such as retail, BFSI, and healthcare. Data mining tools are increasingly being used for customer relationship management, targeted marketing, fraud detection, and risk management. By analyzing customer behavior and preferences, organizations can personalize their offerings, optimize marketing campaigns, and mitigate risks. The integration of data mining tools with cloud platforms and big data technologies has also simplified deployment and scalability, making these solutions accessible to small and medium-sized enterprises (SMEs) as well as large organizations. This democratization of advanced analytics is creating new growth avenues for vendors and service providers.




    The regulatory landscape and the increasing emphasis on data privacy and security are also shaping the development and adoption of Data Mining Tools. Compliance with frameworks such as GDPR, HIPAA, and CCPA necessitates robust data governance and transparent analytics processes. Vendors are responding by incorporating features like data masking, encryption, and audit trails into their solutions, thereby enhancing trust and adoption among regulated industries. Additionally, the emergence of industry-specific data mining applications, such as fraud detection in BFSI and predictive diagnostics in healthcare, is expanding the addressable market and fostering innovation.




    From a regional perspective, North America currently dominates the Data Mining Tools market owing to the early adoption of advanced analytics, strong presence of leading technology vendors, and high investments in digital transformation. However, the Asia Pacific region is emerging as a lucrative market, driven by rapid industrialization, expansion of IT infrastructure, and growing awareness of data-driven decision-making in countries like China, India, and Japan. Europe, with its focus on data privacy and digital innovation, also represents a significant market share, while Latin America and the Middle East & Africa are witnessing steady growth as organizations in these regions modernize their operations and adopt cloud-based analytics solutions.





    Component Analysis




    The Component segment of the Data Mining Tools market is bifurcated into Software and Services. Software remains the dominant segment, accounting for the majority of the market share in 2024. This dominance is attributed to the continuous evolution of data mining algorithms, the proliferation of user-friendly graphical interfaces, and the integration of advanced analytics capabilities such as machine learning, artificial intelligence, and natural language pro

  3. Data Mining Project - Boston

    • kaggle.com
    zip
    Updated Nov 25, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    SophieLiu (2019). Data Mining Project - Boston [Dataset]. https://www.kaggle.com/sliu65/data-mining-project-boston
    Explore at:
    zip(59313797 bytes)Available download formats
    Dataset updated
    Nov 25, 2019
    Authors
    SophieLiu
    Area covered
    Boston
    Description

    Context

    To make this a seamless process, I cleaned the data and delete many variables that I thought were not important to our dataset. I then uploaded all of those files to Kaggle for each of you to download. The rideshare_data has both lyft and uber but it is still a cleaned version from the dataset we downloaded from Kaggle.

    Use of Data Files

    You can easily subset the data into the car types that you will be modeling by first loading the csv into R, here is the code for how you do this:

    This loads the file into R

    df<-read.csv('uber.csv')

    The next codes is to subset the data into specific car types. The example below only has Uber 'Black' car types.

    df_black<-subset(uber_df, uber_df$name == 'Black')

    This next portion of code will be to load it into R. First, we must write this dataframe into a csv file on our computer in order to load it into R.

    write.csv(df_black, "nameofthefileyouwanttosaveas.csv")

    The file will appear in you working directory. If you are not familiar with your working directory. Run this code:

    getwd()

    The output will be the file path to your working directory. You will find the file you just created in that folder.

    Inspiration

    Your data will be in front of the world's largest data science community. What questions do you want to see answered?

  4. d

    Data from: Discovering System Health Anomalies using Data Mining Techniques

    • catalog.data.gov
    • s.cnmilf.com
    Updated Apr 10, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Discovering System Health Anomalies using Data Mining Techniques [Dataset]. https://catalog.data.gov/dataset/discovering-system-health-anomalies-using-data-mining-techniques
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    We discuss a statistical framework that underlies envelope detection schemes as well as dynamical models based on Hidden Markov Models (HMM) that can encompass both discrete and continuous sensor measurements for use in Integrated System Health Management (ISHM) applications. The HMM allows for the rapid assimilation, analysis, and discovery of system anomalies. We motivate our work with a discussion of an aviation problem where the identification of anomalous sequences is essential for safety reasons. The data in this application are discrete and continuous sensor measurements and can be dealt with seamlessly using the methods described here to discover anomalous flights. We specifically treat the problem of discovering anomalous features in the time series that may be hidden from the sensor suite and compare those methods to standard envelope detection methods on test data designed to accentuate the differences between the two methods. Identification of these hidden anomalies is crucial to building stable, reusable, and cost-efficient systems. We also discuss a data mining framework for the analysis and discovery of anomalies in high-dimensional time series of sensor measurements that would be found in an ISHM system. We conclude with recommendations that describe the tradeoffs in building an integrated scalable platform for robust anomaly detection in ISHM applications.

  5. Data Mining Tools Market - A Global and Regional Analysis

    • bisresearch.com
    csv, pdf
    Updated Nov 30, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bisresearch (2025). Data Mining Tools Market - A Global and Regional Analysis [Dataset]. https://bisresearch.com/industry-report/global-data-mining-tools-market.html
    Explore at:
    csv, pdfAvailable download formats
    Dataset updated
    Nov 30, 2025
    Dataset authored and provided by
    Bisresearch
    License

    https://bisresearch.com/privacy-policy-cookie-restriction-modehttps://bisresearch.com/privacy-policy-cookie-restriction-mode

    Time period covered
    2023 - 2033
    Area covered
    Worldwide
    Description

    The Data Mining Tools Market is expected to be valued at $1.24 billion in 2024, with an anticipated expansion at a CAGR of 11.63% to reach $3.73 billion by 2034.

  6. w

    Global Web Mining Technology Market Research Report: By Application (Data...

    • wiseguyreports.com
    Updated Sep 15, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). Global Web Mining Technology Market Research Report: By Application (Data Extraction, Web Content Mining, Web Structure Mining, Web Usage Mining), By Deployment Type (On-Premises, Cloud-Based), By Technology (Machine Learning, Natural Language Processing, Artificial Intelligence), By End Use Sector (E-Commerce, Healthcare, Finance, Telecommunications) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/reports/web-mining-technology-market
    Explore at:
    Dataset updated
    Sep 15, 2025
    License

    https://www.wiseguyreports.com/pages/privacy-policyhttps://www.wiseguyreports.com/pages/privacy-policy

    Time period covered
    Sep 25, 2025
    Area covered
    Global
    Description
    BASE YEAR2024
    HISTORICAL DATA2019 - 2023
    REGIONS COVEREDNorth America, Europe, APAC, South America, MEA
    REPORT COVERAGERevenue Forecast, Competitive Landscape, Growth Factors, and Trends
    MARKET SIZE 20242.37(USD Billion)
    MARKET SIZE 20252.6(USD Billion)
    MARKET SIZE 20356.5(USD Billion)
    SEGMENTS COVEREDApplication, Deployment Type, Technology, End Use Sector, Regional
    COUNTRIES COVEREDUS, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
    KEY MARKET DYNAMICSIncreased data generation, Growing demand for analytics, Rising cloud computing adoption, Advancements in AI technologies, Enhanced focus on data security
    MARKET FORECAST UNITSUSD Billion
    KEY COMPANIES PROFILEDIBM, Amazon Web Services, Domo, TIBCO Software, Palantir Technologies, Oracle, MicroStrategy, SAP, Microsoft, Tableau Software, Cloudera, Google, SAS Institute, Alteryx, Qlik, DataRobot
    MARKET FORECAST PERIOD2025 - 2035
    KEY MARKET OPPORTUNITIESIncreased demand for big data analytics, Growth in e-commerce personalization, Rising adoption of AI-driven insights, Enhanced focus on customer experience, Need for competitive intelligence solutions
    COMPOUND ANNUAL GROWTH RATE (CAGR) 9.6% (2025 - 2035)
  7. Data from: Peer-to-Peer Data Mining, Privacy Issues, and Games

    • data.nasa.gov
    • s.cnmilf.com
    • +2more
    Updated Mar 31, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    nasa.gov (2025). Peer-to-Peer Data Mining, Privacy Issues, and Games [Dataset]. https://data.nasa.gov/dataset/peer-to-peer-data-mining-privacy-issues-and-games
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASAhttp://nasa.gov/
    Description

    Peer-to-Peer (P2P) networks are gaining increasing popularity in many distributed applications such as file-sharing, network storage, web caching, sear- ching and indexing of relevant documents and P2P network-threat analysis. Many of these applications require scalable analysis of data over a P2P network. This paper starts by offering a brief overview of distributed data mining applications and algorithms for P2P environments. Next it discusses some of the privacy concerns with P2P data mining and points out the problems of existing privacy-preserving multi-party data mining techniques. It further points out that most of the nice assumptions of these existing privacy preserving techniques fall apart in real-life applications of privacy-preserving distributed data mining (PPDM). The paper offers a more realistic formulation of the PPDM problem as a multi-party game and points out some recent results.

  8. Video-to-Model Data Set

    • figshare.com
    • commons.datacite.org
    xml
    Updated Mar 24, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz (2020). Video-to-Model Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.12026850.v1
    Explore at:
    xmlAvailable download formats
    Dataset updated
    Mar 24, 2020
    Dataset provided by
    Figsharehttp://figshare.com/
    Authors
    Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set belongs to the paper "Video-to-Model: Unsupervised Trace Extraction from Videos for Process Discovery and Conformance Checking in Manual Assembly", submitted on March 24, 2020, to the 18th International Conference on Business Process Management (BPM).Abstract: Manual activities are often hidden deep down in discrete manufacturing processes. For the elicitation and optimization of process behavior, complete information about the execution of Manual activities are required. Thus, an approach is presented on how execution level information can be extracted from videos in manual assembly. The goal is the generation of a log that can be used in state-of-the-art process mining tools. The test bed for the system was lightweight and scalable consisting of an assembly workstation equipped with a single RGB camera recording only the hand movements of the worker from top. A neural network based real-time object classifier was trained to detect the worker’s hands. The hand detector delivers the input for an algorithm, which generates trajectories reflecting the movement paths of the hands. Those trajectories are automatically assigned to work steps using the position of material boxes on the assembly shelf as reference points and hierarchical clustering of similar behaviors with dynamic time warping. The system has been evaluated in a task-based study with ten participants in a laboratory, but under realistic conditions. The generated logs have been loaded into the process mining toolkit ProM to discover the underlying process model and to detect deviations from both, instructions and ground truth, using conformance checking. The results show that process mining delivers insights about the assembly process and the system’s precision.The data set contains the generated and the annotated logs based on the video material gathered during the user study. In addition, the petri nets from the process discovery and conformance checking conducted with ProM (http://www.promtools.org) and the reference nets modeled with Yasper (http://www.yasper.org/) are provided.

  9. l

    LScDC (Leicester Scientific Dictionary-Core)

    • figshare.le.ac.uk
    docx
    Updated Apr 15, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Neslihan Suzen (2020). LScDC (Leicester Scientific Dictionary-Core) [Dataset]. http://doi.org/10.25392/leicester.data.9896579.v3
    Explore at:
    docxAvailable download formats
    Dataset updated
    Apr 15, 2020
    Dataset provided by
    University of Leicester
    Authors
    Neslihan Suzen
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Leicester
    Description

    The LScDC (Leicester Scientific Dictionary-Core Dictionary)April 2020 by Neslihan Suzen, PhD student at the University of Leicester (ns433@leicester.ac.uk/suzenneslihan@hotmail.com)Supervised by Prof Alexander Gorban and Dr Evgeny Mirkes[Version 3] The third version of LScDC (Leicester Scientific Dictionary-Core) is formed using the updated LScD (Leicester Scientific Dictionary) - Version 3*. All steps applied to build the new version of core dictionary are the same as in Version 2** and can be found in description of Version 2 below. We did not repeat the explanation. The files provided with this description are also same as described as for LScDC Version 2. The numbers of words in the 3rd versions of LScD and LScDC are summarized below. # of wordsLScD (v3) 972,060LScDC (v3) 103,998 * Suzen, Neslihan (2019): LScD (Leicester Scientific Dictionary). figshare. Dataset. https://doi.org/10.25392/leicester.data.9746900.v3 ** Suzen, Neslihan (2019): LScDC (Leicester Scientific Dictionary-Core). figshare. Dataset. https://doi.org/10.25392/leicester.data.9896579.v2[Version 2] Getting StartedThis file describes a sorted and cleaned list of words from LScD (Leicester Scientific Dictionary), explains steps for sub-setting the LScD and basic statistics of words in the LSC (Leicester Scientific Corpus), to be found in [1, 2]. The LScDC (Leicester Scientific Dictionary-Core) is a list of words ordered by the number of documents containing the words, and is available in the CSV file published. There are 104,223 unique words (lemmas) in the LScDC. This dictionary is created to be used in future work on the quantification of the sense of research texts. The objective of sub-setting the LScD is to discard words which appear too rarely in the corpus. In text mining algorithms, usage of enormous number of text data brings the challenge to the performance and the accuracy of data mining applications. The performance and the accuracy of models are heavily depend on the type of words (such as stop words and content words) and the number of words in the corpus. Rare occurrence of words in a collection is not useful in discriminating texts in large corpora as rare words are likely to be non-informative signals (or noise) and redundant in the collection of texts. The selection of relevant words also holds out the possibility of more effective and faster operation of text mining algorithms.To build the LScDC, we decided the following process on LScD: removing words that appear in no more than 10 documents (

  10. Pokemon data mining 2020

    • kaggle.com
    zip
    Updated Jul 31, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    AJ Pass (2020). Pokemon data mining 2020 [Dataset]. https://www.kaggle.com/ajpass/pokemon-data-mining-2020
    Explore at:
    zip(20239 bytes)Available download formats
    Dataset updated
    Jul 31, 2020
    Authors
    AJ Pass
    Description

    Context

    This dataset was obtained using a web scrapper made in this notebook as learning purposes for data mining and web scrapping:

    https://www.kaggle.com/ajpass/data-mining-web-scrapper-vol-1-pokedex

    Content

    Inside this dataset are the diferrent generations of pokemons with all their stats.

    Acknowledgements

    This dataset was come from the knowledge I learned following a tutorial a year ago and because I couldn't find it I made a version with what I remembered.

  11. r

    Data from: Time in market: using data mining technologies to measure product...

    • resodate.org
    Updated Sep 13, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Erik Poppe (2022). Time in market: using data mining technologies to measure product lifecycles [Dataset]. http://doi.org/10.14279/depositonce-16226
    Explore at:
    Dataset updated
    Sep 13, 2022
    Dataset provided by
    Technische Universität Berlin
    DepositOnce
    Authors
    Erik Poppe
    Description

    Time in Market (TIM) is a metric to describe the time period of a product from its market entry to its decline and disappearance from the market. The concept is often used implicit to describe the acceleration of product life cycles, innovation cycles and is an essential part of the product life cycle concept. It can be assumed that time in markets is an important indicator for manufacturers and marketers to plan and evaluate their market success. Moreover, time in markets are necessary to measure the speed of product life cycles and their implication for the general development of product lifetime. This article’s major contributions are to presenting (1) time in markets as a highly relevant concept for the assessment of product life cycles, although the indicator has received little attention so far, (2) explaining an automated internet-based data mining approach to gather semi-structured product data from 5 German internet shops for electronic consumer goods and (3) presenting initial insights for a period of a half to one year on market data for smartphones. It will turn out that longer periods of time are needed to obtain significant data on time in markets, nevertheless initial results show a high product rollover rate of 40-45% within one year and present a time in market below 100 days for at least 16% of the captured products. Due to the current state of work, this article is addressed to researchers already engaged in data mining or interested in the application of it.

  12. e

    Data Warehousing and Data Mining (PEC-IT602B), 6th Semester, Information...

    • paper.erudition.co.in
    html
    Updated Aug 12, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Einetic (2025). Data Warehousing and Data Mining (PEC-IT602B), 6th Semester, Information Technology, MAKAUT | Erudition Paper [Dataset]. https://paper.erudition.co.in/makaut/btech-in-information-technology/6/data-warehousing-and-data-mining
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Aug 12, 2021
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of Data Warehousing and Data Mining (PEC-IT602B),6th Semester,Information Technology,Maulana Abul Kalam Azad University of Technology

  13. e

    mining-technology.com Traffic Analytics Data

    • analytics.explodingtopics.com
    Updated Sep 1, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). mining-technology.com Traffic Analytics Data [Dataset]. https://analytics.explodingtopics.com/website/mining-technology.com
    Explore at:
    Dataset updated
    Sep 1, 2025
    Variables measured
    Global Rank, Monthly Visits, Authority Score, US Country Rank, Metals & Mining Category Rank
    Description

    Traffic analytics, rankings, and competitive metrics for mining-technology.com as of September 2025

  14. G

    Privacy‑Preserving Data Mining Tools Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 29, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Growth Market Reports (2025). Privacy‑Preserving Data Mining Tools Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/privacypreserving-data-mining-tools-market
    Explore at:
    pptx, csv, pdfAvailable download formats
    Dataset updated
    Aug 29, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Privacy?Preserving Data Mining Tools Market Outlook



    According to our latest research, the global Privacy?Preserving Data Mining Tools market size reached USD 1.42 billion in 2024, reflecting robust adoption across diverse industries. The market is expected to exhibit a CAGR of 22.8% during the forecast period, propelling the market to USD 10.98 billion by 2033. This remarkable growth is driven by the increasing need for secure data analytics, stringent data protection regulations, and the rising frequency of data breaches, all of which are pushing organizations to adopt advanced privacy solutions.



    One of the primary growth factors for the Privacy?Preserving Data Mining Tools market is the exponential rise in data generation and the parallel escalation of privacy concerns. As organizations collect vast amounts of sensitive information, especially in sectors like healthcare and BFSI, the risk of data exposure and misuse grows. Governments worldwide are enacting stricter data protection laws, such as the GDPR in Europe and CCPA in California, compelling enterprises to integrate privacy?preserving technologies into their analytics workflows. These regulations not only mandate compliance but also foster consumer trust, making privacy?preserving data mining tools a strategic investment for businesses aiming to maintain a competitive edge while safeguarding user data.



    Another significant driver is the rapid digital transformation across industries, which necessitates the extraction of actionable insights from large, distributed data sets without compromising privacy. Privacy?preserving techniques, such as federated learning, homomorphic encryption, and differential privacy, are gaining traction as they allow organizations to collaborate and analyze data securely. The advent of cloud computing and the proliferation of connected devices further amplify the demand for scalable and secure data mining solutions. As enterprises embrace cloud-based analytics, the need for robust privacy-preserving mechanisms becomes paramount, fueling the adoption of advanced tools that can operate seamlessly in both on-premises and cloud environments.



    Moreover, the increasing sophistication of cyber threats and the growing awareness of the potential reputational and financial damage caused by data breaches are prompting organizations to prioritize data privacy. High-profile security incidents have underscored the vulnerabilities inherent in traditional data mining approaches, accelerating the shift towards privacy-preserving alternatives. The integration of artificial intelligence and machine learning with privacy-preserving technologies is also opening new avenues for innovation, enabling more granular and context-aware data analytics. This technological convergence is expected to further catalyze market growth, as organizations seek to harness the full potential of their data assets while maintaining stringent privacy standards.



    Privacy-Preserving Analytics is becoming a cornerstone in the modern data-driven landscape, offering organizations a way to extract valuable insights while maintaining stringent data privacy standards. This approach ensures that sensitive information remains protected even as it is analyzed, allowing businesses to comply with increasing regulatory demands without sacrificing the depth and breadth of their data analysis. By leveraging Privacy-Preserving Analytics, companies can foster greater trust among their customers and stakeholders, knowing that their data is being handled with the utmost care and security. This paradigm shift is not just about compliance; it’s about redefining how organizations approach data analytics in a world where privacy concerns are paramount.



    From a regional perspective, North America currently commands the largest share of the Privacy?Preserving Data Mining Tools market, driven by the presence of leading technology vendors, high awareness levels, and a robust regulatory framework. Europe follows closely, propelled by stringent data privacy laws and increasing investments in secure analytics infrastructure. The Asia Pacific region is witnessing the fastest growth, fueled by rapid digitalization, expanding IT ecosystems, and rising cybersecurity concerns in emerging economies such as China and India. Latin America and the Middle East & Africa are also experiencing steady growth, albeit from

  15. Data from: Enriching time series datasets using Nonparametric kernel...

    • figshare.com
    pdf
    Updated May 31, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mohamad Ivan Fanany (2023). Enriching time series datasets using Nonparametric kernel regression to improve forecasting accuracy [Dataset]. http://doi.org/10.6084/m9.figshare.1609661.v1
    Explore at:
    pdfAvailable download formats
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figsharehttp://figshare.com/
    Authors
    Mohamad Ivan Fanany
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Improving the accuracy of prediction on future values based on the past and current observations has been pursued by enhancing the prediction's methods, combining those methods or performing data pre-processing. In this paper, another approach is taken, namely by increasing the number of input in the dataset. This approach would be useful especially for a shorter time series data. By filling the in-between values in the time series, the number of training set can be increased, thus increasing the generalization capability of the predictor. The algorithm used to make prediction is Neural Network as it is widely used in literature for time series tasks. For comparison, Support Vector Regression is also employed. The dataset used in the experiment is the frequency of USPTO's patents and PubMed's scientific publications on the field of health, namely on Apnea, Arrhythmia, and Sleep Stages. Another time series data designated for NN3 Competition in the field of transportation is also used for benchmarking. The experimental result shows that the prediction performance can be significantly increased by filling in-between data in the time series. Furthermore, the use of detrend and deseasonalization which separates the data into trend, seasonal and stationary time series also improve the prediction performance both on original and filled dataset. The optimal number of increase on the dataset in this experiment is about five times of the length of original dataset.

  16. Grocery Store dataset for data mining

    • kaggle.com
    zip
    Updated Mar 9, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Honey Patel (2021). Grocery Store dataset for data mining [Dataset]. https://www.kaggle.com/honeypatel2158/grocery-store-dataset-for-data-mining
    Explore at:
    zip(7990 bytes)Available download formats
    Dataset updated
    Mar 9, 2021
    Authors
    Honey Patel
    Description

    Dataset

    This dataset was created by Honey Patel

    Contents

  17. Application Research of Clustering on kmeans

    • kaggle.com
    zip
    Updated Feb 27, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    ddpr raju (2021). Application Research of Clustering on kmeans [Dataset]. https://www.kaggle.com/ddprraju/tirupati-compus-school
    Explore at:
    zip(34507 bytes)Available download formats
    Dataset updated
    Feb 27, 2021
    Authors
    ddpr raju
    Description

    Dataset

    This dataset was created by ddpr raju

    Contents

  18. S

    Smart Mining Technology Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated May 16, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Data Insights Market (2025). Smart Mining Technology Report [Dataset]. https://www.datainsightsmarket.com/reports/smart-mining-technology-1980353
    Explore at:
    ppt, doc, pdfAvailable download formats
    Dataset updated
    May 16, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policyhttps://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The smart mining technology market, valued at $2760 million in 2025, is poised for robust growth, exhibiting a Compound Annual Growth Rate (CAGR) of 6.7% from 2025 to 2033. This expansion is driven by several key factors. Increasing demand for enhanced operational efficiency and safety within mining operations is a primary catalyst. The integration of Artificial Intelligence (AI) and Machine Learning (ML) for predictive maintenance, resource optimization, and autonomous equipment control significantly improves productivity and reduces downtime. Furthermore, the growing focus on environmental sustainability and stricter emission regulations are propelling the adoption of emissions management software and blockchain-based solutions for transparent and traceable metal trading. The market's segmentation reveals strong demand across applications like risk & compliance management, mining operations & process control, and data warehousing. AI/ML-enabled supply chain management and mining analytics platforms are the leading types driving market growth. North America and Europe currently hold significant market shares, fueled by early adoption of advanced technologies and robust regulatory frameworks. However, the Asia-Pacific region is expected to witness substantial growth in the coming years, driven by rapid industrialization, increasing mining activities, and government initiatives promoting technological advancements in the sector. Key players like Rockwell Automation, Caterpillar Inc., and Komatsu Ltd are leading the innovation and market penetration, while emerging companies specializing in AI, blockchain, and data analytics are also contributing significantly. The market's sustained growth will be influenced by factors like technological advancements, fluctuating commodity prices, and the overall economic climate. However, the high initial investment costs associated with implementing smart mining technologies and the need for skilled workforce training might pose challenges to wider adoption, especially in developing economies.

  19. e

    Mining Data Streams

    • paper.erudition.co.in
    html
    Updated Aug 12, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Einetic (2021). Mining Data Streams [Dataset]. https://paper.erudition.co.in/makaut/btech-in-information-technology/6/data-warehousing-and-data-mining
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Aug 12, 2021
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of chapter Mining Data Streams of Data Warehousing and Data Mining, 6th Semester , Information Technology

  20. 4

    Production Analysis with Process Mining Technology

    • data.4tu.nl
    • figshare.com
    zip
    Updated Jan 28, 2014
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dafna Levy (2014). Production Analysis with Process Mining Technology [Dataset]. http://doi.org/10.4121/uuid:68726926-5ac5-4fab-b873-ee76ea412399
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jan 28, 2014
    Dataset provided by
    NooL - Integrating People & Solutions
    Authors
    Dafna Levy
    License

    https://doi.org/10.4121/resource:terms_of_usehttps://doi.org/10.4121/resource:terms_of_use

    Description

    The comma separated value dataset contains process data from a production process, including data on cases, activities, resources, timestamps and more data fields.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Xin Qiao; Hong Jiao (2023). Table_1_Data Mining Techniques in Analyzing Process Data: A Didactic.pdf [Dataset]. http://doi.org/10.3389/fpsyg.2018.02231.s001
Organization logo

Table_1_Data Mining Techniques in Analyzing Process Data: A Didactic.pdf

Related Article
Explore at:
pdfAvailable download formats
Dataset updated
Jun 7, 2023
Dataset provided by
Frontiers Mediahttp://www.frontiersin.org/
Authors
Xin Qiao; Hong Jiao
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Due to increasing use of technology-enhanced educational assessment, data mining methods have been explored to analyse process data in log files from such assessment. However, most studies were limited to one data mining technique under one specific scenario. The current study demonstrates the usage of four frequently used supervised techniques, including Classification and Regression Trees (CART), gradient boosting, random forest, support vector machine (SVM), and two unsupervised methods, Self-organizing Map (SOM) and k-means, fitted to one assessment data. The USA sample (N = 426) from the 2012 Program for International Student Assessment (PISA) responding to problem-solving items is extracted to demonstrate the methods. After concrete feature generation and feature selection, classifier development procedures are implemented using the illustrated techniques. Results show satisfactory classification accuracy for all the techniques. Suggestions for the selection of classifiers are presented based on the research questions, the interpretability and the simplicity of the classifiers. Interpretations for the results from both supervised and unsupervised learning methods are provided.

Search
Clear search
Close search
Google apps
Main menu