100+ datasets found
  1. W

    Data from: Multi-objective optimization based privacy preserving distributed...

    • cloud.csiss.gmu.edu
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +2more
    pdf
    Updated Jan 29, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    United States (2020). Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks [Dataset]. https://cloud.csiss.gmu.edu/uddi/dataset/multi-objective-optimization-based-privacy-preserving-distributed-data-mining-in-peer-to-p
    Explore at:
    pdfAvailable download formats
    Dataset updated
    Jan 29, 2020
    Dataset provided by
    United States
    Description

    This paper proposes a scalable, local privacy preserving algorithm for distributed Peer-to-Peer (P2P) data aggregation useful for many advanced data mining/analysis tasks such as average/sum computation, decision tree induction, feature selection, and more. Unlike most multi-party privacy-preserving data mining algorithms, this approach works in an asynchronous manner through local interactions and it is highly scalable. It particularly deals with the distributed computation of the sum of a set of numbers stored at different peers in a P2P network in the context of a P2P web mining application. The proposed optimization based privacy-preserving technique for computing the sum allows different peers to specify different privacy requirements without having to adhere to a global set of parameters for the chosen privacy model. Since distributed sum computation is a frequently used primitive, the proposed approach is likely to have significant impact on many data mining tasks such as multi-party privacy-preserving clustering, frequent itemset mining, and statistical aggregate computation.

  2. Data Mining Software Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dataintelo (2025). Data Mining Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/data-mining-software-market
    Explore at:
    pdf, pptx, csvAvailable download formats
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policyhttps://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Mining Software Market Outlook



    The global data mining software market size was valued at USD 7.2 billion in 2023 and is projected to reach USD 15.5 billion by 2032, growing at a compound annual growth rate (CAGR) of 8.7% during the forecast period. This growth is driven primarily by the increasing adoption of big data analytics and the rising demand for business intelligence across various industries. As businesses increasingly recognize the value of data-driven decision-making, the market is expected to witness substantial growth.



    One of the significant growth factors for the data mining software market is the exponential increase in data generation. With the proliferation of internet-enabled devices and the rapid advancement of technologies such as the Internet of Things (IoT), there is a massive influx of data. Organizations are now more focused than ever on harnessing this data to gain insights, improve operations, and create a competitive advantage. This has led to a surge in demand for advanced data mining tools that can process and analyze large datasets efficiently.



    Another driving force is the growing need for personalized customer experiences. In industries such as retail, healthcare, and BFSI, understanding customer behavior and preferences is crucial. Data mining software enables organizations to analyze customer data, segment their audience, and deliver personalized offerings, ultimately enhancing customer satisfaction and loyalty. This drive towards personalization is further fueling the adoption of data mining solutions, contributing significantly to market growth.



    The integration of artificial intelligence (AI) and machine learning (ML) technologies with data mining software is also a key growth factor. These advanced technologies enhance the capabilities of data mining tools by enabling them to learn from data patterns and make more accurate predictions. The convergence of AI and data mining is opening new avenues for businesses, allowing them to automate complex tasks, predict market trends, and make informed decisions more swiftly. The continuous advancements in AI and ML are expected to propel the data mining software market over the forecast period.



    Regionally, North America holds a significant share of the data mining software market, driven by the presence of major technology companies and the early adoption of advanced analytics solutions. The Asia Pacific region is also expected to witness substantial growth due to the rapid digital transformation across various industries and the increasing investments in data infrastructure. Additionally, the growing awareness and implementation of data-driven strategies in emerging economies are contributing to the market expansion in this region.



    Text Mining Software is becoming an integral part of the data mining landscape, offering unique capabilities to analyze unstructured data. As organizations generate vast amounts of textual data from various sources such as social media, emails, and customer feedback, the need for specialized tools to extract meaningful insights is growing. Text Mining Software enables businesses to process and analyze this data, uncovering patterns and trends that were previously hidden. This capability is particularly valuable in industries like marketing, customer service, and research, where understanding the nuances of language can lead to more informed decision-making. The integration of text mining with traditional data mining processes is enhancing the overall analytical capabilities of organizations, allowing them to derive comprehensive insights from both structured and unstructured data.



    Component Analysis



    The data mining software market is segmented by components, which primarily include software and services. The software segment encompasses various types of data mining tools that are used for analyzing and extracting valuable insights from raw data. These tools are designed to handle large volumes of data and provide advanced functionalities such as predictive analytics, data visualization, and pattern recognition. The increasing demand for sophisticated data analysis tools is driving the growth of the software segment. Enterprises are investing in these tools to enhance their data processing capabilities and derive actionable insights.



    Within the software segment, the emergence of cloud-based data mining solutions is a notable trend. Cloud-based solutions offer several advantages, including s

  3. f

    Data from: Results obtained in a data mining process applied to a database...

    • scielo.figshare.com
    jpeg
    Updated Jun 4, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    E.M. Ruiz Lobaina; C. P. Romero Suárez (2023). Results obtained in a data mining process applied to a database containing bibliographic information concerning four segments of science. [Dataset]. http://doi.org/10.6084/m9.figshare.20011798.v1
    Explore at:
    jpegAvailable download formats
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    SciELO journals
    Authors
    E.M. Ruiz Lobaina; C. P. Romero Suárez
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract The objective of this work is to improve the quality of the information that belongs to the database CubaCiencia, of the Institute of Scientific and Technological Information. This database has bibliographic information referring to four segments of science and is the main database of the Library Management System. The applied methodology was based on the Decision Trees, the Correlation Matrix, the 3D Scatter Plot, etc., which are techniques used by data mining, for the study of large volumes of information. The results achieved not only made it possible to improve the information in the database, but also provided truly useful patterns in the solution of the proposed objectives.

  4. c

    Discovering Anomalous Aviation Safety Events Using Scalable Data Mining...

    • s.cnmilf.com
    • cloud.csiss.gmu.edu
    • +6more
    Updated Apr 10, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Discovering Anomalous Aviation Safety Events Using Scalable Data Mining Algorithms [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/discovering-anomalous-aviation-safety-events-using-scalable-data-mining-algorithms
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    The worldwide civilian aviation system is one of the most complex dynamical systems created. Most modern commercial aircraft have onboard flight data recorders that record several hundred discrete and continuous parameters at approximately 1Hz for the entire duration of the flight. These data contain information about the flight control systems, actuators, engines, landing gear, avionics, and pilot commands. In this paper, recent advances in the development of a novel knowledge discovery process consisting of a suite of data mining techniques for identifying precursors to aviation safety incidents are discussed. The data mining techniques include scalable multiple-kernel learning for large-scale distributed anomaly detection. A novel multivariate time-series search algorithm is used to search for signatures of discovered anomalies on massive datasets. The process can identify operationally significant events due to environmental, mechanical, and human factors issues in the high-dimensional flight operations quality assurance data. All discovered anomalies are validated by a team of independent _domain experts. This novel automated knowledge discovery process is aimed at complementing the state-of-the-art human-generated exceedance-based analysis that fails to discover previously unknown aviation safety incidents. In this paper, the discovery pipeline, the methods used, and some of the significant anomalies detected on real-world commercial aviation data are discussed.

  5. d

    Data Mining in Systems Health Management

    • catalog.data.gov
    • data.wu.ac.at
    Updated Apr 10, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Data Mining in Systems Health Management [Dataset]. https://catalog.data.gov/dataset/data-mining-in-systems-health-management
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    This chapter presents theoretical and practical aspects associated to the implementation of a combined model-based/data-driven approach for failure prognostics based on particle filtering algorithms, in which the current esti- mate of the state PDF is used to determine the operating condition of the system and predict the progression of a fault indicator, given a dynamic state model and a set of process measurements. In this approach, the task of es- timating the current value of the fault indicator, as well as other important changing parameters in the environment, involves two basic steps: the predic- tion step, based on the process model, and an update step, which incorporates the new measurement into the a priori state estimate. This framework allows to estimate of the probability of failure at future time instants (RUL PDF) in real-time, providing information about time-to- failure (TTF) expectations, statistical confidence intervals, long-term predic- tions; using for this purpose empirical knowledge about critical conditions for the system (also referred to as the hazard zones). This information is of paramount significance for the improvement of the system reliability and cost-effective operation of critical assets, as it has been shown in a case study where feedback correction strategies (based on uncertainty measures) have been implemented to lengthen the RUL of a rotorcraft transmission system with propagating fatigue cracks on a critical component. Although the feed- back loop is implemented using simple linear relationships, it is helpful to provide a quick insight into the manner that the system reacts to changes on its input signals, in terms of its predicted RUL. The method is able to manage non-Gaussian pdf’s since it includes concepts such as nonlinear state estimation and confidence intervals in its formulation. Real data from a fault seeded test showed that the proposed framework was able to anticipate modifications on the system input to lengthen its RUL. Results of this test indicate that the method was able to successfully suggest the correction that the system required. In this sense, future work will be focused on the development and testing of similar strategies using different input-output uncertainty metrics.

  6. f

    Video-to-Model Data Set

    • figshare.com
    • commons.datacite.org
    xml
    Updated Mar 24, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz (2020). Video-to-Model Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.12026850.v1
    Explore at:
    xmlAvailable download formats
    Dataset updated
    Mar 24, 2020
    Dataset provided by
    figshare
    Authors
    Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set belongs to the paper "Video-to-Model: Unsupervised Trace Extraction from Videos for Process Discovery and Conformance Checking in Manual Assembly", submitted on March 24, 2020, to the 18th International Conference on Business Process Management (BPM).Abstract: Manual activities are often hidden deep down in discrete manufacturing processes. For the elicitation and optimization of process behavior, complete information about the execution of Manual activities are required. Thus, an approach is presented on how execution level information can be extracted from videos in manual assembly. The goal is the generation of a log that can be used in state-of-the-art process mining tools. The test bed for the system was lightweight and scalable consisting of an assembly workstation equipped with a single RGB camera recording only the hand movements of the worker from top. A neural network based real-time object classifier was trained to detect the worker’s hands. The hand detector delivers the input for an algorithm, which generates trajectories reflecting the movement paths of the hands. Those trajectories are automatically assigned to work steps using the position of material boxes on the assembly shelf as reference points and hierarchical clustering of similar behaviors with dynamic time warping. The system has been evaluated in a task-based study with ten participants in a laboratory, but under realistic conditions. The generated logs have been loaded into the process mining toolkit ProM to discover the underlying process model and to detect deviations from both, instructions and ground truth, using conformance checking. The results show that process mining delivers insights about the assembly process and the system’s precision.The data set contains the generated and the annotated logs based on the video material gathered during the user study. In addition, the petri nets from the process discovery and conformance checking conducted with ProM (http://www.promtools.org) and the reference nets modeled with Yasper (http://www.yasper.org/) are provided.

  7. d

    Distributed Data Mining in Peer-to-Peer Networks

    • catalog.data.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +1more
    Updated Apr 11, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Distributed Data Mining in Peer-to-Peer Networks [Dataset]. https://catalog.data.gov/dataset/distributed-data-mining-in-peer-to-peer-networks
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    Peer-to-peer (P2P) networks are gaining popularity in many applications such as file sharing, e-commerce, and social networking, many of which deal with rich, distributed data sources that can benefit from data mining. P2P networks are, in fact,well-suited to distributed data mining (DDM), which deals with the problem of data analysis in environments with distributed data,computing nodes,and users. This article offers an overview of DDM applications and algorithms for P2P environments,focusing particularly on local algorithms that perform data analysis by using computing primitives with limited communication overhead. The authors describe both exact and approximate local P2P data mining algorithms that work in a decentralized and communication-efficient manner.

  8. Data Mining Tools Market Size, Share, Trend Analysis by 2033

    • emergenresearch.com
    pdf,excel,csv,ppt
    Updated Dec 8, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Emergen Research (2024). Data Mining Tools Market Size, Share, Trend Analysis by 2033 [Dataset]. https://www.emergenresearch.com/industry-report/data-mining-tools-market
    Explore at:
    pdf,excel,csv,pptAvailable download formats
    Dataset updated
    Dec 8, 2024
    Dataset authored and provided by
    Emergen Research
    License

    https://www.emergenresearch.com/privacy-policyhttps://www.emergenresearch.com/privacy-policy

    Area covered
    Global
    Variables measured
    Base Year, No. of Pages, Growth Drivers, Forecast Period, Segments covered, Historical Data for, Pitfalls Challenges, 2033 Value Projection, Tables, Charts, and Figures, Forecast Period 2024 - 2033 CAGR, and 1 more
    Description

    The Data Mining Tools Market size is expected to reach a valuation of USD 3.33 billion in 2033 growing at a CAGR of 12.50%. The Data Mining Tools market research report classifies market by share, trend, demand, forecast and based on segmentation.

  9. r

    A predictive model for opal exploration in Australia from a data mining...

    • researchdata.edu.au
    Updated May 1, 2015
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Thomas Landgrebe; Thomas Landgrebe; Adriana Dutkiewicz; Dietmar Muller (2015). A predictive model for opal exploration in Australia from a data mining approach [Dataset]. http://doi.org/10.4227/11/5587A86C0FDF1
    Explore at:
    Dataset updated
    May 1, 2015
    Dataset provided by
    The University of Sydney
    Authors
    Thomas Landgrebe; Thomas Landgrebe; Adriana Dutkiewicz; Dietmar Muller
    License

    Attribution 3.0 (CC BY 3.0)https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Area covered
    Dataset funded by
    Australian Research Council
    Description

    This data collection is associated with the publications: Merdith, A. S., Landgrebe, T. C. W., Dutkiewicz, A., & Müller, R. D. (2013). Towards a predictive model for opal exploration using a spatio-temporal data mining approach. Australian Journal of Earth Sciences, 60(2), 217-229. doi: 10.1080/08120099.2012.754793

    and

    Landgrebe, T. C. W., Merdith, A., Dutkiewicz, A., & Müller, R. D. (2013). Relationships between palaeogeography and opal occurrence in Australia: A data-mining approach. Computers & Geosciences, 56(0), 76-82. doi: 10.1016/j.cageo.2013.02.002

    Publication Abstract - Merdith et al. (2013)

    Opal is Australia's national gemstone, however most significant opal discoveries were made in the early 1900's - more than 100 years ago - until recently. Currently there is no formal exploration model for opal, meaning there are no widely accepted concepts or methodologies available to suggest where new opal fields may be found. As a consequence opal mining in Australia is a cottage industry with the majority of opal exploration focused around old opal fields. The EarthByte Group has developed a new opal exploration methodology for the Great Artesian Basin. The work is based on the concept of applying “big data mining” approaches to data sets relevant for identifying regions that are prospective for opal. The group combined a multitude of geological and geophysical data sets that were jointly analysed to establish associations between particular features in the data with known opal mining sites. A “training set” of known opal localities (1036 opal mines) was assembled, using those localities, which were featured in published reports and on maps. The data used include rock types, soil type, regolith type, topography, radiometric data and a stack of digital palaeogeographic maps. The different data layers were analysed via spatio-temporal data mining combining the GPlates PaleoGIS software (www.gplates.org) with the Orange data mining software (orange.biolab.si) to produce the first opal prospectivity map for the Great Artesian Basin. One of the main results of the study is that the geological conditions favourable for opal were found to be related to a particular sequence of surface environments over geological time. These conditions involved alternating shallow seas and river systems followed by uplift and erosion. The approach reduces the entire area of the Great Artesian Basin to a mere 6% that is deemed to be prospective for opal exploration. The work is described in two companion papers in the Australian Journal of Earth Sciences and Computers and Geosciences.

    Publication Abstract - Landgrebe et al. (2013)

    Age-coded multi-layered geological datasets are becoming increasingly prevalent with the surge in open-access geodata, yet there are few methodologies for extracting geological information and knowledge from these data. We present a novel methodology, based on the open-source GPlates software in which age-coded digital palaeogeographic maps are used to “data-mine” spatio-temporal patterns related to the occurrence of Australian opal. Our aim is to test the concept that only a particular sequence of depositional/erosional environments may lead to conditions suitable for the formation of gem quality sedimentary opal. Time-varying geographic environment properties are extracted from a digital palaeogeographic dataset of the eastern Australian Great Artesian Basin (GAB) at 1036 opal localities. We obtain a total of 52 independent ordinal sequences sampling 19 time slices from the Early Cretaceous to the present-day. We find that 95% of the known opal deposits are tied to only 27 sequences all comprising fluvial and shallow marine depositional sequences followed by a prolonged phase of erosion. We then map the total area of the GAB that matches these 27 opal-specific sequences, resulting in an opal-prospective region of only about 10% of the total area of the basin. The key patterns underlying this association involve only a small number of key environmental transitions. We demonstrate that these key associations are generally absent at arbitrary locations in the basin. This new methodology allows for the simplification of a complex time-varying geological dataset into a single map view, enabling straightforward application for opal exploration and for future co-assessment with other datasets/geological criteria. This approach may help unravel the poorly understood opal formation process using an empirical spatio-temporal data-mining methodology and readily available datasets to aid hypothesis testing.

    Authors and Institutions

    Andrew Merdith - EarthByte Research Group, School of Geosciences, The University of Sydney, Australia. ORCID: 0000-0002-7564-8149

    Thomas Landgrebe - EarthByte Research Group, School of Geosciences, The University of Sydney, Australia

    Adriana Dutkiewicz - EarthByte Research Group, School of Geosciences, The University of Sydney, Australia

    R. Dietmar Müller - EarthByte Research Group, School of Geosciences, The University of Sydney, Australia. ORCID: 0000-0002-3334-5764

    Overview of Resources Contained

    This collection contains geological data from Australia used for data mining in the publications Merdith et al. (2013) and Landgrebe et al. (2013). The resulting maps of opal prospectivity are also included.

    List of Resources

    Note: For details on the files included in this data collection, see “Description_of_Resources.txt”.

    Note: For information on file formats and what programs to use to interact with various file formats, see “File_Formats_and_Recommended_Programs.txt”.

    • Map of Barfield region, Australia (.jpg, 270 KB)
    • Map overviewing the Great Artesian basins and main opal mining camps (.png, 82 KB)
    • Maps showing opal prospectivity data mining results for different geological datasets (.tif, 23.1 MB)
    • Map of opal prospectivity from palaeogeography data mining (.pdf, 2.6 MB)
    • Raster of palaeogeography target regions for viewing in Google Earth (.jpg, 418 KB)
    • Opal mine locations (.gpml, .txt, .kmz, .shp, total 15.6 MB)
    • Map of opal prospectivity from all data mining results as a Google Earth overlay (.kmz, 12 KB)
    • Map of probability of opal occurrence in prospective regions from all data mining results (.tif, 5.9 MB)
    • Paleogeography of Australia (.gpml, .txt, .shp, total 114.2 MB)
    • Radiometric data showing potassium concentration contrasts (.tif, .kmz, total 311.3 MB)
    • Regolith data (.gpml, .txt, .kml, .shp, total 7.1 MB)
    • Soil type data (.gpml, .txt, .kml, .shp, total 7.1 MB)

    For more information on this data collection, and links to other datasets from the EarthByte Research Group please visit EarthByte

    For more information about using GPlates, including tutorials and a user manual please visit GPlates or EarthByte

  10. m

    Educational Attainment in North Carolina Public Schools: Use of statistical...

    • data.mendeley.com
    Updated Nov 14, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Scott Herford (2018). Educational Attainment in North Carolina Public Schools: Use of statistical modeling, data mining techniques, and machine learning algorithms to explore 2014-2017 North Carolina Public School datasets. [Dataset]. http://doi.org/10.17632/6cm9wyd5g5.1
    Explore at:
    Dataset updated
    Nov 14, 2018
    Authors
    Scott Herford
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    North Carolina
    Description

    The purpose of data mining analysis is always to find patterns of the data using certain kind of techiques such as classification or regression. It is not always feasible to apply classification algorithms directly to dataset. Before doing any work on the data, the data has to be pre-processed and this process normally involves feature selection and dimensionality reduction. We tried to use clustering as a way to reduce the dimension of the data and create new features. Based on our project, after using clustering prior to classification, the performance has not improved much. The reason why it has not improved could be the features we selected to perform clustering are not well suited for it. Because of the nature of the data, classification tasks are going to provide more information to work with in terms of improving knowledge and overall performance metrics. From the dimensionality reduction perspective: It is different from Principle Component Analysis which guarantees finding the best linear transformation that reduces the number of dimensions with a minimum loss of information. Using clusters as a technique of reducing the data dimension will lose a lot of information since clustering techniques are based a metric of 'distance'. At high dimensions euclidean distance loses pretty much all meaning. Therefore using clustering as a "Reducing" dimensionality by mapping data points to cluster numbers is not always good since you may lose almost all the information. From the creating new features perspective: Clustering analysis creates labels based on the patterns of the data, it brings uncertainties into the data. By using clustering prior to classification, the decision on the number of clusters will highly affect the performance of the clustering, then affect the performance of classification. If the part of features we use clustering techniques on is very suited for it, it might increase the overall performance on classification. For example, if the features we use k-means on are numerical and the dimension is small, the overall classification performance may be better. We did not lock in the clustering outputs using a random_state in the effort to see if they were stable. Our assumption was that if the results vary highly from run to run which they definitely did, maybe the data just does not cluster well with the methods selected at all. Basically, the ramification we saw was that our results are not much better than random when applying clustering to the data preprocessing. Finally, it is important to ensure a feedback loop is in place to continuously collect the same data in the same format from which the models were created. This feedback loop can be used to measure the model real world effectiveness and also to continue to revise the models from time to time as things change.

  11. Data Mining Tools Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Apr 1, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dataintelo (2024). Data Mining Tools Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-data-mining-tools-market
    Explore at:
    pdf, pptx, csvAvailable download formats
    Dataset updated
    Apr 1, 2024
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policyhttps://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Mining Tools Market Outlook 2032



    The global data mining tools market size was USD 932 Million in 2023 and is projected to reach USD 2,584.7 Million by 2032, expanding at a CAGR of 12% during 2024–2032. The market is fueled by the rising demand for big data analytics across various industries and the increasing need for AI-integrated data mining tools for insightful decision-making.



    Increasing adoption of cloud-based platforms in data mining tools fuels the market. This enhances scalability, flexibility, and cost-efficiency in data handling processes. Major tech companies are launching cloud-based data mining solutions, enabling businesses to analyze vast datasets effectively. This trend reflects the shift toward agile and scalable data analysis methods, meeting the dynamic needs of modern enterprises.





    • In July 2023, Microsoft launched Power Automate Process Mining. This tool, powered by advanced AI, allows companies to gain deep insights into their operations, streamline processes, and foster ongoing improvement through automation and low-code applications, marking a new era in business efficiency and process optimization.







    Rising focus on predictive analytics propels the development of advanced data mining tools capable of forecasting future trends and behaviors. Industries such as finance, healthcare, and retail invest significantly in predictive analytics to gain a competitive edge, driving demand for sophisticated data mining technologies. This trend underscores the strategic importance of foresight in decision-making processes.



    Visual data mining tools are gaining traction in the market, offering intuitive data exploration and interpretation capabilities. These tools enable users to uncover patterns and insights through graphical representations, making data analysis accessible to a broader audience. The launch of user-friendly visual data mining applications marks a significant step toward democratizing data analytics.



    Impact of Artificial Intelligence (

  12. Data Mining Tools Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Jul 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Growth Market Reports (2025). Data Mining Tools Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-mining-tools-market
    Explore at:
    pptx, pdf, csvAvailable download formats
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Mining Tools Market Outlook




    According to our latest research, the global Data Mining Tools market size reached USD 1.93 billion in 2024, reflecting robust industry momentum. The market is expected to grow at a CAGR of 12.7% from 2025 to 2033, reaching a projected value of USD 5.69 billion by 2033. This growth is primarily driven by the increasing adoption of advanced analytics across diverse industries, rapid digital transformation, and the necessity for actionable insights from massive data volumes.




    One of the pivotal growth factors propelling the Data Mining Tools market is the exponential rise in data generation, particularly through digital channels, IoT devices, and enterprise applications. Organizations across sectors are leveraging data mining tools to extract meaningful patterns, trends, and correlations from structured and unstructured data. The need for improved decision-making, operational efficiency, and competitive advantage has made data mining an essential component of modern business strategies. Furthermore, advancements in artificial intelligence and machine learning are enhancing the capabilities of these tools, enabling predictive analytics, anomaly detection, and automation of complex analytical tasks, which further fuels market expansion.




    Another significant driver is the growing demand for customer-centric solutions in industries such as retail, BFSI, and healthcare. Data mining tools are increasingly being used for customer relationship management, targeted marketing, fraud detection, and risk management. By analyzing customer behavior and preferences, organizations can personalize their offerings, optimize marketing campaigns, and mitigate risks. The integration of data mining tools with cloud platforms and big data technologies has also simplified deployment and scalability, making these solutions accessible to small and medium-sized enterprises (SMEs) as well as large organizations. This democratization of advanced analytics is creating new growth avenues for vendors and service providers.




    The regulatory landscape and the increasing emphasis on data privacy and security are also shaping the development and adoption of Data Mining Tools. Compliance with frameworks such as GDPR, HIPAA, and CCPA necessitates robust data governance and transparent analytics processes. Vendors are responding by incorporating features like data masking, encryption, and audit trails into their solutions, thereby enhancing trust and adoption among regulated industries. Additionally, the emergence of industry-specific data mining applications, such as fraud detection in BFSI and predictive diagnostics in healthcare, is expanding the addressable market and fostering innovation.




    From a regional perspective, North America currently dominates the Data Mining Tools market owing to the early adoption of advanced analytics, strong presence of leading technology vendors, and high investments in digital transformation. However, the Asia Pacific region is emerging as a lucrative market, driven by rapid industrialization, expansion of IT infrastructure, and growing awareness of data-driven decision-making in countries like China, India, and Japan. Europe, with its focus on data privacy and digital innovation, also represents a significant market share, while Latin America and the Middle East & Africa are witnessing steady growth as organizations in these regions modernize their operations and adopt cloud-based analytics solutions.





    Component Analysis




    The Component segment of the Data Mining Tools market is bifurcated into Software and Services. Software remains the dominant segment, accounting for the majority of the market share in 2024. This dominance is attributed to the continuous evolution of data mining algorithms, the proliferation of user-friendly graphical interfaces, and the integration of advanced analytics capabilities such as machine learning, artificial intelligence, and natural language pro

  13. S

    Predictive data analysis techniques for higher education students dropout

    • scidb.cn
    Updated Apr 10, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Cindy (2023). Predictive data analysis techniques for higher education students dropout [Dataset]. http://doi.org/10.57760/sciencedb.07894
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 10, 2023
    Dataset provided by
    Science Data Bank
    Authors
    Cindy
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    In this research, we have generated student retention alerts. The alerts are classified into two types: preventive and corrective. This classification varies according to the level of maturity of the data systematization process. Therefore, to systematize the data, data mining techniques have been applied. The experimental analytical method has been used, with a population of 13,715 students with 62 sociological, academic, family, personal, economic, psychological, and institutional variables, and factors such as academic follow-up and performance, financial situation, and personal information. In particular, information is collected on each of the problems or a combination of problems that could affect dropout rates. Following the methodology, the information has been generated through an abstract data model to reflect the profile of the dropout student. As advancement from previous research, this proposal will create preventive and corrective alternatives to avoid dropout higher education. Also, in contrast to previous work, we generated corrective warnings with the application of data mining techniques such as neural networks until reaching a precision of 97% and losses of 0.1052. In conclusion, this study pretends to analyze the behavior of students who drop out the university through the evaluation of predictive patterns. The overall objective is to predict the profile of student dropout, considering reasons such as admission to higher education and career changes. Consequently, using a data systematization process promotes the permanence of students in higher education. Once the profile of the dropout has been identified, student retention strategies have been approached, according to the time of its appearance and the point of view of the institution.

  14. w

    Dataset of books called Data mining techniques in CRM : inside customer...

    • workwithdata.com
    Updated Apr 17, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Work With Data (2025). Dataset of books called Data mining techniques in CRM : inside customer segmentation [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Data+mining+techniques+in+CRM+%3A+inside+customer+segmentation
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered where the book is Data mining techniques in CRM : inside customer segmentation. It features 7 columns including author, publication date, language, and book publisher.

  15. c

    Data from: Discovering System Health Anomalies using Data Mining Techniques

    • s.cnmilf.com
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +2more
    Updated Apr 10, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Discovering System Health Anomalies using Data Mining Techniques [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/discovering-system-health-anomalies-using-data-mining-techniques
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    We discuss a statistical framework that underlies envelope detection schemes as well as dynamical models based on Hidden Markov Models (HMM) that can encompass both discrete and continuous sensor measurements for use in Integrated System Health Management (ISHM) applications. The HMM allows for the rapid assimilation, analysis, and discovery of system anomalies. We motivate our work with a discussion of an aviation problem where the identification of anomalous sequences is essential for safety reasons. The data in this application are discrete and continuous sensor measurements and can be dealt with seamlessly using the methods described here to discover anomalous flights. We specifically treat the problem of discovering anomalous features in the time series that may be hidden from the sensor suite and compare those methods to standard envelope detection methods on test data designed to accentuate the differences between the two methods. Identification of these hidden anomalies is crucial to building stable, reusable, and cost-efficient systems. We also discuss a data mining framework for the analysis and discovery of anomalies in high-dimensional time series of sensor measurements that would be found in an ISHM system. We conclude with recommendations that describe the tradeoffs in building an integrated scalable platform for robust anomaly detection in ISHM applications.

  16. s

    Data from: Social Media Data Mining Becomes Ordinary

    • orda.shef.ac.uk
    docx
    Updated May 30, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Helen Kennedy (2023). Social Media Data Mining Becomes Ordinary [Dataset]. http://doi.org/10.15131/shef.data.5195032.v1
    Explore at:
    docxAvailable download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    The University of Sheffield
    Authors
    Helen Kennedy
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This research explored what happens when social media data mining becomes ordinary and is carried out by organisations that might be seen as the pillars of everyday life. The interviews on which the transcripts are based are discussed in Chapter 6 of the book. The referenced book contains a description of the methods. No other publications resulted from working with these transcripts.

  17. w

    Dataset of books about Data mining-Social aspects

    • workwithdata.com
    Updated Apr 17, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Work With Data (2025). Dataset of books about Data mining-Social aspects [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=j0-book_subject&fop0=%3D&fval0=Data+mining-Social+aspects&j=1&j0=book_subjects
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 11 rows and is filtered where the book subjects is Data mining-Social aspects. It features 9 columns including author, publication date, language, and book publisher.

  18. w

    Dataset of book subjects that contain Learning Data Mining with Python

    • workwithdata.com
    Updated Nov 7, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Work With Data (2024). Dataset of book subjects that contain Learning Data Mining with Python [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=j0-book&fop0=%3D&fval0=Learning+Data+Mining+with+Python&j=1&j0=books
    Explore at:
    Dataset updated
    Nov 7, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book subjects. It has 2 rows and is filtered where the books is Learning Data Mining with Python. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.

  19. d

    Data from: Community-Scale Attic Retrofit and Home Energy Upgrade Data...

    • catalog.data.gov
    • data.openei.org
    • +3more
    Updated Nov 2, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Davis Energy (2023). Community-Scale Attic Retrofit and Home Energy Upgrade Data Mining - Hot Dry Climate [Dataset]. https://catalog.data.gov/dataset/community-scale-attic-retrofit-and-home-energy-upgrade-data-mining-hot-dry-climate
    Explore at:
    Dataset updated
    Nov 2, 2023
    Dataset provided by
    Davis Energy
    Description

    Retrofitting is an essential element of any comprehensive strategy for improving residential energy efficiency. The residential retrofit market is still developing, and program managers must develop innovative strategies to increase uptake and promote economies of scale. Residential retrofitting remains a challenging proposition to sell to homeowners, because awareness levels are low and financial incentives are lacking. The U.S. Department of Energy's Building America research team, Alliance for Residential Building Innovation (ARBI), implemented a project to increase residential retrofits in Davis, California. The project used a neighborhood-focused strategy for implementation and a low-cost retrofit program that focused on upgraded attic insulation and duct sealing. ARBI worked with a community partner, the not-for-profit Cool Davis Initiative, as well as selected area contractors to implement a strategy that sought to capitalize on the strong local expertise of partners and the unique aspects of the Davis, California, community. Working with community partners also allowed ARBI to collect and analyze data about effective messaging tactics for community-based retrofit programs. ARBI expected this project, called Retrofit Your Attic, to achieve higher uptake than other retrofit projects, because it emphasized a low-cost, one-measure retrofit program. However, this was not the case. The program used a strategy that focused on attics-including air sealing, duct sealing, and attic insulation-as a low-cost entry for homeowners to complete home retrofits. The price was kept below $4,000 after incentives; both contractors in the program offered the same price. The program completed only five retrofits. Interestingly, none of those homeowners used the one-measure strategy. All five homeowners were concerned about cost, comfort, and energy savings and included additional measures in their retrofits. The low-cost, one-measure strategy did not increase the uptake among homeowners, even in a well-educated, affluent community such as Davis. This project has two primary components. One is to complete attic retrofits on a community scale in the hot-dry climate on Davis, CA. Sufficient data will be collected on these projects to include them in the BAFDR. Additionally, ARBI is working with contractors to obtain building and utility data from a large set of retrofit projects in CA (hot-dry). These projects are to be uploaded into the BAFDR.

  20. d

    Data Mining Tools Market Analysis, Trends, Growth, Industry Revenue, Market...

    • datastringconsulting.com
    pdf, xlsx
    Updated Jan 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Datastring Consulting (2025). Data Mining Tools Market Analysis, Trends, Growth, Industry Revenue, Market Size and Forecast Report 2024-2034 [Dataset]. https://datastringconsulting.com/industry-analysis/data-mining-tools-market-research-report
    Explore at:
    pdf, xlsxAvailable download formats
    Dataset updated
    Jan 3, 2025
    Dataset authored and provided by
    Datastring Consulting
    License

    https://datastringconsulting.com/privacy-policyhttps://datastringconsulting.com/privacy-policy

    Time period covered
    2019 - 2034
    Area covered
    Global
    Description
    Report Attribute/MetricDetails
    Market Value in 2025USD 1.7 billion
    Revenue Forecast in 2034USD 4.8 billion
    Growth RateCAGR of 12.4% from 2025 to 2034
    Base Year for Estimation2024
    Industry Revenue 20241.5 billion
    Growth Opportunity USD 3.3 billion
    Historical Data2019 - 2023
    Forecast Period2025 - 2034
    Market Size UnitsMarket Revenue in USD billion and Industry Statistics
    Market Size 20241.5 billion USD
    Market Size 20272.1 billion USD
    Market Size 20292.7 billion USD
    Market Size 20303.0 billion USD
    Market Size 20344.8 billion USD
    Market Size 20355.4 billion USD
    Report CoverageMarket Size for past 5 years and forecast for future 10 years, Competitive Analysis & Company Market Share, Strategic Insights & trends
    Segments CoveredApplication, Type, Industry, Deployment
    Regional ScopeNorth America, Europe, Asia Pacific, Latin America and Middle East & Africa
    Country ScopeU.S., Canada, Mexico, UK, Germany, France, Italy, Spain, China, India, Japan, South Korea, Brazil, Mexico, Argentina, Saudi Arabia, UAE and South Africa
    Top 5 Major Countries and Expected CAGR ForecastU.S., China, India, UK, Germany - Expected CAGR 11.2% - 14.9% (2025 - 2034)
    Top 3 Emerging Countries and Expected ForecastBrazil, South Africa, Indonesia - Expected Forecast CAGR 8.7% - 13.0% (2025 - 2034)
    Top 2 Opportunistic Market SegmentsHealthcare and Retail Industry
    Top 2 Industry TransitionsShift Towards Automated Data Mining, Rise of Predictive Analytics
    Companies ProfiledIBM Corporation, SAS Institute, RapidMiner Inc, KNIME.com AG, Oracle Corporation, Microsoft Corporation, Intel Corporation, Alteryx Inc, SAP SE, Fair Isaac Corporation, MathWorks Inc and Salford Systems Ltd
    CustomizationFree customization at segment, region, or country scope and direct contact with report analyst team for 10 to 20 working hours for any additional niche requirement (10% of report value)
Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
United States (2020). Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks [Dataset]. https://cloud.csiss.gmu.edu/uddi/dataset/multi-objective-optimization-based-privacy-preserving-distributed-data-mining-in-peer-to-p

Data from: Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks

Related Article
Explore at:
pdfAvailable download formats
Dataset updated
Jan 29, 2020
Dataset provided by
United States
Description

This paper proposes a scalable, local privacy preserving algorithm for distributed Peer-to-Peer (P2P) data aggregation useful for many advanced data mining/analysis tasks such as average/sum computation, decision tree induction, feature selection, and more. Unlike most multi-party privacy-preserving data mining algorithms, this approach works in an asynchronous manner through local interactions and it is highly scalable. It particularly deals with the distributed computation of the sum of a set of numbers stored at different peers in a P2P network in the context of a P2P web mining application. The proposed optimization based privacy-preserving technique for computing the sum allows different peers to specify different privacy requirements without having to adhere to a global set of parameters for the chosen privacy model. Since distributed sum computation is a frequently used primitive, the proposed approach is likely to have significant impact on many data mining tasks such as multi-party privacy-preserving clustering, frequent itemset mining, and statistical aggregate computation.

Search
Clear search
Close search
Google apps
Main menu