100+ datasets found
  1. Data from: Enhancing the Human Health Status Prediction: The ATHLOS Project

    • tandf.figshare.com
    xls
    Updated Jun 3, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    P. Anagnostou; S. Tasoulis; A. G. Vrahatis; S. Georgakopoulos; M. Prina; J. L. Ayuso-Mateos; J. Bickenbach; I. Bayes-Marin; F. F. Caballero; L. Egea-Cortés; E. García-Esquinas; M. Leonardi; S. Scherbov; A. Tamosiunas; A. Galas; J. M. Haro; A. Sanchez-Niubo; V. Plagianakos; D. Panagiotakos (2023). Enhancing the Human Health Status Prediction: The ATHLOS Project [Dataset]. http://doi.org/10.6084/m9.figshare.14798079.v1
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Taylor & Francishttps://taylorandfrancis.com/
    Authors
    P. Anagnostou; S. Tasoulis; A. G. Vrahatis; S. Georgakopoulos; M. Prina; J. L. Ayuso-Mateos; J. Bickenbach; I. Bayes-Marin; F. F. Caballero; L. Egea-Cortés; E. García-Esquinas; M. Leonardi; S. Scherbov; A. Tamosiunas; A. Galas; J. M. Haro; A. Sanchez-Niubo; V. Plagianakos; D. Panagiotakos
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Preventive healthcare is a crucial pillar of health as it contributes to staying healthy and having immediate treatment when needed. Mining knowledge from longitudinal studies has the potential to significantly contribute to the improvement of preventive healthcare. Unfortunately, data originated from such studies are characterized by high complexity, huge volume, and a plethora of missing values. Machine Learning, Data Mining and Data Imputation models are utilized a part of solving these challenges, respectively. Toward this direction, we focus on the development of a complete methodology for the ATHLOS Project – funded by the European Union’s Horizon 2020 Research and Innovation Program, which aims to achieve a better interpretation of the impact of aging on health. The inherent complexity of the provided dataset lies in the fact that the project includes 15 independent European and international longitudinal studies of aging. In this work, we mainly focus on the HealthStatus (HS) score, an index that estimates the human status of health, aiming to examine the effect of various data imputation models to the prediction power of classification and regression models. Our results are promising, indicating the critical importance of data imputation in enhancing preventive medicine’s crucial role.

  2. Data from: Comparison of predictive performance of data mining algorithms in...

    • scielo.figshare.com
    jpeg
    Updated Jun 4, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Senol Celik; Ecevit Eyduran; Koksal Karadas; Mohammad Masood Tariq (2023). Comparison of predictive performance of data mining algorithms in predicting body weight in Mengali rams of Pakistan [Dataset]. http://doi.org/10.6084/m9.figshare.5719009.v1
    Explore at:
    jpegAvailable download formats
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    SciELOhttp://www.scielo.org/
    Authors
    Senol Celik; Ecevit Eyduran; Koksal Karadas; Mohammad Masood Tariq
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Pakistan
    Description

    ABSTRACT The present study aimed at comparing predictive performance of some data mining algorithms (CART, CHAID, Exhaustive CHAID, MARS, MLP, and RBF) in biometrical data of Mengali rams. To compare the predictive capability of the algorithms, the biometrical data regarding body (body length, withers height, and heart girth) and testicular (testicular length, scrotal length, and scrotal circumference) measurements of Mengali rams in predicting live body weight were evaluated by most goodness of fit criteria. In addition, age was considered as a continuous independent variable. In this context, MARS data mining algorithm was used for the first time to predict body weight in two forms, without (MARS_1) and with interaction (MARS_2) terms. The superiority order in the predictive accuracy of the algorithms was found as CART > CHAID ≈ Exhaustive CHAID > MARS_2 > MARS_1 > RBF > MLP. Moreover, all tested algorithms provided a strong predictive accuracy for estimating body weight. However, MARS is the only algorithm that generated a prediction equation for body weight. Therefore, it is hoped that the available results might present a valuable contribution in terms of predicting body weight and describing the relationship between the body weight and body and testicular measurements in revealing breed standards and the conservation of indigenous gene sources for Mengali sheep breeding. Therefore, it will be possible to perform more profitable and productive sheep production. Use of data mining algorithms is useful for revealing the relationship between body weight and testicular traits in describing breed standards of Mengali sheep.

  3. e

    Overview of data mining and predictive analytics

    • paper.erudition.co.in
    html
    Updated Dec 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Einetic (2025). Overview of data mining and predictive analytics [Dataset]. https://paper.erudition.co.in/makaut/btech-in-computer-science-and-engineering-artificial-intelligence-and-machine-learning/6/data-mining
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Dec 3, 2025
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of chapter Overview of data mining and predictive analytics of Data Mining, 6th Semester , B.Tech in Computer Science & Engineering (Artificial Intelligence and Machine Learning)

  4. e

    Supervised learning for prediction

    • paper.erudition.co.in
    html
    Updated Dec 3, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Einetic (2025). Supervised learning for prediction [Dataset]. https://paper.erudition.co.in/makaut/btech-in-computer-science-and-engineering-artificial-intelligence-and-machine-learning/6/data-mining
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Dec 3, 2025
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of chapter Supervised learning for prediction of Data Mining, 6th Semester , B.Tech in Computer Science & Engineering (Artificial Intelligence and Machine Learning)

  5. D

    Data Mining Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Oct 16, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Data Insights Market (2025). Data Mining Software Report [Dataset]. https://www.datainsightsmarket.com/reports/data-mining-software-1447715
    Explore at:
    pdf, ppt, docAvailable download formats
    Dataset updated
    Oct 16, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policyhttps://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Explore the dynamic Data Mining Software market forecast (2025-2033) with a 12.5% CAGR. Uncover key drivers, restraints, and trends shaping analytics for large enterprises and SMEs.

  6. Z

    Predictive Analytics Market - by Software Solutions (Data Mining &...

    • zionmarketresearch.com
    pdf
    Updated Nov 23, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Zion Market Research (2025). Predictive Analytics Market - by Software Solutions (Data Mining & Management, Decision Support Systems, Fraud & Security Intelligence, Financial Intelligence, Customer Intelligence, and Others), By Delivery Mode (Cloud-Based Technology and On-Premise Deployment), By End-User (BFSI, Telecom & IT, Healthcare, Transport & Logistics, Government & Utilities, and Others) and by Application (Customer & Channel, Sales and Marketing, Finance & Risk, and Other Applications), and By Region - Global and Regional Industry Overview, Comprehensive Analysis, Historical Data, and Forecasts 2024-2032 [Dataset]. https://www.zionmarketresearch.com/report/predictive-analytic-market
    Explore at:
    pdfAvailable download formats
    Dataset updated
    Nov 23, 2025
    Dataset authored and provided by
    Zion Market Research
    License

    https://www.zionmarketresearch.com/privacy-policyhttps://www.zionmarketresearch.com/privacy-policy

    Time period covered
    2022 - 2030
    Area covered
    Global
    Description

    Global Predictive Analytics Market size worth at USD 16.19 Billion in 2023 and projected to USD 113.8 Billion by 2032, with a CAGR of around 24.19% between 2024-2032.

  7. G

    Data Mining Tools Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 4, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Growth Market Reports (2025). Data Mining Tools Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-mining-tools-market
    Explore at:
    pdf, csv, pptxAvailable download formats
    Dataset updated
    Aug 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Mining Tools Market Outlook




    According to our latest research, the global Data Mining Tools market size reached USD 1.93 billion in 2024, reflecting robust industry momentum. The market is expected to grow at a CAGR of 12.7% from 2025 to 2033, reaching a projected value of USD 5.69 billion by 2033. This growth is primarily driven by the increasing adoption of advanced analytics across diverse industries, rapid digital transformation, and the necessity for actionable insights from massive data volumes.




    One of the pivotal growth factors propelling the Data Mining Tools market is the exponential rise in data generation, particularly through digital channels, IoT devices, and enterprise applications. Organizations across sectors are leveraging data mining tools to extract meaningful patterns, trends, and correlations from structured and unstructured data. The need for improved decision-making, operational efficiency, and competitive advantage has made data mining an essential component of modern business strategies. Furthermore, advancements in artificial intelligence and machine learning are enhancing the capabilities of these tools, enabling predictive analytics, anomaly detection, and automation of complex analytical tasks, which further fuels market expansion.




    Another significant driver is the growing demand for customer-centric solutions in industries such as retail, BFSI, and healthcare. Data mining tools are increasingly being used for customer relationship management, targeted marketing, fraud detection, and risk management. By analyzing customer behavior and preferences, organizations can personalize their offerings, optimize marketing campaigns, and mitigate risks. The integration of data mining tools with cloud platforms and big data technologies has also simplified deployment and scalability, making these solutions accessible to small and medium-sized enterprises (SMEs) as well as large organizations. This democratization of advanced analytics is creating new growth avenues for vendors and service providers.




    The regulatory landscape and the increasing emphasis on data privacy and security are also shaping the development and adoption of Data Mining Tools. Compliance with frameworks such as GDPR, HIPAA, and CCPA necessitates robust data governance and transparent analytics processes. Vendors are responding by incorporating features like data masking, encryption, and audit trails into their solutions, thereby enhancing trust and adoption among regulated industries. Additionally, the emergence of industry-specific data mining applications, such as fraud detection in BFSI and predictive diagnostics in healthcare, is expanding the addressable market and fostering innovation.




    From a regional perspective, North America currently dominates the Data Mining Tools market owing to the early adoption of advanced analytics, strong presence of leading technology vendors, and high investments in digital transformation. However, the Asia Pacific region is emerging as a lucrative market, driven by rapid industrialization, expansion of IT infrastructure, and growing awareness of data-driven decision-making in countries like China, India, and Japan. Europe, with its focus on data privacy and digital innovation, also represents a significant market share, while Latin America and the Middle East & Africa are witnessing steady growth as organizations in these regions modernize their operations and adopt cloud-based analytics solutions.





    Component Analysis




    The Component segment of the Data Mining Tools market is bifurcated into Software and Services. Software remains the dominant segment, accounting for the majority of the market share in 2024. This dominance is attributed to the continuous evolution of data mining algorithms, the proliferation of user-friendly graphical interfaces, and the integration of advanced analytics capabilities such as machine learning, artificial intelligence, and natural language pro

  8. f

    Data from: Which Is a More Accurate Predictor in Colorectal Survival...

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Jul 25, 2012
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Yue, Zhen-yu; Wang, Zhen-ning; Zhou, Xin; Tong, Lin-lin; Gao, Peng; Xu, Ying-ying; Song, Yong-xi; Xu, Hui-mian (2012). Which Is a More Accurate Predictor in Colorectal Survival Analysis? Nine Data Mining Algorithms vs. the TNM Staging System [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001153856
    Explore at:
    Dataset updated
    Jul 25, 2012
    Authors
    Yue, Zhen-yu; Wang, Zhen-ning; Zhou, Xin; Tong, Lin-lin; Gao, Peng; Xu, Ying-ying; Song, Yong-xi; Xu, Hui-mian
    Description

    ObjectiveOver the past decades, many studies have used data mining technology to predict the 5-year survival rate of colorectal cancer, but there have been few reports that compared multiple data mining algorithms to the TNM classification of malignant tumors (TNM) staging system using a dataset in which the training and testing data were from different sources. Here we compared nine data mining algorithms to the TNM staging system for colorectal survival analysis. MethodsTwo different datasets were used: 1) the National Cancer Institute's Surveillance, Epidemiology, and End Results dataset; and 2) the dataset from a single Chinese institution. An optimization and prediction system based on nine data mining algorithms as well as two variable selection methods was implemented. The TNM staging system was based on the 7th edition of the American Joint Committee on Cancer TNM staging system. ResultsWhen the training and testing data were from the same sources, all algorithms had slight advantages over the TNM staging system in predictive accuracy. When the data were from different sources, only four algorithms (logistic regression, general regression neural network, Bayesian networks, and Naïve Bayes) had slight advantages over the TNM staging system. Also, there was no significant differences among all the algorithms (p>0.05). ConclusionsThe TNM staging system is simple and practical at present, and data mining methods are not accurate enough to replace the TNM staging system for colorectal cancer survival prediction. Furthermore, there were no significant differences in the predictive accuracy of all the algorithms when the data were from different sources. Building a larger dataset that includes more variables may be important for furthering predictive accuracy.

  9. Data Mining Tools Market - Global Opportunity Analysis and Industry Forecast...

    • meticulousresearch.com
    Updated Aug 12, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Meticulous Market Research Pvt Ltd (2025). Data Mining Tools Market - Global Opportunity Analysis and Industry Forecast (2025–2035) [Dataset]. https://www.meticulousresearch.com/product/data-mining-tools-market-6239
    Explore at:
    Dataset updated
    Aug 12, 2025
    Dataset provided by
    Meticulous Market Research Pvt. Ltd.
    Authors
    Meticulous Market Research Pvt Ltd
    License

    https://www.meticulousresearch.com/privacy-policyhttps://www.meticulousresearch.com/privacy-policy

    Area covered
    Latin America, Europe, Asia Pacific, Global, Middle East & Africa, North America
    Description

    Data Mining Tools Market Size, Share, Forecast & Trends Size - By Component, Deployment (Cloud, On-Premise), By Organization Size (Large Enterprises, Small & Medium Enterprises), By End-user Vertical (BFSI, Healthcare, Retail, IT & Telecom) - Global Forecast to 2035

  10. e

    Classification and Prediction

    • paper.erudition.co.in
    html
    Updated Jan 7, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Einetic (2022). Classification and Prediction [Dataset]. https://paper.erudition.co.in/1/btech-in-computer-science-and-engineering/6/data-warehousing-and-data-mining
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Jan 7, 2022
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of chapter Classification and Prediction of Data Warehousing and Data Mining, 6th Semester , Computer Science and Engineering

  11. Prediction of Online Orders

    • kaggle.com
    zip
    Updated May 23, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Oscar Aguilar (2023). Prediction of Online Orders [Dataset]. https://www.kaggle.com/datasets/oscarm524/prediction-of-orders/versions/3
    Explore at:
    zip(6680913 bytes)Available download formats
    Dataset updated
    May 23, 2023
    Authors
    Oscar Aguilar
    Description

    The visit of an online shop by a possible customer is also called a session. During a session the visitor clicks on products in order to see the corresponding detail page. Furthermore, he possibly will add or remove products to/from his shopping basket. At the end of a session it is possible that one or several products from the shopping basket will be ordered. The activities of the user are also called transactions. The goal of the analysis is to predict whether the visitor will place an order or not on the basis of the transaction data collected during the session.

    Tasks

    In the first task historical shop data are given consisting of the session activities inclusive of the associated information whether an order was placed or not. These data can be used in order to subsequently make order forecasts for other session activities in the same shop. Of course, the real outcome of the sessions for this set is not known. Thus, the first task can be understood as a classical data mining problem.

    The second task deals with the online scenario. In this context the participants are to implement an agent learning on the basis of transactions. That means that the agent successively receives the individual transactions and has to make a forecast for each of them with respect to the outcome of the shopping cart transaction. This task maps the practice scenario in the best possible way in the case that a transaction-based forecast is required and a corresponding algorithm should learn in an adaptive manner.

    The Data

    For the individual tasks anonymised real shop data are provided in the form of structured text files consisting of individual data sets. The data sets represent in each case transactions in the shop and may contain redundant information. For the data, in particular the following applies:

    1. Each data set is in an individual line that is closed by “LF”(“line feed”, 0xA), “CR”(“carriage return”, 0xD), or “CR”and “LF”(“carriage return”and “line feed”, 0xD and 0xA).
    2. The first line is structured analog to the data sets but contains the names of the respective columns (data arrays).
    3. The header and each data set contain several arrays separated by the symbol “|”.
    4. There is no escape character, and no quota system is used.
    5. ASCII is used as character set.
    6. There may be missing values. These are marked by the symbol “?”.

    In concrete terms, only the array names of the attached document “*features.pdf*” in their respective sequence will be used as column headings. The corresponding value ranges are listed there, too.

    The training file for task 1 is “*transact_train.txt*“) contains all data arrays of the document, whereas the corresponding classification file (“*transact_class.txt*”) of course does not contain the target attribute “*order*”.

    In task 2 data in the form of a string array are transferred to the implementations of the participants by means of a method. The individual fields of the array contain the same data arrays that are listed in “*features.pdf*”–also without the target attribute “*order*”–and exactly in the sequence used there.

    Acknowledgement

    This dataset is publicly available in the data-mining-cup-website.

  12. d

    Ensemble Data Mining Methods

    • catalog.data.gov
    • s.cnmilf.com
    • +1more
    Updated Apr 11, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Ensemble Data Mining Methods [Dataset]. https://catalog.data.gov/dataset/ensemble-data-mining-methods
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods that leverage the power of multiple models to achieve better prediction accuracy than any of the individual models could on their own. The basic goal when designing an ensemble is the same as when establishing a committee of people: each member of the committee should be as competent as possible, but the members should be complementary to one another. If the members are not complementary, i.e., if they always agree, then the committee is unnecessary---any one member is sufficient. If the members are complementary, then when one or a few members make an error, the probability is high that the remaining members can correct this error. Research in ensemble methods has largely revolved around designing ensembles consisting of competent yet complementary models.

  13. d

    Privacy Preserving Distributed Data Mining

    • catalog.data.gov
    • s.cnmilf.com
    Updated Apr 10, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dashlink (2025). Privacy Preserving Distributed Data Mining [Dataset]. https://catalog.data.gov/dataset/privacy-preserving-distributed-data-mining
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    Distributed data mining from privacy-sensitive multi-party data is likely to play an important role in the next generation of integrated vehicle health monitoring systems. For example, consider an airline manufacturer [tex]$\mathcal{C}$[/tex] manufacturing an aircraft model [tex]$A$[/tex] and selling it to five different airline operating companies [tex]$\mathcal{V}_1 \dots \mathcal{V}_5$[/tex]. These aircrafts, during their operation, generate huge amount of data. Mining this data can reveal useful information regarding the health and operability of the aircraft which can be useful for disaster management and prediction of efficient operating regimes. Now if the manufacturer [tex]$\mathcal{C}$[/tex] wants to analyze the performance data collected from different aircrafts of model-type [tex]$A$[/tex] belonging to different airlines then central collection of data for subsequent analysis may not be an option. It should be noted that the result of this analysis may be statistically more significant if the data for aircraft model [tex]$A$[/tex] across all companies were available to [tex]$\mathcal{C}$[/tex]. The potential problems arising out of such a data mining scenario are:

  14. Data from: A Proposed Churn Prediction Model

    • figshare.com
    pdf
    Updated Feb 24, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mona Nasr; Essam Shaaban; Yehia Helmy; Dr. Ayman Khedr (2019). A Proposed Churn Prediction Model [Dataset]. http://doi.org/10.6084/m9.figshare.7763183.v2
    Explore at:
    pdfAvailable download formats
    Dataset updated
    Feb 24, 2019
    Dataset provided by
    figshare
    Figsharehttp://figshare.com/
    Authors
    Mona Nasr; Essam Shaaban; Yehia Helmy; Dr. Ayman Khedr
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Churn prediction aims to detect customers intended to leave a service provider. Retaining one customer costs an organization from 5 to 10 times than gaining a new one. Predictive models can provide correct identification of possible churners in the near future in order to provide a retention solution. This paper presents a new prediction model based on Data Mining (DM) techniques. The proposed model is composed of six steps which are; identify problem domain, data selection, investigate data set, classification, clustering and knowledge usage. A data set with 23 attributes and 5000 instances is used. 4000 instances used for training the model and 1000 instances used as a testing set. The predicted churners are clustered into 3 categories in case of using in a retention strategy. The data mining techniques used in this paper are Decision Tree, Support Vector Machine and Neural Network throughout an open source software name WEKA.

  15. e

    Classification and Prediction

    • paper.erudition.co.in
    html
    Updated Dec 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Einetic (2025). Classification and Prediction [Dataset]. https://paper.erudition.co.in/makaut/master-of-computer-applications-2-years/3/data-warehousing-and-data-mining
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Dec 3, 2025
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of chapter Classification and Prediction of Data Warehousing and Data Mining, 3rd Semester , Master of Computer Applications (2 Years)

  16. Data Mining Tools Market - A Global and Regional Analysis

    • bisresearch.com
    csv, pdf
    Updated Nov 30, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bisresearch (2025). Data Mining Tools Market - A Global and Regional Analysis [Dataset]. https://bisresearch.com/industry-report/global-data-mining-tools-market.html
    Explore at:
    csv, pdfAvailable download formats
    Dataset updated
    Nov 30, 2025
    Dataset authored and provided by
    Bisresearch
    License

    https://bisresearch.com/privacy-policy-cookie-restriction-modehttps://bisresearch.com/privacy-policy-cookie-restriction-mode

    Time period covered
    2023 - 2033
    Area covered
    Worldwide
    Description

    The Data Mining Tools Market is expected to be valued at $1.24 billion in 2024, with an anticipated expansion at a CAGR of 11.63% to reach $3.73 billion by 2034.

  17. m

    Iraqi Student Performance Prediction

    • data.mendeley.com
    Updated Dec 12, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    saja taha (2018). Iraqi Student Performance Prediction [Dataset]. http://doi.org/10.17632/smgx6s5pwr.1
    Explore at:
    Dataset updated
    Dec 12, 2018
    Authors
    saja taha
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Iraq
    Description

    Iraqi dataset is collected through applying (or submitting) questionnaire in three Iraqi secondary schools for both applicable and biology branches of the final stage during the second semester of the 2018 year. Initially, the questionnaire contains 56 questions in three A4 sheets and it is answered by 250 students (samples). Latter, 130 samples are discarded due to lack of information since pre-processing is applied to obtain the most complete information of students. After removing inconsistencies and incompleteness in the dataset, this study considers 120 samples instances with 55 features for experiment purposes. The features are distributed into five main categories: Demographic, Economic, Educational, Time, and Marks. Table (1) shows the dataset’s attributes/features and their description. As illustrated in this table, new features are introduced, such as holiday and worrying effects. The relationships between parents with schools and use of books and references by the student are also considered.

  18. w

    Global Data Mining and Modeling Market Research Report: By Application...

    • wiseguyreports.com
    Updated Aug 23, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). Global Data Mining and Modeling Market Research Report: By Application (Fraud Detection, Customer Segmentation, Risk Management, Market Basket Analysis), By Deployment Model (Cloud, On-Premises, Hybrid), By Technique (Predictive Analytics, Descriptive Analytics, Prescriptive Analytics, Text Mining), By End Use (Retail, Telecommunications, Banking and Financial Services, Healthcare) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/reports/data-mining-and-modeling-market
    Explore at:
    Dataset updated
    Aug 23, 2025
    License

    https://www.wiseguyreports.com/pages/privacy-policyhttps://www.wiseguyreports.com/pages/privacy-policy

    Time period covered
    Aug 25, 2025
    Area covered
    Global
    Description
    BASE YEAR2024
    HISTORICAL DATA2019 - 2023
    REGIONS COVEREDNorth America, Europe, APAC, South America, MEA
    REPORT COVERAGERevenue Forecast, Competitive Landscape, Growth Factors, and Trends
    MARKET SIZE 20247.87(USD Billion)
    MARKET SIZE 20258.37(USD Billion)
    MARKET SIZE 203515.4(USD Billion)
    SEGMENTS COVEREDApplication, Deployment Model, Technique, End Use, Regional
    COUNTRIES COVEREDUS, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
    KEY MARKET DYNAMICSGrowing demand for actionable insights, Increasing adoption of AI technologies, Rising need for predictive analytics, Expanding data sources and volume, Regulatory compliance and data privacy concerns
    MARKET FORECAST UNITSUSD Billion
    KEY COMPANIES PROFILEDInformatica, Tableau, Cloudera, Microsoft, Google, Alteryx, Oracle, SAP, SAS, DataRobot, Dell Technologies, Qlik, Teradata, TIBCO Software, Snowflake, IBM
    MARKET FORECAST PERIOD2025 - 2035
    KEY MARKET OPPORTUNITIESIncreased demand for predictive analytics, Growth in big data technologies, Rising need for data-driven decision-making, Adoption of AI and machine learning, Expansion in healthcare data analysis
    COMPOUND ANNUAL GROWTH RATE (CAGR) 6.3% (2025 - 2035)
  19. The Insurance Company (TIC) Benchmark

    • kaggle.com
    zip
    Updated May 27, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Kush Shah (2020). The Insurance Company (TIC) Benchmark [Dataset]. https://www.kaggle.com/datasets/kushshah95/the-insurance-company-tic-benchmark/code
    Explore at:
    zip(268454 bytes)Available download formats
    Dataset updated
    May 27, 2020
    Authors
    Kush Shah
    Description

    This data set used in the CoIL 2000 Challenge contains information on customers of an insurance company. The data consists of 86 variables and includes product usage data and socio-demographic data

    DETAILED DATA DESCRIPTION

    THE INSURANCE COMPANY (TIC) 2000

    (c) Sentient Machine Research 2000

    DISCLAIMER

    This dataset is owned and supplied by the Dutch data mining company Sentient Machine Research, and is based on real-world business data. You are allowed to use this dataset and accompanying information for non-commercial research and education purposes only. It is explicitly not allowed to use this dataset for commercial education or demonstration purposes. For any other use, please contact Peter van der Putten, info@smr.nl.

    This dataset has been used in the CoIL Challenge 2000 data mining competition. For papers describing results on this dataset, see the TIC 2000 homepage: http://www.wi.leidenuniv.nl/~putten/library/cc2000/

    REFERENCE P. van der Putten and M. van Someren (eds). CoIL Challenge 2000: The Insurance Company Case. Published by Sentient Machine Research, Amsterdam. Also a Leiden Institute of Advanced Computer Science Technical Report 2000-09. June 22, 2000. See http://www.liacs.nl/~putten/library/cc2000/

    RELEVANT FILES

    tic_2000_train_data.csv: Dataset to train and validate prediction models and build a description (5822 customer records). Each record consists of 86 attributes, containing sociodemographic data (attribute 1-43) and product ownership (attributes 44-86). The sociodemographic data is derived from zip codes. All customers living in areas with the same zip code have the same sociodemographic attributes. Attribute 86, "CARAVAN: Number of mobile home policies", is the target variable.

    tic_2000_eval_data.csv: Dataset for predictions (4000 customer records). It has the same format as TICDATA2000.txt, only the target is missing. Participants are supposed to return the list of predicted targets only. All datasets are in CSV format. The meaning of the attributes and attribute values is given dictionary.csv

    tic_2000_target_data.csv Targets for the evaluation set.

    dictionary.txt: Data description with numerical labeled categories descriptions. It has columnar description data and the labels of the dummy/Labeled encoding.

    Original Task description Link: http://liacs.leidenuniv.nl/~puttenpwhvander/library/cc2000/problem.html UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/datasets/Insurance+Company+Benchmark+%28COIL+2000%29

  20. Copper Mining Company - Stock Price Prediction

    • kaggle.com
    zip
    Updated May 16, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Maciej Gronczynski (2023). Copper Mining Company - Stock Price Prediction [Dataset]. https://www.kaggle.com/datasets/maciejgronczynski/cooper-mining-company-stock-price-prediction
    Explore at:
    zip(331452 bytes)Available download formats
    Dataset updated
    May 16, 2023
    Authors
    Maciej Gronczynski
    Description

    "The 'KGHM Dataset' is a meticulously curated collection of financial and economic data specifically designed for the purpose of stock price prediction for KGHM, a leading copper mining company. This dataset encompasses a wide range of features including historical prices, macroeconomic indicators, industry-related metrics, company-specific financials, and technical indicators. The dataset comprises 59 carefully selected features that have the potential to influence the stock price of KGHM. The data has been sourced from reputable platforms such as Yahoo Finance, Wikipedia, and the official website of KGHM. The dataset has undergone rigorous pre-processing and feature engineering to ensure data quality and relevance for the machine learning models used in the stock price prediction analysis. It serves as a valuable resource for conducting in-depth analysis and developing accurate predictive models for KGHM's stock price movements."

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
P. Anagnostou; S. Tasoulis; A. G. Vrahatis; S. Georgakopoulos; M. Prina; J. L. Ayuso-Mateos; J. Bickenbach; I. Bayes-Marin; F. F. Caballero; L. Egea-Cortés; E. García-Esquinas; M. Leonardi; S. Scherbov; A. Tamosiunas; A. Galas; J. M. Haro; A. Sanchez-Niubo; V. Plagianakos; D. Panagiotakos (2023). Enhancing the Human Health Status Prediction: The ATHLOS Project [Dataset]. http://doi.org/10.6084/m9.figshare.14798079.v1
Organization logo

Data from: Enhancing the Human Health Status Prediction: The ATHLOS Project

Related Article
Explore at:
xlsAvailable download formats
Dataset updated
Jun 3, 2023
Dataset provided by
Taylor & Francishttps://taylorandfrancis.com/
Authors
P. Anagnostou; S. Tasoulis; A. G. Vrahatis; S. Georgakopoulos; M. Prina; J. L. Ayuso-Mateos; J. Bickenbach; I. Bayes-Marin; F. F. Caballero; L. Egea-Cortés; E. García-Esquinas; M. Leonardi; S. Scherbov; A. Tamosiunas; A. Galas; J. M. Haro; A. Sanchez-Niubo; V. Plagianakos; D. Panagiotakos
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Preventive healthcare is a crucial pillar of health as it contributes to staying healthy and having immediate treatment when needed. Mining knowledge from longitudinal studies has the potential to significantly contribute to the improvement of preventive healthcare. Unfortunately, data originated from such studies are characterized by high complexity, huge volume, and a plethora of missing values. Machine Learning, Data Mining and Data Imputation models are utilized a part of solving these challenges, respectively. Toward this direction, we focus on the development of a complete methodology for the ATHLOS Project – funded by the European Union’s Horizon 2020 Research and Innovation Program, which aims to achieve a better interpretation of the impact of aging on health. The inherent complexity of the provided dataset lies in the fact that the project includes 15 independent European and international longitudinal studies of aging. In this work, we mainly focus on the HealthStatus (HS) score, an index that estimates the human status of health, aiming to examine the effect of various data imputation models to the prediction power of classification and regression models. Our results are promising, indicating the critical importance of data imputation in enhancing preventive medicine’s crucial role.

Search
Clear search
Close search
Google apps
Main menu