Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Measure Predict is a dataset for object detection tasks - it contains GClef annotations for 843 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created from a higher education institution (acquired from several disjoint databases) and relates to students enrolled in different undergraduate degrees, such as agronomy, design, education, nursing, journalism, management, social service, and technologies. The dataset includes information known at the time of student enrollment (academic path, demographics, and socio-economic factors) and the students' academic performance at the end of the first and second semesters. The data is used to build classification models to predict students' dropout and academic success. The problem is formulated as a three-category classification task, in which there is a strong imbalance towards one of the classes.
| Column name | Description |
| --- | --- |
| Marital status | The marital status of the student. (Categorical) |
| Application mode | The method of application used by the student. (Categorical) |
| Application order | The order in which the student applied. (Numerical) |
| Course | The course taken by the student. (Categorical) |
| Daytime/evening attendance | Whether the student attends classes during the day or in the evening. (Categorical) |
| Previous qualification | The qualification obtained by the student before enrolling in higher education. (Categorical) |
| Nacionality | The nationality of the student. (Categorical) |
| Mother's qualification | The qualification of the student's mother. (Categorical) |
| Father's qualification | The qualification of the student's father. (Categorical) |
| Mother's occupation | The occupation of the student's mother. (Categorical) |
| Father's occupation | The occupation of the student's father. (Categorical) |
| Displaced | Whether the student is a displaced person. (Categorical) |
| Educational special needs | Whether the student has any special educational needs. (Categorical) |
| Debtor | Whether the student is a debtor. (Categorical) |
| Tuition fees up to date | Whether the student's tuition fees are up to date. (Categorical) |
| Gender | The gender of the student. (Categorical) |
| Scholarship holder | Whether the student is a scholarship holder. (Categorical) |
| Age at enrollment | The age of the student at the time of enrollment. (Numerical) |
| International | Whether the student is an international student. (Categorical) |
| Curricular units 1st sem (credited) | The number of curricular units credited by the student in the first semester. (Numerical) |
| Curricular units 1st sem (enrolled) | The number of curricular units enrolled by the student in the first semester. (Numerical) |
| Curricular units 1st sem (evaluations) | The number of curricular units evaluated by the student in the first semester. (Numerical) |
| Curricular units 1st sem (approved) | The number of curricular units approved by the student in the first semester. (Numerical) |
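Because the description mentions a strong imbalance towards one of the three classes, one common mitigation is to weight classes inversely to their frequency. The sketch below is only an illustration: the labels "A", "B", and "C" and their proportions are placeholders, not values taken from the dataset, and the weighting formula is a widely used heuristic rather than anything the dataset authors prescribe.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical label sample; class names and proportions are placeholders.
labels = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
weights = balanced_class_weights(labels)
for cls in sorted(weights):
    print(cls, round(weights[cls], 3))
```

Rarer classes receive larger weights, so a classifier that accepts per-class weights is penalised more for misclassifying them.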
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
jfo150/stems-predict-data dataset hosted on Hugging Face and contributed by the HF Datasets community
This dataset contains the predicted prices of the asset Predict Crypto over the next 16 years. The data is initially calculated using a default 5 percent annual growth rate; after page load, a sliding-scale component lets the user adjust the growth rate to their own positive or negative projections. The maximum adjustable growth rate is 100 percent, and the minimum is -100 percent.
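The projection described above can be sketched as a compound-growth calculation. This is a minimal sketch that assumes the growth rate compounds once per year; the function name `project_prices` and the starting price of 100.0 are hypothetical and do not come from the source.

```python
def project_prices(start_price, annual_growth_pct=5.0, years=16):
    """Compound start_price once per year; growth is clamped to [-100, 100] percent."""
    rate = max(-100.0, min(100.0, annual_growth_pct)) / 100.0
    return [start_price * (1.0 + rate) ** year for year in range(1, years + 1)]


# 100.0 is a made-up starting price used only for illustration.
projection = project_prices(100.0)
print(len(projection), round(projection[0], 2))
```

Clamping mirrors the slider limits in the description: a rate of -100 percent drives every projected price to zero, and any request above 100 percent is treated as 100 percent.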
Patient database that contains EEG data sets, executable tasks, and computational tools. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
https://creativecommons.org/publicdomain/zero/1.0/
A simple yet challenging project: predict the housing price based on factors such as house area, number of bedrooms, furnishing status, and nearness to the main road. The dataset is small, yet its complexity arises from strong multicollinearity. Can you overcome these obstacles and build a decent predictive model?
Harrison, D. and Rubinfeld, D.L. (1978) Hedonic prices and the demand for clean air. J. Environ. Economics and Management 5, 81–102. Belsley D.A., Kuh, E. and Welsch, R.E. (1980) Regression Diagnostics. Identifying Influential Data and Sources of Collinearity. New York: Wiley.
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Dementia Prediction Dataset is a longitudinal collection of MRI data from 150 subjects aged 60 to 96. Each subject has multiple MRI scans taken over different visits, providing insights into changes over time. This data is valuable for studying and predicting dementia progression.

2) Data Utilization
(1) Dementia Prediction data has characteristics that:
• It includes detailed longitudinal measurements of cognitive and brain volume attributes, essential for understanding dementia progression and its correlation with brain structure changes.
(2) Dementia Prediction data can be used to:
• Predictive Modeling: Useful for developing machine learning models to predict dementia onset and progression based on MRI scans and cognitive assessments.
• Medical Research: Assists in studying the relationship between brain volume changes and cognitive decline, contributing to a better understanding of dementia-related diseases like Alzheimer's.
• Healthcare Planning: Supports healthcare providers in early diagnosis and personalized care planning for dementia patients by analyzing predictive factors and progression patterns.
https://creativecommons.org/publicdomain/zero/1.0/
Big brands spend a significant amount on popularizing a product. Nevertheless, their efforts often go in vain when establishing the merchandise in the hyperlocal market. Depending on geographical conditions, the same attributes can convey very different information about the customer. Hence, insight into this is a must for any brand owner.
In this competition, we have brought the data gathered from one of the top apparel brands in India. Provided the details concerning category, score, and presence in the store, participants are challenged to predict the popularity level of the merchandise.
The popularity class decides how popular the product is given the attributes which a store owner can control to make it happen.
Train.csv - 18208 rows x 12 columns (includes the popularity column as the target variable)
Test.csv - 12140 rows x 11 columns
Sample Submission.csv - Please check the Evaluation section for more details on how to generate a valid submission
store_ratio
basket_ratio
category_1
store_score
category_2
store_presence
score_1
score_2
score_3
score_4
time
popularity - Class of popularity (Target Column)
Multi-class classification modeling
Advanced feature engineering
Optimizing the multi-class log loss score as a metric to generalize well on unseen data
The top 3 winners will get MLDS 2021 passes.
MLDS (Machine Learning Developer's Summit): India's No. 1 conference exclusively for the machine learning practitioners ecosystem. MLDS21 brings together India's leading machine learning innovators and practitioners to share their ideas and experience about machine learning tools and advanced development in this sphere, and gives the attendees a first look at new trends and developer products.
Use the provided class labels as y_true and the predicted probabilities per class as y_pred, obtained from the model using the predict_proba() method.
You should submit a .csv/.xlsx file with exactly 12140 rows with 5 columns (i.e. 0, 1, 2, 3, 4). Your submission will return an Invalid Score if you have extra columns or rows.
The file should have exactly 5 columns.
Using pandas, one can do
submission_df.to_csv('my_submission_file.csv', index=False)
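As a rough illustration of the evaluation described above, the following sketch computes a multi-class log loss in plain Python and checks that a submission has the stated shape (12140 rows, 5 class columns). The function names `multiclass_log_loss` and `check_submission` are hypothetical, and the clipping constant is an assumption; the competition's own scorer may differ in detail.

```python
import math


def multiclass_log_loss(y_true, y_pred, eps=1e-15):
    """Mean negative log-probability assigned to the true class of each row."""
    total = 0.0
    for label, probs in zip(y_true, y_pred):
        p = min(max(probs[label], eps), 1.0 - eps)  # clip to avoid log(0)
        total += -math.log(p)
    return total / len(y_true)


def check_submission(rows, n_rows=12140, n_classes=5):
    """Return True if rows has exactly n_rows rows of n_classes values each."""
    return len(rows) == n_rows and all(len(r) == n_classes for r in rows)


# A uniform guess over the 5 classes, used only as a baseline illustration.
uniform = [[0.2] * 5 for _ in range(12140)]
print(check_submission(uniform))
print(round(multiclass_log_loss([0, 3], uniform[:2]), 4))
```

A uniform prediction scores ln(5), about 1.6094, which gives a useful reference point: a model should score below that to be better than guessing.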
https://creativecommons.org/publicdomain/zero/1.0/
The dataset contains synthetic e-commerce sales data with 2,000 unique records, aimed at predicting demand levels (high/low) for products. It includes various features such as product price, promotional discounts, stock levels, weather conditions, and sales history to predict the target variable, which is a binary classification of demand (high or low). The data is intended for use in building machine learning models, specifically for demand forecasting in e-commerce. Each record represents a unique e-commerce transaction or scenario with relevant sales and environmental factors. The dataset is designed to support the analysis of factors affecting product demand and supply chain efficiency.
Column Description:
Price: The price of the product, ranging from 5 to 100 units.
Discount: Binary feature indicating whether a discount was applied (0 = no, 1 = yes).
Time_of_Day: Categorical feature representing the time of day (0 = morning, 1 = afternoon, 2 = evening).
Day_of_Week: The day of the week (0 = Monday, 6 = Sunday).
Stock_Level: The number of items available in stock, ranging from 1 to 500.
Previous_Day_Sales: The sales volume of the product on the previous day, ranging from 10 to 200 units.
Promotion: Binary feature indicating if the product was part of a promotion (0 = no, 1 = yes).
Weather: Weather condition impacting sales (0 = bad, 1 = good).
Week_of_Year: The week number in the year (1 to 52).
Product_Category: The category of the product (randomly chosen between 5 categories).
Target: The binary target variable indicating high (1) or low (0) demand for the product.
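The documented value ranges lend themselves to a simple sanity check before modelling. The sketch below is an assumption-laden illustration: the function `validate_record` and the sample record are hypothetical, and Product_Category is left unchecked because the description does not say how the five categories are encoded.

```python
def validate_record(record):
    """Return the column names that are missing or outside the documented ranges."""
    rules = {
        "Price": lambda v: 5 <= v <= 100,
        "Discount": lambda v: v in (0, 1),
        "Time_of_Day": lambda v: v in (0, 1, 2),
        "Day_of_Week": lambda v: v in range(7),
        "Stock_Level": lambda v: 1 <= v <= 500,
        "Previous_Day_Sales": lambda v: 10 <= v <= 200,
        "Promotion": lambda v: v in (0, 1),
        "Weather": lambda v: v in (0, 1),
        "Week_of_Year": lambda v: 1 <= v <= 52,
        "Target": lambda v: v in (0, 1),
    }
    return [name for name, ok in rules.items()
            if name not in record or not ok(record[name])]


# Made-up record for illustration; the values are not taken from the dataset.
sample = {
    "Price": 42.5, "Discount": 1, "Time_of_Day": 2, "Day_of_Week": 6,
    "Stock_Level": 120, "Previous_Day_Sales": 75, "Promotion": 0,
    "Weather": 1, "Week_of_Year": 30, "Target": 1,
}
print(validate_record(sample))
print(validate_record(dict(sample, Price=250, Weather=3)))
```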
Dataset Usage: This dataset is used to build and evaluate machine learning models for real-time demand prediction in e-commerce. It helps in understanding the impact of various factors like promotions, weather, and stock on product demand. The insights support better supply chain decisions, inventory management, and customer satisfaction.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Predict Sign is a dataset for object detection tasks - it contains Traffic Signs annotations for 3,680 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
**New to machine learning and data science? No question is too basic or too simple. Use this place to post any first-timer clarifying questions about the classification algorithm or the datasets.** This file contains demographics about customers and whether each customer clicked the ad or not. Use this file with a classification algorithm to predict ad clicks, with the customer demographics as the independent variables.
This data set contains the following features:
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Analyze the Stroke Prediction Dataset to predict stroke risk based on factors like age, gender, heart disease, and smoking status. Perfect for machine learning and research.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: While falls among the elderly are a public health issue because of the social, medical, and economic burden they represent, the tools to predict falls are limited. Posturography has been developed to distinguish fallers from non-fallers; however, there is too little data to show how predictions change as older adults' physical abilities improve. The Postadychute-AG clinical trial aims to evaluate the evolution of posturographic parameters in relation to the improvement of balance through adapted physical activity (APA) programs.

Methods: In this prospective, multicentre clinical trial, institutionalized seniors over 65 years of age will be followed for a period of 6 months through computer-assisted posturography and automatic gait analysis. During the entire duration of the follow-up, they will benefit from a monthly measurement of their postural and locomotion capacities through a recording of their static balance and gait, using software developed for this purpose. The data gathered will be correlated with the daily record of falls in the institution. Static and dynamic balance measurements aim to extract biomechanical markers and compare them with functional assessments of motor skills (Berg Balance Scale and Mini Motor Test), expecting their superiority in predicting the number of falls. Participants will be followed for 3 months without APA and 3 months with APA in homogeneous group exercises. An analysis of variance will evaluate the variability of monthly measures of balance in order to record the minimum clinically detectable change (MDC) as participants improve their physical condition through APA.

Discussion: Previous studies have stated the MDC through repeated measurements of balance but, to our knowledge, none appear to have implemented monthly measurements of balance and gait. Combined with a reliable measure of the number of falls per person, motor capacities, and other precipitating factors, this study aims to provide biomechanical markers predictive of fall risk, together with their sensitivity to improvement in clinical status over the medium term. This trial could provide the basis for posturographic and gait variable values for these elderly people and provide a solution, implementable in current nursing home practice, to distinguish those most at risk.

Trial Registration: ID-RCB 2017-A02545-48.

Protocol Version: Version 4.2 dated January 8, 2020.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw, original data and fits data set for "A Simple Model to Predict Future SARS-CoV-2 Infections on a National Level" by Blanco et al. in EXCEL and GraphPad Prism file formats.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Social media in general provide great opportunities for mining massive amounts of text, image, and video-based data. However, what questions can be addressed from analyzing such data? In this review, we are focusing on microblogging services and discuss applications of streaming data from the scientific literature. We will focus on text-based approaches because they represent by far the largest cohort of studies and we present a taxonomy of studied problems.
Objective: To determine whether machine learning (ML) algorithms can improve the prediction of delayed cerebral ischemia (DCI) and functional-outcomes after subarachnoid hemorrhage (SAH). Methods: ML models and standard models (SM) were trained to predict DCI and functional-outcomes with data collected within 3 days of admission. Functional-outcomes at discharge and at 3-months were quantified using the modified Rankin scale (mRS) for neurological disability (dichotomized as ‘good’ (mRS≤3) vs ‘bad’ (mRS≥4) outcomes). Concurrently, clinicians prospectively prognosticated 3-month outcomes of patients. The performance of ML, SM and clinicians are retrospectively compared. Results: DCI status, discharge, and 3-month outcomes were available for 399, 393 and 240 subjects respectively. Prospective clinician (an attending, a fellow and a nurse) prognostication of 3-month outcomes was available for 90 subjects. ML models yielded predictions with the following AUC (area under the receiver o...
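The dichotomization rule and the AUC metric mentioned in this abstract can be illustrated with a short sketch. It is not the study's code: the function names are hypothetical, the example labels and scores are invented, and the AUC is computed with the pairwise-comparison definition rather than whatever software the authors used.

```python
def dichotomize_mrs(score):
    """Map a modified Rankin scale score (0-6) to 'good' (<= 3) or 'bad' (>= 4)."""
    if not 0 <= score <= 6:
        raise ValueError("mRS scores range from 0 to 6")
    return "good" if score <= 3 else "bad"


def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented outcomes (1 = bad outcome) and model scores, for illustration only.
print([dichotomize_mrs(s) for s in (0, 3, 4, 6)])
print(auc([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6]))
```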
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In recent times, it has been observed that social media exerts a favorable influence on consumer purchasing behavior. Many organizations are adopting the utilization of social media platforms as a means to promote products and services. Hence, it is crucial for enterprises to understand the consumer buying behavior in order to thrive. This article presents a novel approach that combines the theory of planned behavior (TPB) with machine learning techniques to develop accurate predictive models for consumer purchase behavior. This study examines three distinct factors of the theory of planned behavior (attitude, social norm, and perceived behavioral control) that provide insights into the primary determinants influencing online purchasing behavior. A total of eight machine learning algorithms, namely K-nearest neighbor, Decision Tree, Random Forest, Logistic Regression, Naive Bayes, Support Vector Machine, AdaBoost, and Gradient Boosting, were utilized in order to forecast consumer purchasing behavior. Empirical findings indicate that gradient boosting demonstrates superior performance in predicting customer buying behavior, with an accuracy rate of 0.91 and a macro F1 score of 0.91. This holds true when all factors, namely attitude (ATTD), social norm (SN), and perceived behavioral control (PBC), are included in the analysis. Furthermore, we incorporated Explainable AI (XAI), specifically LIME (Local Interpretable Model-Agnostic Explanations), to elucidate how the best machine learning model (i.e. gradient boosting) makes its prediction. The findings indicate that LIME has demonstrated a high level of confidence in accurately predicting the influence of low and high behavior. The outcome presented in this article has several implications. For instance, this article presents a novel way to combine the theory of planned behavior with machine learning techniques in order to predict consumer purchase behavior. 
This integration allows for a comprehensive analysis of factors influencing online purchasing decisions. Also, the incorporation of Explainable AI enhances the transparency and interpretability of the model. This feature is valuable for organizations seeking insights into factors driving predictions and the reasons behind certain outcomes. Moreover, these observations have the potential to offer valuable insights for businesses in customizing their marketing strategies to align with these influential factors.
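The abstract reports both accuracy and a macro F1 score. For readers unfamiliar with the latter, the following is a minimal sketch of how macro F1 is computed: an unweighted mean of per-class F1 scores. The labels in the example are invented and are not data from the study.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, with F1 = 2TP / (2TP + FP + FN)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for cls in classes:
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Invented labels: per-class F1 is 2/3 for "high" and 4/5 for "low".
y_true = ["high", "high", "low", "low"]
y_pred = ["high", "low", "low", "low"]
print(round(macro_f1(y_true, y_pred), 4))
```

Because every class contributes equally, macro F1 does not let a large class mask poor performance on a small one.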
https://spdx.org/licenses/CC0-1.0.html
Theory suggests that networks of neurons may predict their input. Prediction may underlie most aspects of information processing, and is believed to be involved in motor and cognitive control and decision making. Retinal cells have been shown to be capable of predicting visual stimuli, and there is some evidence for prediction of input in the visual cortex and hippocampus. However, there is no proof that the ability to predict is a generic feature of neural networks. We investigated whether random in vitro neuronal networks can predict stimulation, and how prediction is related to short- and long-term memory. To answer these questions, we applied two different stimulation modalities. Focal electrical stimulation has been shown to induce long-term memory traces, whereas global optogenetic stimulation did not. We used mutual information to quantify how much activity recorded from these networks reduces the uncertainty of upcoming stimuli (prediction) or recent past stimuli (short-term memory). Cortical neural networks did predict future stimuli, with the majority of all predictive information provided by the immediate network response to the stimulus. Interestingly, prediction strongly depended on short-term memory of recent sensory inputs during focal as well as global stimulation. However, prediction required less short-term memory during focal stimulation. Furthermore, the dependency on short-term memory decreased during 20 h of focal stimulation, when long-term connectivity changes were induced. These changes are fundamental for long-term memory formation, suggesting that, besides short-term memory, the formation of long-term memory traces may play a role in efficient prediction.
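The abstract uses mutual information to quantify how much recorded activity reduces uncertainty about a stimulus. As a generic illustration only, the sketch below computes a plug-in (frequency-count) estimate of mutual information for paired discrete observations. The binary sequences are invented, and the study's actual estimator, binning, and bias corrections are not described here and may differ.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete observations."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Invented binary sequences: activity that mirrors the stimulus vs. unrelated activity.
stimulus = [0, 1, 0, 1]
print(mutual_information(stimulus, [0, 1, 0, 1]))  # activity identical to stimulus
print(mutual_information([0, 0, 1, 1], stimulus))  # activity unrelated to stimulus
```

In the first case the activity removes all uncertainty about a binary stimulus (1 bit); in the second it removes none (0 bits).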
ABSTRACT This study aimed to determine the length-weight relationship and mathematical models to predict dressed and fillet weight and yield and fillet composition of wild traíra, Hoplias malabaricus (Bloch, 1794). A total of 80 marketable-sized fish from 292.28 to 2879.57 g and 32.06 to 61.19 cm were used. The length:weight ratio was estimated using the equation $W = a \times L^{b}$, in which $W$ is body weight (g) and $L$ is length (cm). The models of dressed and fillet weight and yield and body were elaborated using first-order ($\hat{y}_i = \beta_0 + \beta_1 x_i$) or second-order ($\hat{y}_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2$) linear regression analyses. The value of slope $b$ in the length:weight ratio was 3.3732 and intercept was 0.0029. The prediction equations obtained for dressed weight, fillet weight, dressed yield, fillet yield, fillet gross energy, moisture, crude protein, crude lipid, and ash were, respectively: $\hat{y} = 0.3244 + 0.9373W$, $\hat{y} = 0.7651 + 0.4181W$, $\hat{y} = 939.8015 + 0.0019W$, $\hat{y} = 420.55170 + 0.0064W$, $\hat{y} = 997.9600 + 0.0630W$, $\hat{y} = 810.6500 - 0.0085W$, $\hat{y} = 184.080 - 0.0111W$, $\hat{y} = 3.1131 + 0.0049W$, and $\hat{y} = 10.6110 + 0.0009W$, in which $W$ is the body weight of fish (g). We demonstrated the possibility of elaborating realistic expressions to describe degutted weight, fillet weight, and fillet composition. However, lower mathematical adjustment was observed to estimate realistic prediction of dressed and fillet yield.
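The reported equations can be applied directly. The sketch below assumes that the reported intercept 0.0029 is the coefficient a and the slope 3.3732 is the exponent b in the power-law length-weight relationship, and that the dressed and fillet weight predictions are in grams; the function names are hypothetical.

```python
def body_weight_g(length_cm, a=0.0029, b=3.3732):
    """Length-weight relationship W = a * L**b (assumes a is the reported intercept)."""
    return a * length_cm ** b


def dressed_weight_g(weight_g):
    """Reported first-order model for dressed weight: 0.3244 + 0.9373 * W."""
    return 0.3244 + 0.9373 * weight_g


def fillet_weight_g(weight_g):
    """Reported first-order model for fillet weight: 0.7651 + 0.4181 * W."""
    return 0.7651 + 0.4181 * weight_g


# 45 cm is an arbitrary length inside the sampled range of 32.06 to 61.19 cm.
w = body_weight_g(45.0)
print(round(w, 1), round(dressed_weight_g(w), 1), round(fillet_weight_g(w), 1))
```

The equations were fitted on fish between 32.06 and 61.19 cm, so predictions outside that range are extrapolations.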
https://www.kappasignal.com/p/legal-disclaimer.html
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.
Historical daily stock prices (open, high, low, close, volume)
Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)
Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)
Feature engineering based on financial data and technical indicators
Sentiment analysis data from social media and news articles
Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)
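Of the technical indicators listed above, the moving average is the simplest to derive from daily closing prices. The following is a minimal sketch of a simple moving average; the function name and the closing prices are made up for illustration and do not reflect how this dataset computes its features.

```python
def simple_moving_average(values, window):
    """Return the mean of each consecutive run of `window` values."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Made-up closing prices for illustration.
closes = [10, 11, 12, 13, 14]
print(simple_moving_average(closes, 3))
```

The output has `len(values) - window + 1` entries, because the first full window only becomes available at the third price in this example.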
Stock price prediction
Portfolio optimization
Algorithmic trading
Market sentiment analysis
Risk management
Researchers investigating the effectiveness of machine learning in stock market prediction
Analysts developing quantitative trading Buy/Sell strategies
Individuals interested in building their own stock market prediction models
Students learning about machine learning and financial applications
The dataset may include different levels of granularity (e.g., daily, hourly)
Data cleaning and preprocessing are essential before model training
Regular updates are recommended to maintain the accuracy and relevance of the data