100+ datasets found
  1. E-Commerce Price Prediction Challenge

    • zenodo.org
    Updated May 21, 2024
    Cite
    None; None (2024). E-Commerce Price Prediction Challenge [Dataset]. http://doi.org/10.5281/zenodo.11237099
    Dataset updated
    May 21, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    None; None
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Problem Statement

    Akshat is new to the market and unaware of the prices of many products. He wants to be sure of a price before making any purchase. Help Akshat predict product prices and draw some conclusions from the data!

    What's in there!?

    The dataset includes products, images, dimensions, prices, ratings, and more for feature engineering.

    What is the Inspiration from this!?

    This type of business problem is typical for any new product launched into the market. Sites like Trivago and Policy Bazaar compare the same product across multiple sites to arrive at a representative price. Can you achieve the same!?

  2. HackerEarth Machine Learning challenge: Predict the price for Good Friday...

    • ieee-dataport.org
    Updated May 19, 2020
    Cite
    Siddharth Kekre (2020). HackerEarth Machine Learning challenge: Predict the price for Good Friday gifts [Dataset]. https://ieee-dataport.org/documents/hackerearth-machine-learning-challenge-predict-price-good-friday-gifts
    Dataset updated
    May 19, 2020
    Authors
    Siddharth Kekre
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset consists of the following columns: Data description

  3. Social Media Prediction Challenge

    • kaggle.com
    Updated May 19, 2023
    Cite
    Gaurav Dutta (2023). Social Media Prediction Challenge [Dataset]. https://www.kaggle.com/datasets/gauravduttakiit/social-media-prediction-challenge
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 19, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Gaurav Dutta
    Description

    The objective of this competition is to create a model to predict the number of retweets a tweet will get on Twitter. The data used to train the model will be approximately 2,400 tweets each from 38 major banks and mobile network operators across Africa.

    A machine learning model to predict retweets would be valuable to any business that uses social media to share important information and messages to the public. This model can be used as a tool to help businesses better tailor their tweets to ensure maximum impact and outreach to clients and non-clients.

    The data has been split into a test and training set.

    train.json (zipped) is the dataset that you will use to train your model. This dataset includes about 2,400 consecutive tweets from each of the companies listed below, for a total of 96,562 tweets.

    test_questions.json (zipped) is the dataset to which you will apply your model to test how well it performs. Use your model and this dataset to predict the number of retweets a tweet will receive. The test set consists of the consecutive tweets that followed the tweets provided in the training set. There are a maximum of 800 tweets per company in this test set. This dataset includes the same fields as train.json except for the retweet_count and favorite_count variables.

    sample_submission.csv is a table to provide an example of what your submission file should look like.

  4. Predict Future Sales Supplementary

    • kaggle.com
    Updated May 10, 2018
    Cite
    Kazım Anıl Eren (2018). Predict Future Sales Supplementary [Dataset]. https://www.kaggle.com/kazimanil/predict-future-sales-supplementary/code
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 10, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kazım Anıl Eren
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Kaggle Challenge: Predict Future Sales.

    This dataset publishes the files that I will use in the Kaggle challenge called Predict Future Sales.

    Data

    • I downloaded the test and train data from the competition webpage.
    • I downloaded shop and item information from the English translations by @deargle (from this post), then made some changes to the data, described in this R file.
    • I collected historical USD/RUB rates from Investing.com. For days without rate information (i.e. Saturdays and Sundays, when markets are closed), I used the most recent available rate.
    • I prepared a calendar of public holidays and weekends. Public holiday information for Russia was collected from this site.
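
The weekend handling described in the list above (reusing the most recent rate on days with no quote) amounts to a forward fill. The following is a minimal sketch of that idea, not the author's R code; `forward_fill_rates` is a hypothetical helper name and the rates are made-up illustration values, not figures from the dataset.

```python
from datetime import date, timedelta


def forward_fill_rates(rates, start, end):
    """Return a rate for every calendar day in [start, end].

    Days without a quote (e.g. weekends) reuse the most recent earlier rate.
    Days before the first known quote are left as None.
    """
    filled = {}
    last = None
    day = start
    while day <= end:
        if day in rates:
            last = rates[day]
        filled[day] = last
        day += timedelta(days=1)
    return filled


# Illustrative values only: a Friday quote and a Monday quote, weekend missing.
quotes = {date(2015, 1, 9): 61.5, date(2015, 1, 12): 62.7}
daily = forward_fill_rates(quotes, date(2015, 1, 9), date(2015, 1, 12))
```

Here the Saturday and Sunday entries of `daily` take the Friday value, which matches the rule stated in the list.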
  5. Comparison results of different model.

    • plos.figshare.com
    xls
    Updated Dec 8, 2023
    + more versions
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). Comparison results of different model. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t006
    Available download formats: xls
    Dataset updated
    Dec 8, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Ke Peng; Yan Peng; Wenguang Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In recent years, with the continuous improvement of the financial system and the rapid development of the banking industry, the competition of the banking industry itself has intensified. At the same time, with the rapid development of information technology and Internet technology, customers’ choice of financial products is becoming more and more diversified, and customers’ dependence and loyalty to banking institutions is becoming less and less, and the problem of customer churn in commercial banks is becoming more and more prominent. How to predict customer behavior and retain existing customers has become a major challenge for banks to solve. Therefore, this study takes a bank’s business data on Kaggle platform as the research object, uses multiple sampling methods to compare the data for balancing, constructs a bank customer churn prediction model for churn identification by GA-XGBoost, and conducts interpretability analysis on the GA-XGBoost model to provide decision support and suggestions for the banking industry to prevent customer churn. The results show that: (1) The applied SMOTEENN is more effective than SMOTE and ADASYN in dealing with the imbalance of banking data. (2) The F1 and AUC values of the model improved and optimized by XGBoost using genetic algorithm can reach 90% and 99%, respectively, which are optimal compared to other six machine learning models. The GA-XGBoost classifier was identified as the best solution for the customer churn problem. (3) Using Shapley values, we explain how each feature affects the model results, and analyze the features that have a high impact on the model prediction, such as the total number of transactions in the past year, the amount of transactions in the past year, the number of products owned by customers, and the total sales balance. 
    The contribution of this paper is mainly in two aspects: (1) this study can provide useful information from the black box model based on the accurate identification of churned customers, which can provide reference for commercial banks to improve their service quality and retain customers; (2) it can provide reference for customer churn early warning models of other related industries, which can help the banking industry to maintain customer stability, maintain market position and reduce corporate losses.

  6. Dataset for 2nd CMI-PB Vaccine Response Prediction Challenge

    • zenodo.org
    zip
    Updated Mar 4, 2025
    Cite
    Pramod Shinde; Pramod Shinde (2025). Dataset for 2nd CMI-PB Vaccine Response Prediction Challenge [Dataset]. http://doi.org/10.5281/zenodo.14968773
    Available download formats: zip
    Dataset updated
    Mar 4, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pramod Shinde; Pramod Shinde
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The repository includes the datasets used to organise CMI-PB 2nd Challenge. Read more here: https://www.cmi-pb.org/blog/learn-about-project/#A%20community%20prediction%20challenge

  7. HackerEarth's Churn Risk Rate Challenge

    • kaggle.com
    Updated Mar 20, 2021
    Cite
    Rishabh Sethia (2021). HackerEarth's Churn Risk Rate Challenge [Dataset]. https://www.kaggle.com/rishabh6377/hackerearths-churn-risk-rate-challenge
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 20, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rishabh Sethia
    Description

    Context

    This data is related to HackerEarth's Customer Churn Rate Prediction Challenge

    Content

    Contains 3 files. For more information about a file, click on it.

    Acknowledgements

    HackerEarth

  8. The signaling response prediction challenge solicited predictions of the...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 1, 2023
    Cite
    Robert J. Prill; Daniel Marbach; Julio Saez-Rodriguez; Peter K. Sorger; Leonidas G. Alexopoulos; Xiaowei Xue; Neil D. Clarke; Grégoire Altan-Bonnet; Gustavo Stolovitzky (2023). The signaling response prediction challenge solicited predictions of the concentrations of 17 phosphoproteins and 20 cytokines. [Dataset]. http://doi.org/10.1371/journal.pone.0009202.t001
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Robert J. Prill; Daniel Marbach; Julio Saez-Rodriguez; Peter K. Sorger; Leonidas G. Alexopoulos; Xiaowei Xue; Neil D. Clarke; Grégoire Altan-Bonnet; Gustavo Stolovitzky
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data set underlying this challenge consisted of phosphoprotein and cytokine concentrations in response to 49 combinatoric perturbations of seven protein-specific inhibitors and seven stimuli.

  9. Buyer's Time Prediction Challenge

    • kaggle.com
    zip
    Updated Dec 18, 2020
    Cite
    Mohd Aquib (2020). Buyer's Time Prediction Challenge [Dataset]. https://www.kaggle.com/aquib5559/buyers-time-prediction-challenge
    Available download formats: zip (296363 bytes)
    Dataset updated
    Dec 18, 2020
    Authors
    Mohd Aquib
    License

    http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html

    Description

    The Dataset is from Machine Hack.

    Buyers spend a significant amount of time browsing e-commerce stores, and since the pandemic e-commerce has seen a boom in the number of users across domains. In the meantime, store owners are planning to attract customers by using various algorithms that leverage customer behavior patterns.

    Tracking customer activity is also a great way of understanding customer behavior and figuring out what can be done to serve customers better. Machine learning and AI have already played a significant role in designing recommendation engines that attract customers by predicting their buying patterns.

    Dataset Description:

    • Train.json - 5429 rows x 9 columns (Includes time_spent Column as Target variable)
    • Test.json - 2327 rows x 8 columns
    • Sample Submission.csv - Please check the Evaluation section for more details on how to generate a valid submission

    Attribute Description:

    • session_id - Unique identifier for every row
    • session_number - Session type identifier
    • client_agent - Client-side software details
    • device_details - Client-side device details
    • date - Datestamp of the session
    • purchased - Binary value for any purchase done
    • added_in_cart - Binary value for cart activity
    • checked_out - Binary value for checking out successfully
    • time_spent - Total time spent in seconds (Target Column)
  10. Prediction errors for spatial application.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Icíar Civantos-Gómez; Javier García-Algarra; David García-Callejas; Javier Galeano; Oscar Godoy; Ignasi Bartomeus (2023). Prediction errors for spatial application. [Dataset]. http://doi.org/10.1371/journal.pcbi.1008906.t002
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS Computational Biology
    Authors
    Icíar Civantos-Gómez; Javier García-Algarra; David García-Callejas; Javier Galeano; Oscar Godoy; Ignasi Bartomeus
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Prediction errors for spatial application.

  11. ExploreSA The Gawler Challenge - winners and submission data release -...

    • catalog.sarig.sa.gov.au
    Updated Mar 28, 2025
    Cite
    catalog.sarig.sa.gov.au (2025). ExploreSA The Gawler Challenge - winners and submission data release - Dataset - SARIG catalogue [Dataset]. https://catalog.sarig.sa.gov.au/dataset/mesac792
    Dataset updated
    Mar 28, 2025
    Dataset provided by
    Government of South Australia (http://sa.gov.au/)
    Area covered
    Gawler
    Description

    ExploreSA: The Gawler Challenge is a global online competition from the Government of South Australia. The challenge is to identify or predict areas of potential mineralisation within the Gawler region, using any technique. This dataset contains a list of all team submissions, with links to video pitch, submitted data packages and highlighting all winners in each category for the Unearthed ExploreSA The Gawler challenge: https://unearthed.solutions/u/challenge/gawler-challenge.

  12. Data from: YJMob100K: City-Scale and Longitudinal Dataset of Anonymized...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 21, 2024
    Cite
    Shimizu, Toru (2024). YJMob100K: City-Scale and Longitudinal Dataset of Anonymized Human Mobility Trajectories [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8111992
    Dataset updated
    Apr 21, 2024
    Dataset provided by
    Tsubouchi, Kota
    Shimizu, Toru
    Sekimoto, Yoshihide
    Sezaki, Kaoru
    Yabe, Takahiro
    Pentland, Alex
    Moro, Esteban
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The YJMob100K human mobility datasets (YJMob100K_dataset1.csv.gz and YJMob100K_dataset2.csv.gz) contain the movement of a total of 100,000 individuals across a 75-day period, discretized into 30-minute intervals and 500-meter grid cells. The first dataset contains the movement of 80,000 individuals across a 75-day business-as-usual period, while the second dataset contains the movement of 20,000 individuals across a 75-day period (including the last 15 days during an emergency) with unusual behavior.

    While the name or location of the city is not disclosed, the participants are provided with points-of-interest (POIs; e.g., restaurants, parks) data for each grid cell (~85 dimensional vector) as supplementary information (cell_POIcat.csv.gz). The list of 85 POI categories can be found in POI_datacategories.csv.

    For details of the dataset, see Data Descriptor:

    Yabe, T., Tsubouchi, K., Shimizu, T., Sekimoto, Y., Sezaki, K., Moro, E., & Pentland, A. (2024). YJMob100K: City-scale and longitudinal dataset of anonymized human mobility trajectories. Scientific Data, 11(1), 397. https://www.nature.com/articles/s41597-024-03237-9

    --- Details about the Human Mobility Prediction Challenge 2023 (ended November 13, 2023) ---

    The challenge takes place in a mid-sized and highly populated metropolitan area, somewhere in Japan. The area is divided into 500 meters x 500 meters grid cells, resulting in a 200 x 200 grid cell space.

    The human mobility datasets (task1_dataset.csv.gz and task2_dataset.csv.gz) contain the movement of a total of 100,000 individuals across a 90 day period, discretized into 30-minute intervals and 500 meter grid cells. The first dataset contains the movement of a 75 day business-as-usual period, while the second dataset contains the movement of a 75 day period during an emergency with unusual behavior.

    There are 2 tasks in the Human Mobility Prediction Challenge.

    In Task 1, participants are provided with the full time series data (75 days) for 80,000 individuals, and partial (only 60 days) time series movement data for the remaining 20,000 individuals (task1_dataset.csv.gz). Given the provided data, Task 1 of the challenge is to predict the movement patterns of those 20,000 individuals during days 60-74. Task 2 is a similar task but uses a smaller dataset of 25,000 individuals in total, 2,500 of which have the locations during days 60-74 masked and need to be predicted (task2_dataset.csv.gz).
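
The discretization described above fixes the shape of the prediction problem: 30-minute intervals give 48 slots per day, and 75 days give 3,600 time steps per individual, of which days 60-74 are masked for the individuals to be predicted. The sketch below only works through that arithmetic. The 0-based day and slot numbering is an assumption inferred from the phrase "days 60-74", and the function names are hypothetical, not part of the dataset.

```python
SLOTS_PER_DAY = 24 * 60 // 30   # 30-minute intervals -> 48 slots per day
NUM_DAYS = 75                   # days 0..74 (assumed 0-based)
GRID_SIZE = 200                 # 200 x 200 grid cells
CELL_METERS = 500               # each cell is 500 m x 500 m


def time_index(day, slot):
    """Flatten (day, slot) into a single 0-based time-step index."""
    if not (0 <= day < NUM_DAYS and 0 <= slot < SLOTS_PER_DAY):
        raise ValueError("day or slot out of range")
    return day * SLOTS_PER_DAY + slot


def is_masked_day(day, first_masked=60, last_masked=74):
    """True for the days whose locations must be predicted."""
    return first_masked <= day <= last_masked


total_steps = NUM_DAYS * SLOTS_PER_DAY
masked_steps = sum(SLOTS_PER_DAY for d in range(NUM_DAYS) if is_masked_day(d))
side_length_km = GRID_SIZE * CELL_METERS / 1000
```

Under these assumptions each masked individual has 720 time steps to predict, and the study area is a square 100 km on a side.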

    While the name or location of the city is not disclosed, the participants are provided with points-of-interest (POIs; e.g., restaurants, parks) data for each grid cell (~85 dimensional vector) as supplementary information (which is optional for use in the challenge) (cell_POIcat.csv.gz).

    For more details, see https://connection.mit.edu/humob-challenge-2023

  13. Predicting Paroxysmal Atrial Fibrillation/Flutter: The PhysioNet/Computing...

    • physionet.org
    Updated Mar 1, 2001
    Cite
    George Moody (2001). Predicting Paroxysmal Atrial Fibrillation/Flutter: The PhysioNet/Computing in Cardiology Challenge 2001 [Dataset]. https://physionet.org/challenge/2001/
    Dataset updated
    Mar 1, 2001
    Authors
    George Moody
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    Following the success of the first Computers in Cardiology Challenge, we are pleased to offer a new challenge from PhysioNet and Computers in Cardiology 2001. The challenge is to develop a fully automated method to predict the onset of paroxysmal atrial fibrillation/flutter (PAF), based on the ECG prior to the event. The goal of the contest is to stimulate effort and advance the state of the art in this clinically significant problem, and to foster both friendly competition and wide-ranging collaborations.

  14. Data from: Deep Learning-Based Conformal Prediction of Toxicity

    • acs.figshare.com
    xlsx
    Updated Jun 1, 2023
    Cite
    Jin Zhang; Ulf Norinder; Fredrik Svensson (2023). Deep Learning-Based Conformal Prediction of Toxicity [Dataset]. http://doi.org/10.1021/acs.jcim.1c00208.s002
    Available download formats: xlsx
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    ACS Publications
    Authors
    Jin Zhang; Ulf Norinder; Fredrik Svensson
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Predictive modeling for toxicity can help reduce risks in a range of applications and potentially serve as the basis for regulatory decisions. However, the utility of these predictions can be limited if the associated uncertainty is not adequately quantified. With recent studies showing great promise for deep learning-based models also for toxicity predictions, we investigate the combination of deep learning-based predictors with the conformal prediction framework to generate highly predictive models with well-defined uncertainties. We use a range of deep feedforward neural networks and graph neural networks in a conformal prediction setting and evaluate their performance on data from the Tox21 challenge. We also compare the results from the conformal predictors to those of the underlying machine learning models. The results indicate that highly predictive models can be obtained that result in very efficient conformal predictors even at high confidence levels. Taken together, our results highlight the utility of conformal predictors as a convenient way to deliver toxicity predictions with confidence, adding both statistical guarantees on the model performance as well as better predictions of the minority class compared to the underlying models.
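
To make the conformal prediction setting above concrete, here is a minimal sketch of inductive conformal classification with class-conditional (Mondrian-style) calibration. It is not the authors' pipeline: the nonconformity score (one minus the predicted probability of the candidate label), the class-conditional calibration, and all numbers are assumptions chosen for illustration.

```python
def p_value(cal_scores, score):
    """Share of calibration scores at least as nonconforming as `score`.

    The +1 terms count the new sample itself, as is usual in conformal prediction.
    """
    at_least = sum(1 for s in cal_scores if s >= score)
    return (at_least + 1) / (len(cal_scores) + 1)


def prediction_set(probs, cal_scores_by_class, significance):
    """Return every label whose p-value exceeds the significance level.

    probs: dict mapping label -> model probability for the new sample
    cal_scores_by_class: dict mapping label -> nonconformity scores
        (1 - probability of the true label) of calibration samples
        whose true label is that class
    """
    labels = []
    for label, prob in probs.items():
        score = 1.0 - prob
        if p_value(cal_scores_by_class[label], score) > significance:
            labels.append(label)
    return labels


# Made-up calibration scores and model output, for illustration only.
calibration = {
    "toxic": [0.1, 0.3, 0.6, 0.8],
    "nontoxic": [0.05, 0.1, 0.2, 0.4],
}
result = prediction_set({"toxic": 0.05, "nontoxic": 0.95}, calibration, 0.25)
```

A prediction set can contain one label, both labels, or none; lowering the significance level (that is, raising the confidence level) makes sets larger.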

  15. Data from: Predicting ABM Results with Covering Arrays and Random Forests

    • catalog.data.gov
    • data.nist.gov
    Updated Dec 15, 2023
    + more versions
    Cite
    National Institute of Standards and Technology (2023). Predicting ABM Results with Covering Arrays and Random Forests [Dataset]. https://catalog.data.gov/dataset/predicting-abm-results-with-covering-arrays-and-random-forests-d10a4
    Dataset updated
    Dec 15, 2023
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    Our goal is to explore the feasibility and usefulness of using a combination of covering arrays and machine learning models for predicting results of an agent-based simulation model within the vast parameter value combination space. The challenge is to select parameter values that are representative of the overall behavior of the model, so that we can train the machine learning model to be able to correctly predict behavior on previously untested areas of the parameter space. We have chosen Wilensky's Heat Bugs model in NetLogo for our study. It is a simple model, amenable to quick data generation, with a limited number of outputs to predict, and with emergent behavior. This model therefore allows exploration of this new approach. We utilize covering arrays to reduce the parameter value space systematically, run the model for each parameter set in the 2-way and 3-way covering arrays, train a random forest model on the 2-way data (33,351 parameter combinations), and test its ability to predict the outcome of the simulation on the significantly larger 3-way data that was not seen during the training of the model (3,971,955 parameter combinations).
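
A t-way covering array, as used above, is a set of parameter settings in which every combination of values for any t parameters appears in at least one row. The sketch below checks that property for a toy example; the three binary parameters are illustrative and unrelated to the Heat Bugs parameters, and `is_covering_array` is a hypothetical helper name.

```python
from itertools import combinations, product


def is_covering_array(rows, levels, strength):
    """True if every `strength`-way combination of values appears in some row.

    rows: list of tuples, one value per parameter
    levels: list of allowed-value lists, one per parameter
    """
    for cols in combinations(range(len(levels)), strength):
        seen = {tuple(row[c] for c in cols) for row in rows}
        needed = set(product(*(levels[c] for c in cols)))
        if not needed <= seen:
            return False
    return True


levels = [[0, 1], [0, 1], [0, 1]]
two_way = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
exhaustive = list(product(*levels))
```

In this toy case four rows cover all pairs of values while the exhaustive design needs eight, which is the same kind of reduction the description relies on at a much larger scale.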

  16. Data from: The Role of Governmental Weapons Procurements in Forecasting...

    • search.dataone.org
    Updated Nov 12, 2023
    Cite
    Cornelius Fritz; Marius Mehrl; Paul Thurner; Goran Kauermann (2023). The Role of Governmental Weapons Procurements in Forecasting Monthly Fatalities in Intrastate Conflicts: A Semiparametric Hierarchical Hurdle Model [Dataset]. http://doi.org/10.7910/DVN/AUNPTZ
    Dataset updated
    Nov 12, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Cornelius Fritz; Marius Mehrl; Paul Thurner; Goran Kauermann
    Description

    Accurate and interpretable forecasting models predicting spatially and temporally fine-grained changes in the numbers of intrastate conflict casualties are of crucial importance for policymakers and international non-governmental organisations (NGOs). Using a count data approach, we propose a hierarchical hurdle regression model to address the corresponding prediction challenge at the monthly PRIO-grid level. More precisely, we model the intensity of local armed conflict at a specific point in time as a three-stage process. Stages one and two of our approach estimate whether we will observe any casualties at the country- and grid-cell-level, respectively, while stage three applies a regression model for truncated data to predict the number of such fatalities conditional upon the previous two stages. Within this modelling framework, we focus on the role of governmental arms imports as a processual factor allowing governments to intensify or deter from fighting. We further argue that a grid cell's geographic remoteness is bound to moderate the effects of these military buildups. Out-of-sample predictions corroborate the effectiveness of our parsimonious and theory-driven model, which enables full transparency combined with accuracy in the forecasting process.
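
The three-stage structure described above can be read as a product: the probability of any casualties in the country, times the probability of any casualties in the grid cell given the country stage, times the expected count given that the count is positive. The sketch below illustrates only that combination. The zero-truncated Poisson used for stage three is an assumption standing in for the paper's "regression model for truncated data", and the function names and numbers are hypothetical.

```python
import math


def zero_truncated_poisson_mean(lam):
    """Mean of a Poisson(lam) variable conditioned on being at least 1."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return lam / (1.0 - math.exp(-lam))


def expected_fatalities(p_country, p_cell_given_country, lam):
    """Combine the three stages into one expected count for a grid cell.

    p_country: stage 1, probability of any fatalities in the country
    p_cell_given_country: stage 2, probability of any fatalities in the
        cell, given that the country has some
    lam: rate of the assumed Poisson count model behind stage 3
    """
    for p in (p_country, p_cell_given_country):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_country * p_cell_given_country * zero_truncated_poisson_mean(lam)


# Illustrative inputs only.
estimate = expected_fatalities(0.5, 0.2, 2.0)
```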

  17. Data from: Short-term prediction through ordinal patterns

    • data.niaid.nih.gov
    • datadryad.org
    • +1more
    zip
    Updated Jan 20, 2021
    Cite
    Yair Neuman; Yair Neuman; Yohai Cohen; Boaz Tamir (2021). Short-term prediction through ordinal patterns [Dataset]. http://doi.org/10.5061/dryad.vq83bk3r9
    Available download formats: zip
    Dataset updated
    Jan 20, 2021
    Dataset provided by
    Gilasio Coding (Israel)
    Ben-Gurion University of the Negev
    Authors
    Yair Neuman; Yair Neuman; Yohai Cohen; Boaz Tamir
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Prediction in natural environments is a challenging task, and there is a lack of clarity around how a myopic organism can make short-term predictions given limited data availability and cognitive resources. In this context, we may ask what kind of resources are available to the organism to help it address the challenge of short-term prediction within its own cognitive limits. We point to one potentially important resource: ordinal patterns, which are extensively used in physics but not in the study of cognitive processes. We explain the potential importance of ordinal patterns for short-term prediction, and how natural constraints imposed through (1) ordinal pattern types, (2) their transition probabilities and (3) their irreversibility signature may support short-term prediction. Having tested these ideas on a massive data set of Bitcoin prices representing a highly fluctuating environment, we provide preliminary empirical support showing how organisms characterized by bounded rationality may generate short-term predictions by relying on ordinal patterns.
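
An ordinal pattern replaces a short window of values with the rank order of those values, and transition probabilities describe how often one pattern follows another. The following is a minimal sketch of both ideas, assuming windows of length 3 and a made-up price series; it is not the code distributed with the dataset.

```python
from collections import Counter


def ordinal_pattern(window):
    """Rank order of a window, e.g. (10, 30, 20) -> (0, 2, 1).

    Ties are broken by position: the earlier value gets the lower rank.
    """
    order = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return tuple(ranks)


def pattern_sequence(series, length=3):
    """Ordinal patterns of all overlapping windows of the given length."""
    return [
        ordinal_pattern(series[i:i + length])
        for i in range(len(series) - length + 1)
    ]


def transition_probabilities(patterns):
    """Estimate P(next pattern | current pattern) from consecutive pairs."""
    pair_counts = Counter(zip(patterns, patterns[1:]))
    from_counts = Counter(patterns[:-1])
    return {pair: n / from_counts[pair[0]] for pair, n in pair_counts.items()}


prices = [1, 3, 5, 4, 2, 6, 8]          # illustrative values only
patterns = pattern_sequence(prices)
transitions = transition_probabilities(patterns)
```

With a long series, the most probable successor of the current pattern gives a simple short-term prediction of the direction of the next move.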

    Methods: The data file holds 60,000 samples, each covering 62 minutes of trade prices from the Bitcoin exchange Bitstamp, in permutation form.

    The readme files contain the explanation of the code for the article.

  18. Data from: Thermal performance under constant temperatures can accurately...

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +2more
    zip
    Updated Jun 1, 2021
    Cite
    Loke von Schmalensee; Katrín Hulda Gunnarsdóttir; Joacim Näslund; Karl Gotthard; Philipp Lehmann (2021). Thermal performance under constant temperatures can accurately predict insect development times across naturally variable microclimates [Dataset]. http://doi.org/10.5061/dryad.gtht76hm5
    Available download formats: zip
    Dataset updated
    Jun 1, 2021
    Dataset provided by
    Stockholm University
    Authors
    Loke von Schmalensee; Katrín Hulda Gunnarsdóttir; Joacim Näslund; Karl Gotthard; Philipp Lehmann
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    External conditions can drive biological rates in ectotherms by directly influencing body temperatures. While estimating the temperature dependence of performance-traits such as growth and development rate is feasible under controlled laboratory settings, predictions in nature are difficult. One major challenge lies in translating performance under constant conditions to fluctuating environments. Using the butterfly Pieris napi as model system, we show that development rate, an important fitness trait, can be accurately predicted in the field using models parameterized under constant laboratory temperatures. Additionally, using a factorial design, we show that accurate predictions can be made across microhabitats, but critically hinge on adequate consideration of nonlinearity in reaction norms, spatial heterogeneity in microclimate, and temporal variation in temperature. Our empirical results are also supported by a comparison of published and simulated data. Conclusively, our combined results suggest that, discounting direct effects of temperature, insect development rates are generally unaffected by thermal fluctuations.

    Methods: Thermal performance in development rate was measured at 8 constant temperatures in the butterfly Pieris napi. Measurements were made for eggs and larvae separately, as well as for the full ontogenetic development between oviposition and pupation (eggs and larvae combined). Thermal performance curves were fit to the data. Prediction models were parameterized based on this data, and validated through field transplants. For the field transplants, microclimate temperatures were frequently sampled at multiple sites. These temperatures were used to predict development times. For comparison, weather station data was also used in the prediction model. Transplanted individuals were monitored and their development times in the field were compared to predictions.
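
One common way to turn a thermal performance curve measured at constant temperatures into a prediction for a fluctuating temperature record is rate summation: accumulate the development rate at each sampled temperature until the total reaches one. The sketch below assumes that approach for illustration; it is not taken from the authors' scripts, and `toy_rate` is a hypothetical curve with made-up coefficients.

```python
def development_time(temps, rate, dt_days):
    """Days until accumulated development reaches 1, or None if it never does.

    temps: temperatures sampled every `dt_days`
    rate: function mapping temperature to fraction of development per day
    """
    progress = 0.0
    for step, temp in enumerate(temps, start=1):
        progress += rate(temp) * dt_days
        if progress >= 1.0:
            return step * dt_days
    return None


def toy_rate(temp_c):
    """Hypothetical hump-shaped reaction norm; coefficients are illustrative."""
    return max(0.0, 0.0004 * (temp_c - 5.0) * (35.0 - temp_c))


constant_days = development_time([20.0] * 30, toy_rate, 1.0)
fluctuating_days = development_time([12.0, 28.0] * 15, toy_rate, 1.0)
```

Both temperature series have the same mean of 20 °C, yet the fluctuating one yields a longer predicted development time because the toy curve is nonlinear. This is why the description stresses nonlinearity in reaction norms and temporal variation in temperature.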

    All raw data necessary to reproduce these results are available here, and compressed to "von_Schmalensee_et_al_2021_ecol_lett_scripts_and_data.rar". Additionally, the scripts used to produce the results and the five main figures are available, with annotation. See the "0_readme.txt" file for more information, and the main manuscript and supporting information for a detailed description of the methods.

  19. Challenges in Augmented Reality

    • statistics.technavio.org
    Updated Jun 15, 2025
    Cite
    Technavio (2025). Challenges in Augmented Reality [Dataset]. https://statistics.technavio.org/challenges-in-augmented-reality
    Dataset updated
    Jun 15, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    Worldwide
    Description

    Upon thorough analysis and research, the following factor has been identified as the critical augmented reality (AR) market challenge during the forecast period 2020-2024:

    privacy concerns over AR technology

    The augmented reality (AR) market report also provides other key information, including:

    CAGR of the market during the forecast period 2020-2024
    Detailed information on factors that will drive augmented reality (ar) market growth during the next five years
    Precise estimation of the augmented reality (ar) market size and its contribution to the parent market
    Accurate predictions on upcoming trends and changes in consumer behavior
    The growth of the augmented reality (ar) market industry across APAC, Europe, MEA, North America, and South America
    A thorough analysis of the market’s competitive landscape and detailed information on vendors
    Comprehensive details of factors that will challenge the growth of augmented reality (ar) market vendors
    
  20. Data from: Challenge#8

    • kaggle.com
    zip
    Updated Jun 14, 2020
    Cite
    Rajnish Singh (2020). Challenge#8 [Dataset]. https://www.kaggle.com/lucca9211/challenge8
    Available download formats: zip (71999 bytes)
    Dataset updated
    Jun 14, 2020
    Authors
    Rajnish Singh
    Description

    Dataset

    This dataset was created by Rajnish Singh

    Contents
