3 datasets found
  1. Numerai crypto data

    • kaggle.com
    Updated Sep 5, 2025
    Cite
    Duuscha (2025). Numerai crypto data [Dataset]. https://www.kaggle.com/datasets/duuuscha/numerai-crypto-data/data
    Dataset updated
    Sep 5, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Duuscha
    Description

    Dataset

    This dataset was created by Duuscha

    Released under Other (specified in description)

    Contents

  2. YIEDL Numerai Crypto dataset - Daily

    • kaggle.com
    Updated Jul 10, 2025
    Cite
    Duuscha (2025). YIEDL Numerai Crypto dataset - Daily [Dataset]. https://www.kaggle.com/datasets/duuuscha/yiedl-numerai-crypto-dataset-daily/code
    Dataset updated
    Jul 10, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Duuscha
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description
  3. ML Competition on Cryptocurrency Market Data

    • kaggle.com
    Updated Nov 23, 2021
    Cite
    YIEDL (2021). ML Competition on Cryptocurrency Market Data [Dataset]. https://www.kaggle.com/datasets/rocketcapital/ml-competition-on-cryptocurrency-market-data/data
    Dataset updated
    Nov 23, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    YIEDL
    Description

    Context

    From a technological point of view, asset management today still relies mainly on mature but inefficient supply chains that merge discretionary and quantitative forecasting models. The financial industry has worked quietly for years to move past this paradigm, using not only automated models (trading systems and dynamic asset allocation systems) but also modern machine learning techniques: time series forecasting, and unsupervised learning for classifying financial instruments. In most cases, however, it relies on proprietary technologies that are limited by definition (workforce, technology investment, scalability).

    Numerai, whose backers include Renaissance Technologies co-founder Howard Morgan, was the first to blaze a new path by building a centralized machine learning competition that gathers a swarm of predictors outside the company and integrates their output with internal intelligence. The discretionary contribution was therefore eliminated, and the information generated internally was enriched by thousands of external contributors, many of them from sectors unrelated to finance, such as energy, aerospace, or biotechnology. This overturns the idea that good market forecasts require skills drawn only from the financial world.

    This is the starting point of Rocket Capital Investment. To overcome the limit imposed by Numerai, a new competition has been engineered with the ambition of making this project even more “democratic”. How? By using the blockchain to decentralize the entire chain of participant management, the collection and validation of forecasts, and the decisions on evaluating and remunerating the participants themselves. This makes every aspect of the competition transparent and tamper-proof. Everything is managed by a smart contract whose rules are known and shared. Let’s find out in more detail what it is.

    Starting from Numerai's idea, we have completely re-engineered the management of participants, scoring, and rewards, following the concept of decentralizing the production chain. To this end, a proprietary token (the MUSA token) has been created; it acts as the exchange currency and integrates a smart contract that serves as an autonomous competition manager. Users communicate with the smart contract through a DApp (“Decentralized Application”). Let’s see in more detail how these elements fit together, like the pieces of a puzzle.

    Competition Technicalities

    A suitably normalized dataset is issued every week, containing data on over 400 cryptocurrencies. For each asset, it aggregates prices, traded volumes, quantitative elements, and alternative data (blockchain information and sentiment from various providers). Another difference from Numerai is that each row identifies its asset (the first column holds the ticker). The last column contains the question that the data scientists are asked to answer: the relative strength ranking of each asset, built on the forecast of the percentage change expected in the following week.
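    To illustrate what a relative strength ranking can look like, here is a minimal sketch. The description does not say how the ranking is normalized, so this assumes a simple percentile-style rank scaled to the range 0 to 1, with ties broken by sort order. The function name `relative_strength_rank` and the example tickers and values are hypothetical and are not taken from the dataset.

```python
def relative_strength_rank(expected_change):
    """Map each ticker to a rank in [0, 1] based on its expected change.

    expected_change: dict of ticker -> forecast percentage change
    (hypothetical input format, not the dataset's actual schema).
    The weakest asset gets 0.0 and the strongest gets 1.0.
    """
    # Order tickers from the lowest to the highest expected change.
    tickers = sorted(expected_change, key=expected_change.get)
    n = len(tickers)
    if n == 1:
        # A single asset has no peers to rank against; use the midpoint.
        return {tickers[0]: 0.5}
    # Spread the ordinal positions evenly over [0, 1].
    return {ticker: i / (n - 1) for i, ticker in enumerate(tickers)}


# Illustrative values only.
ranks = relative_strength_rank({"BTC": 0.05, "ETH": -0.02, "SOL": 0.10})
print(ranks)
```

    Scaling ranks to the range 0 to 1 keeps the target comparable from week to week even when the number of listed assets changes.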

    Registration for the Competition takes place by providing, in a completely anonymous way, the address of a crypto wallet on which the MUSA tokens are loaded. From that moment on, the MUSAs become, to all intents and purposes, the currency of exchange between participants and organizers. Every Monday a new Challenge opens, and all Data Scientists registered in the Contest are asked to use their models to generate predictions. By accessing the DApp, the participant can download the new dataset, complete with the history of the previous weeks and the last useful week. At this point the participant can perform two actions in sequence directly from the DApp:
    - Staking: MUSA tokens are placed on your prediction.
    - Submission: the forecast for the following week is uploaded to the blockchain.

    Since the forecast consists of a series of numbers between 0 and 1, one for each asset, it is straightforward to calculate the error the following week in terms of RMSE (“Root Mean Square Error”). This makes it possible to rank the participants and reward them accordingly with additional MUSA tokens. The Smart Contract differentiates the reward based on several items (all, again, in a completely transparent and verifiable way):
    - Staking Reward: the mere fact of participating in the competition is remunerated. In future versions, it will also be possible to bet on the goodness of the other participants’ predictions.
    - Challenge Rew...
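    The RMSE-based ranking described above can be sketched as follows. This is an illustration under stated assumptions, not the competition's actual scoring code: the helper names `rmse` and `leaderboard`, the dictionary input format, and the choice to score only tickers present in both the prediction and the realized values are all hypothetical. For predictions p_i and realized values y_i over N assets, RMSE is the square root of the mean of (p_i - y_i)^2.

```python
import math


def rmse(predicted, actual):
    """Root mean square error over the tickers present in both dicts."""
    # Assumption: only overlapping tickers are scored.
    tickers = predicted.keys() & actual.keys()
    if not tickers:
        raise ValueError("no overlapping tickers to score")
    squared = [(predicted[t] - actual[t]) ** 2 for t in tickers]
    return math.sqrt(sum(squared) / len(squared))


def leaderboard(submissions, actual):
    """Return (participant, rmse) pairs sorted from lowest to highest error."""
    scores = {name: rmse(pred, actual) for name, pred in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Illustrative values only: realized rankings and two hypothetical submissions.
actual = {"BTC": 0.5, "ETH": 0.0, "SOL": 1.0}
submissions = {
    "bob": {"BTC": 0.0, "ETH": 0.5, "SOL": 1.0},
    "alice": {"BTC": 0.5, "ETH": 0.0, "SOL": 1.0},
}
for name, score in leaderboard(submissions, actual):
    print(name, round(score, 4))
```

    A lower RMSE means a better forecast, so the leaderboard is sorted in ascending order of error.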

