3 datasets found
  1. ML Competition on Cryptocurrency Market Data

    • kaggle.com
    zip
    Updated Nov 23, 2021
    Cite
    YIEDL (2021). ML Competition on Cryptocurrency Market Data [Dataset]. https://www.kaggle.com/datasets/rocketcapital/ml-competition-on-cryptocurrency-market-data
    Available download formats: zip (291744236 bytes)
    Dataset updated
    Nov 23, 2021
    Authors
    YIEDL
    Description

    Context

    Asset management today still relies on mature but inefficient supply chains that blend discretionary and quantitative forecasting models. For years the financial industry has worked quietly to move past this paradigm, using not only automated models (trading systems and dynamic asset allocation systems) but also modern machine learning techniques: time series forecasting, and unsupervised learning for the classification of financial instruments. In most cases, however, it relies on proprietary technologies that are limited by definition (workforce, technology investment, scalability).

    Numerai, a hedge fund backed by Renaissance Technologies co-founder Howard Morgan, was the first to blaze a new path by building a centralized machine learning competition that gathers a swarm of predictors from outside the company and integrates them with internal intelligence. The discretionary contribution was thereby eliminated, and the information generated internally was enriched by thousands of external contributors, many of them from sectors unrelated to finance, such as energy, aerospace, or biotechnology. This overturned the idea that good market forecasts require skills tied only to the financial world.

    That is the starting point of Rocket Capital Investment. To overcome the limit imposed by Numerai, a new competition has been engineered with the ambition of making the project even more “democratic”. How? By using the blockchain to decentralize the entire chain of participant management, the collection and validation of forecasts, and the decisions on evaluating and remunerating the participants. This makes every aspect of the competition transparent and tamper-proof. Everything is managed by a smart contract whose rules are known and shared. Let’s look at it in more detail.

    Starting from Numerai’s idea, we have completely re-engineered the management of participants, scoring, and rewards around the concept of a decentralized production chain. To this end we created a proprietary token (the MUSA token), which serves as the exchange currency and integrates a smart contract that acts as an autonomous competition manager. Users communicate with the smart contract through a DApp (“Decentralized Application”). Let’s see in more detail how these elements fit together, like the pieces of a puzzle.

    Competition Technicalities

    A suitably normalized dataset is issued every week, covering more than 400 cryptocurrencies. For each asset it aggregates prices, traded volumes, quantitative elements, and alternative data (blockchain information and sentiment data from various providers). Unlike Numerai’s data, each row identifies its asset: the first column holds the ticker. The last column holds the target that the data scientists are asked to predict: the relative strength ranking of each asset, built on the percentage change expected over the following week.
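The target described above can be illustrated with a minimal sketch. The listing does not say how the ranking is normalized, so this assumes a simple rank scaled to the range 0 to 1; the function name `relative_strength_rank` and the sample tickers and values are made up for illustration and are not taken from the dataset.

```python
def relative_strength_rank(pct_change):
    """Map each ticker's weekly percentage change to a rank in [0, 1].

    Assumption: the weakest asset gets 0.0 and the strongest gets 1.0,
    with the others evenly spaced. Ties keep their input order.
    """
    # Order tickers from the weakest to the strongest weekly move.
    ordered = sorted(pct_change, key=pct_change.get)
    n = len(ordered)
    if n == 1:
        # A single asset has no peers to be ranked against.
        return {ordered[0]: 0.5}
    return {ticker: i / (n - 1) for i, ticker in enumerate(ordered)}


# Illustrative values only, not real market data.
weekly_change = {"BTC": 0.04, "ETH": -0.02, "ADA": 0.10, "XRP": 0.01}
ranks = relative_strength_rank(weekly_change)
for ticker, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(f"{ticker}: {rank:.3f}")
```

A rank-based target of this kind depends only on the ordering of the assets, not on the size of their moves, which fits the description of a relative strength ranking.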

    Participants register for the competition anonymously by providing the address of a crypto wallet loaded with MUSA tokens. From that moment on, MUSA is, to all intents and purposes, the currency of exchange between participants and organizers. Every Monday a new Challenge opens, and all data scientists registered in the contest are asked to generate predictions with their models. Through the DApp, the participant can download the new dataset, which includes the history of the previous weeks and the latest available week. The participant then performs two actions in sequence directly from the DApp:

    - Staking: MUSA tokens are placed on the prediction.
    - Submission: the forecast for the following week is uploaded to the blockchain.

    Because the forecast is a series of numbers between 0 and 1, one per asset, the error is easy to compute the following week as RMSE (“Root Mean Square Error”). This yields a ranking of the participants, who are rewarded accordingly with additional MUSA tokens. The smart contract differentiates the reward across several items, all in a completely transparent and verifiable way:

    - Staking Reward: the mere fact of participating in the competition is remunerated. In future versions, it will also be possible to bet on the quality of other participants’ predictions.
    - Challenge Rew...
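The RMSE scoring mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the smart contract's actual logic: the function names, the participant names, and all numbers are hypothetical, and scoring only the tickers present in both the forecast and the realized values is a choice made here for simplicity.

```python
import math


def rmse(predicted, realized):
    """Root mean square error over the tickers present in both mappings."""
    common = predicted.keys() & realized.keys()
    if not common:
        raise ValueError("no tickers in common between forecast and outcome")
    squared_error = sum((predicted[t] - realized[t]) ** 2 for t in common)
    return math.sqrt(squared_error / len(common))


def rank_participants(submissions, realized):
    """Return (participant, rmse) pairs, lowest error first."""
    scores = {name: rmse(pred, realized) for name, pred in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical realized rankings and two hypothetical submissions.
realized = {"BTC": 1.0, "ETH": 0.0, "ADA": 0.5}
submissions = {
    "alice": {"BTC": 0.9, "ETH": 0.1, "ADA": 0.5},
    "bob": {"BTC": 0.2, "ETH": 0.8, "ADA": 0.5},
}
leaderboard = rank_participants(submissions, realized)
for name, score in leaderboard:
    print(f"{name}: RMSE = {score:.4f}")
```

A lower RMSE means the forecast was closer to the realized ranking, so the first entry of the leaderboard is the best-scoring participant.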

  2. Marginal beta probability density functions for predator-prey diet linkages...

    • dataone.org
    • data.griidc.org
    • +1 more
    Updated Feb 5, 2025
    Cite
    Ainsworth, Cameron (2025). Marginal beta probability density functions for predator-prey diet linkages for the Gulf of Mexico fitted using maximum likelihood method, April 2013-May 2015 [Dataset]. http://doi.org/10.7266/N7TB14XF
    Dataset updated
    Feb 5, 2025
    Dataset provided by
    GRIIDC
    Authors
    Ainsworth, Cameron
    Area covered
    Gulf of Mexico (Gulf of America)
    Description

    This dataset contains the marginal beta probability density functions (PDFs) representing the percent contribution of prey to predator diet. Predators and prey are given at the level of functional groups, which correspond to those used in an Atlantis biogeochemical ecosystem model of the Gulf of Mexico published by Ainsworth et al. 2015 (NOAA Tech. Memo. NMFS-SEFSC-676). The PDFs are provided in a CSV file and also graphically (Fig1.tif, Fig2.tif, Fig3.tif, Fig4.tif). The dataset covers only predator groups that are fish. The same data is also available in summarized form (mode and 95% confidence intervals) at DOI:10.7266/N7SX6B5H. This dataset supports the publication: Tarnecki, J.H., Wallace, A., Simons, J. and Ainsworth, C.H. 2016. Progression of a Gulf of Mexico Food Web Supporting Atlantis Ecosystem Model Development. Fisheries Research (DOI: 10.1016/j.fishres.2016.02.023). The data has been used to parameterize the diet matrix of Atlantis. As described in Tarnecki et al., this dataset is based on data previously published in Masi, M.D., Ainsworth, C.H., Chagaris, D., 2014 (Ecol. Model. 284, 60-74) but has been expanded to include new stomach sampling and information collated in the Gulf of Mexico Species Interaction Database (GoMexSI) maintained by Jim Simons at Texas A&M Corpus Christi. The main improvement over the original Masi et al. diet matrix is that more species are considered, from a wider geographic range (including waters of the western Gulf of Mexico). Tarnecki et al. use this improved diet matrix in Atlantis to show improved model performance; they also compare the revised diet matrix against previously published diet matrices for the Gulf of Mexico.
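To make the relationship between a beta PDF and its summarized form (mode and 95% interval) concrete, here is a minimal sketch using only the Python standard library. It is not the authors' procedure: the shape parameters are made up, the interval is assumed to be equal-tailed (the listing does not say how the intervals were defined), and the code assumes both shape parameters are greater than 1 so that the density is zero at 0 and 1 and has a single interior mode.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, assuming a > 1 and b > 1."""
    if x <= 0.0 or x >= 1.0:
        return 0.0  # exact at the endpoints when a > 1 and b > 1
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log1p(-x))


def beta_mode(a, b):
    """Mode of Beta(a, b); the closed form holds only for a > 1 and b > 1."""
    if a <= 1 or b <= 1:
        raise ValueError("closed-form mode requires a > 1 and b > 1")
    return (a - 1) / (a + b - 2)


def beta_interval(a, b, mass=0.95, steps=20000):
    """Equal-tailed interval via trapezoidal integration of the PDF."""
    tail = (1.0 - mass) / 2.0
    h = 1.0 / steps
    cumulative = 0.0
    lower, upper = None, None
    previous = beta_pdf(0.0, a, b)
    for i in range(1, steps + 1):
        x = i * h
        current = beta_pdf(x, a, b)
        cumulative += 0.5 * (previous + current) * h
        if lower is None and cumulative >= tail:
            lower = x
        if cumulative >= 1.0 - tail:
            upper = x
            break
        previous = current
    # Fall back to the support bounds if a threshold was never crossed.
    return (0.0 if lower is None else lower, 1.0 if upper is None else upper)


# Hypothetical shape parameters for one predator-prey linkage.
a, b = 2.0, 5.0
low, high = beta_interval(a, b)
print(f"mode = {beta_mode(a, b):.3f}, 95% interval = ({low:.3f}, {high:.3f})")
```

The grid-based integration keeps the sketch dependency-free; a statistics library with an inverse beta CDF would give the same quantiles more directly.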

  3. Data from: Predator-prey diet linkages with error range for the Gulf of...

    • dataone.org
    • data.griidc.org
    • +1 more
    Updated Jul 9, 2019
    Cite
    GRIIDC (2019). Predator-prey diet linkages with error range for the Gulf of Mexico fitted using maximum likelihood method, April 2013- May 2015 [Dataset]. https://dataone.org/datasets/R4-x267-182-0003-0006
    Dataset updated
    Jul 9, 2019
    Dataset provided by
    GRIIDC
    Time period covered
    Apr 1, 2013 - May 1, 2015
    Description

    The data is from Tarnecki, J.H., Wallace, A., Simons, J. and Ainsworth, C.H. 2016. Progression of a Gulf of Mexico Food Web Supporting Atlantis Ecosystem Model Development. Fisheries Research (doi:10.1016/j.fishres.2016.02.023). This dataset represents predator-prey linkages, with associated error ranges, for Gulf of Mexico fish functional groups. The data will be used in an Atlantis biogeochemical trophic ecosystem model of the Gulf of Mexico described by Ainsworth et al. 2015 (NOAA Technical Memorandum NMFS-SEFSC-676). This diet dataset is based on data previously published in Masi, M.D., Ainsworth, C.H., Chagaris, D., 2014 (Ecol. Model. 284, 60-74) and expanded to include new original stomach sampling described by Tarnecki et al., as well as information collated in the Gulf of Mexico Species Interaction Database (GoMexSI) maintained by Jim Simons at Texas A&M Corpus Christi. The main improvement over the original Masi et al. diet matrix is that more species are considered, from a wider geographic range (including waters of the western Gulf of Mexico). Tarnecki et al. use this improved diet matrix in Atlantis to show improved model performance; they also compare the revised diet matrix against previously published diet matrices for the Gulf of Mexico. The format of this data is similar to that of the original Masi et al. diet data, located at http://dx.doi.org/10.7266/N7Q23X72 (UDI: R1.x135.120:0007).

ML Competition on Cryptocurrency Market Data

First Decentralized and Distributed ML Competition
