Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
OHLCV is an abbreviation for five key data points: Open, High, Low, Close, and Volume. Together they summarize how an asset such as Bitcoin (BTC) traded in the market over a specified period. The dataset is useful not only for traders and analysts but also for data scientists who work on BTC market prediction using artificial intelligence. The 'Open' and 'Close' prices are the starting and ending price levels, while the 'High' and 'Low' are the highest and lowest prices during the period, here a daily (24h) time frame. The 'Volume' measures the total amount traded during the period. This dataset provides the five OHLCV columns for BTC along with a column called "Next day close price" for regression problems and machine learning applications. The dataset includes daily information from 1/1/2012 to 8/6/2022.
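As an illustration of how a "Next day close price" column relates to the daily rows, the sketch below attaches each day's following close as a regression target. The key names (`close`, `next_close`) and the sample prices are assumptions made for this example; they are not taken from the dataset's files.

```python
from typing import Dict, List, Optional


def add_next_day_close(rows: List[Dict[str, float]]) -> List[Dict[str, Optional[float]]]:
    """Return copies of the daily rows with a 'next_close' target.

    Rows must be sorted from oldest to newest. The last row has no
    following day, so its target is None.
    """
    labeled = []
    for i, row in enumerate(rows):
        next_close = rows[i + 1]["close"] if i + 1 < len(rows) else None
        labeled.append({**row, "next_close": next_close})
    return labeled


# Made-up illustrative prices, not values from the dataset.
daily = [
    {"open": 100.0, "high": 110.0, "low": 95.0, "close": 105.0, "volume": 12.0},
    {"open": 105.0, "high": 108.0, "low": 99.0, "close": 102.0, "volume": 9.0},
    {"open": 102.0, "high": 120.0, "low": 101.0, "close": 118.0, "volume": 20.0},
]
labeled = add_next_day_close(daily)
```

The final row is usually dropped before training, since it has no known target.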
Bitcoin's blockchain size was close to reaching 5450 gigabytes in 2024, as the database saw exponential growth by nearly one gigabyte every few days. The Bitcoin blockchain contains a continuously growing and tamper-evident list of all Bitcoin transactions and records since its initial release in January 2009. Bitcoin has a set limit of 21 million coins, the last of which will be mined around 2140, according to a forecast made in 2017.

Bitcoin mining: A somewhat uncharted world

Despite interest in the topic, there are few accurate figures on how big Bitcoin mining is on a country-by-country basis. Bitcoin's design philosophy is at the heart of this. Created out of protest against governments and central banks, Bitcoin's blockchain effectively hides both the country of origin and the destination country within a (mining) transaction. Research involving IP addresses placed the United States as the country with the most Bitcoin mining in 2022, but the source admits that IP addresses can easily be manipulated using a VPN. Note that mining figures are different from figures on Bitcoin trading: Africa and Latin America were more interested in buying and selling BTC than some of the world's developed economies.

Bitcoin developments

Bitcoin's trade volume slowed in the second quarter of 2023, after noticeable growth at the beginning of the year. The coin outperformed most of the market. Some attribute this to the announcement in June 2023 that BlackRock filed for a Bitcoin ETF. This iShares Bitcoin Trust was to use Coinbase Custody as its custodian. Regulators in the United States had not yet approved any applications for spot ETFs on Bitcoin.
Bitcoin's transaction volume was at its highest in December 2023, when the network processed over 724,000 coins in a single day. Bitcoin generally has higher transaction activity than other cryptocurrencies, except Ethereum, which often processes more than one million transactions per day. Note that the transaction volume here refers to transactions registered within the Bitcoin blockchain. It should not be confused with Bitcoin's 24-hour trade volume, a metric associated with crypto exchanges.

The more Bitcoin transactions, the more it is used in B2C payments?

A Bitcoin transaction recorded in the blockchain can be any transaction, including B2C but also P2P. While it is possible to see in the blockchain which address sent Bitcoin to whom, details on who this person is and where they are from are often missing. Bitcoin was designed to go against monetary authorities and prides itself on being anonymous. An important argument against Bitcoin replacing cash or cards in payments is that the cryptocurrency was not designed for such a task: Bitcoin ranks among the slowest cryptocurrencies in terms of transaction speed.

Are cryptocurrencies taking over payments?

Cryptocurrency payments are set to grow at a CAGR of nearly 17 percent between 2022 and 2029, although the market is relatively small. The forecast comes from a market estimate made in early 2023, based on various conditions and sources available at that time. Research across 40 countries during the same period suggested that the market share of cryptocurrency in e-commerce transactions was "less than one percent" in all surveyed countries, with predictions that this would not change in the future.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cryptocurrency historical datasets from January 2012 (if available) to October 2021 were obtained and integrated from various sources and Application Programming Interfaces (APIs), including Yahoo Finance, CryptoDataDownload, CoinMarketCap, various Kaggle datasets, and multiple other APIs. These datasets used various time resolutions (e.g., minutes, hours, days); to integrate them, the daily resolution was used in this research study. The integrated historical datasets for 80 cryptocurrencies, including but not limited to Bitcoin (BTC), Ethereum (ETH), Binance Coin (BNB), Cardano (ADA), Tether (USDT), Ripple (XRP), Solana (SOL), Polkadot (DOT), USD Coin (USDC), Dogecoin (DOGE), Tron (TRX), Bitcoin Cash (BCH), Litecoin (LTC), EOS (EOS), Cosmos (ATOM), Stellar (XLM), Wrapped Bitcoin (WBTC), Uniswap (UNI), Terra (LUNA), SHIBA INU (SHIB), and 60 more, were uploaded to this online Mendeley data repository. Although the primary criterion for including these cryptocurrencies was Market Capitalization, a subject matter expert (a professional trader) also guided the initial selection by analyzing various indicators such as Relative Strength Index (RSI), Moving Average Convergence/Divergence (MACD), MYC Signals, Bollinger Bands, Fibonacci Retracement, Stochastic Oscillator and Ichimoku Cloud. The primary features of this dataset that were used as the decision-making criteria of the CLUS-MCDA II approach are Timestamps, Open, High, Low, Closed, Volume (Currency), % Change (7 days and 24 hours), Market Cap and Weighted Price values.
The Excel and CSV files available in this dataset are only part of the integrated data. The other databases, datasets and API references that were used in this study are as follows:
[1] https://finance.yahoo.com/
[2] https://coinmarketcap.com/historical/
[3] https://cryptodatadownload.com/
[4] https://kaggle.com/philmohun/cryptocurrency-financial-data
[5] https://kaggle.com/deepshah16/meme-cryptocurrency-historical-data
[6] https://kaggle.com/sudalairajkumar/cryptocurrencypricehistory
[7] https://min-api.cryptocompare.com/data/price?fsym=BTC&tsyms=USD
[8] https://min-api.cryptocompare.com/
[9] https://p.nomics.com/cryptocurrency-bitcoin-api
[10] https://www.coinapi.io/
[11] https://www.coingecko.com/en/api
[12] https://cryptowat.ch/
[13] https://www.alphavantage.co/
This dataset is part of the CLUS-MCDA (Cluster analysis for improving Multiple Criteria Decision Analysis) and CLUS-MCDA II project:
https://aimaghsoodi.github.io/CLUSMCDA-R-Package/
https://github.com/Aimaghsoodi/CLUS-MCDA-II
https://github.com/azadkavian/CLUS-MCDA
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is Bitcoin market data (price and volume) from August 2015 to August 2021. The sampling interval is four hours; that is, each record holds the prices and the volume for one four-hour period. The original market data were obtained from Poloniex, one of the most active crypto-asset exchanges. Download link on XBlock: http://xblock.pro/#/dataset/5
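To make the four-hour sampling concrete, here is a minimal sketch of how finer-grained candles could be aggregated into four-hour OHLCV bars. It does not describe how the dataset's authors produced their files; the dictionary keys, the alignment of buckets to 00:00, and the hourly sample candles are all assumptions for illustration.

```python
from datetime import datetime
from typing import Dict, List


def to_four_hour_bars(candles: List[Dict]) -> List[Dict]:
    """Aggregate time-sorted candles into 4-hour OHLCV bars.

    Buckets start at 00:00, 04:00, 08:00, ... of each day (an assumption).
    """
    bars: Dict[datetime, Dict] = {}
    for c in candles:
        ts = c["time"]
        # Floor the timestamp to the start of its 4-hour bucket.
        bucket = ts.replace(hour=ts.hour - ts.hour % 4, minute=0, second=0, microsecond=0)
        bar = bars.get(bucket)
        if bar is None:
            bars[bucket] = {
                "time": bucket,
                "open": c["open"],      # first price seen in the bucket
                "high": c["high"],
                "low": c["low"],
                "close": c["close"],
                "volume": c["volume"],
            }
        else:
            bar["high"] = max(bar["high"], c["high"])
            bar["low"] = min(bar["low"], c["low"])
            bar["close"] = c["close"]   # last price seen in the bucket
            bar["volume"] += c["volume"]
    return [bars[key] for key in sorted(bars)]


# Five made-up hourly candles covering 00:00 through 04:00.
hourly = [
    {"time": datetime(2021, 1, 1, h), "open": 10.0 + h, "high": 12.0 + h,
     "low": 9.0 + h, "close": 11.0 + h, "volume": 1.0}
    for h in range(5)
]
four_hour = to_four_hour_bars(hourly)
```

The first four candles fall into the 00:00 bar and the fifth opens the 04:00 bar.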
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We present a high-frequency dataset of algorithmic trading. Records are not evenly spaced in time: each one is stamped at the moment an arbitrage opportunity occurred. The dataset has 9,799,130 tick-level records of the Bitcoin-to-Euro exchange rate, starting from 2019-01-01 00:00:31 and ending at 2020-03-30 23:59:48. The data cover pairs of exchanges, drawn from 18 cryptocurrency exchanges, on which it was possible to buy and sell simultaneously. Each row gives the amount of arbitrage that could have been earned if a transaction had been executed. The dataset contains the amount of arbitrage that could be earned after executing a transaction on the given exchanges, the quantity that had to be bought to earn it, the best sell and best buy prices, the balance of fiat currency in “Exchange 1” and the balance of cryptocurrency in “Exchange 2”. If there was enough fiat currency in “Exchange 1” and enough cryptocurrency in “Exchange 2”, the transaction was successfully executed and the given arbitrage amount was earned. Investors could use this information to discover potential earning capabilities and to create effective arbitrage trading strategies. Academics could use the dataset for deeper analysis of efficiency and liquidity questions, to spot and evaluate risks, and to identify patterns in the market.
Short description of the dataset:
ID - unique ID
arb_timestamp - timestamp of the arbitrage opportunity
arb_exch1 - exchange where one was able to successfully buy Bitcoin
arb_exch2 - exchange where one was able to successfully sell Bitcoin
arb_ticker - BTCEUR exchange rate
arb_prc - percentage earned compared to the invested amount
arb_amount - the amount of arbitrage that would be earned if a transaction had been executed
arb_quantity - Bitcoin quantity that needed to be bought in order to execute a transaction and to earn arbitrage
best_sell_price - best price at which it was possible to sell Bitcoin in "Exchange 2"
best_buy_price - best price at which it was possible to buy Bitcoin in "Exchange 1"
balance_fiat - the amount of Euros available in “Exchange 1”
balance_crypto - the amount of Bitcoin available in “Exchange 2”
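The balance condition described above can be sketched as a simple check on one row. This is only an illustration under stated assumptions: "enough fiat" is taken to mean at least `best_buy_price * arb_quantity`, "enough crypto" to mean at least `arb_quantity`, and the gross spread ignores fees, so it need not equal the dataset's `arb_amount`. The sample row is invented.

```python
from typing import Dict


def is_executable(row: Dict[str, float]) -> bool:
    """Check whether both balances would cover the trade (assumed thresholds)."""
    fiat_needed = row["best_buy_price"] * row["arb_quantity"]
    return (
        row["balance_fiat"] >= fiat_needed
        and row["balance_crypto"] >= row["arb_quantity"]
    )


def gross_spread(row: Dict[str, float]) -> float:
    """Price difference times quantity, before any fees (an assumption)."""
    return (row["best_sell_price"] - row["best_buy_price"]) * row["arb_quantity"]


# Invented example row using the field names listed above.
row = {
    "best_buy_price": 5000.0,
    "best_sell_price": 5010.0,
    "arb_quantity": 0.5,
    "balance_fiat": 3000.0,
    "balance_crypto": 1.0,
}
executable = is_executable(row)
spread = gross_spread(row)
```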
This is a supplementary, regularly updated dataset for the G-Research Crypto Forecasting competition.
This dataset is updated daily and automatically collects market data for the G-Research crypto forecasting competition. The data has 1-minute resolution and is collected for all competition assets; both retrieval and uploading are fully automated. See the discussion topic.
For every asset in the competition, the following fields from Binance's official API endpoint for historical candlestick data are collected, saved, and processed.
1. **timestamp** - A timestamp for the minute covered by the row.
2. **Asset_ID** - An ID code for the cryptoasset.
3. **Count** - The number of trades that took place this minute.
4. **Open** - The USD price at the beginning of the minute.
5. **High** - The highest USD price during the minute.
6. **Low** - The lowest USD price during the minute.
7. **Close** - The USD price at the end of the minute.
8. **Volume** - The number of cryptoasset units traded during the minute.
9. **VWAP** - The volume-weighted average price for the minute.
10. **Target** - 15-minute residualized returns. See the 'Prediction and Evaluation' section of this notebook for details of how the target is calculated.
11. **Weight** - Weight, defined by the competition hosts [here](https://www.kaggle.com/cstein06/tutorial-to-the-g-research-crypto-competition)
12. **Asset_Name** - Human readable Asset name.
The dataframe is indexed by timestamp and sorted from oldest to newest.
The first row starts at the first timestamp available on the exchange, which is July 2017 for the longest-running pairs.
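For orientation only, the sketch below computes a plain 15-minute forward log return from a series of 1-minute Close prices. This is not the competition's Target: the Target is residualized, and its exact definition is given in the competition notebook referenced above. The function name, the horizon parameter, and the sample prices are assumptions for this example.

```python
import math
from typing import List, Optional


def forward_log_return(closes: List[float], horizon: int = 15) -> List[Optional[float]]:
    """Raw log return from minute t to minute t + horizon.

    Entries without a price `horizon` minutes ahead are None.
    No residualization against the market is applied.
    """
    returns: List[Optional[float]] = []
    for i, price in enumerate(closes):
        j = i + horizon
        returns.append(math.log(closes[j] / price) if j < len(closes) else None)
    return returns


# Made-up 1-minute closes: 100.0, 101.0, ..., 116.0
closes = [float(100 + i) for i in range(17)]
raw_returns = forward_log_return(closes)
```

Only the first two entries are defined here, because the sample series is just 17 minutes long.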
The following is a collection of simple starter notebooks for Kaggle's Crypto Comp showing PurgedTimeSeries in use with the collected dataset. PurgedTimeSeries is explained here. There are many configuration variables below to allow you to experiment. Use either GPU or TPU. You can control which years are loaded, which neural networks are used, and whether to use feature engineering. You can experiment with different data preprocessing, model architecture, loss, optimizers, and learning rate schedules. The extra datasets contain the full history of the assets in the same format as the competition, so you can input that into your model too.
These notebooks follow the ideas presented in my "Initial Thoughts" here. Some code sections have been reused from Chris' great (great) notebook series on SIIM ISIC melanoma detection competition here
This is a work in progress and will be updated constantly throughout the competition. At the moment, there are some known issues that still need to be addressed:
Opening price with an added indicator (MA50):
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F2234678%2Fb8664e6f26dc84e9a40d5a3d915c9640%2Fdownload.png?generation=1582053879538546&alt=media
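An indicator such as MA50 is a simple moving average over the last 50 values. A minimal sketch of that calculation on a list of opening prices follows; the function name and the short demonstration window are assumptions, and the plotting step behind the image above is not reproduced.

```python
from typing import List, Optional


def moving_average(values: List[float], window: int = 50) -> List[Optional[float]]:
    """Simple moving average; None until a full window is available."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages: List[Optional[float]] = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        averages.append(running / window if i >= window - 1 else None)
    return averages


# A window of 3 keeps the demonstration short; MA50 uses window=50.
demo = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
```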
Volume and number of trades:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F2234678%2Fcd04ed586b08c1576a7b67d163ad9889%2Fdownload-1.png?generation=1582053899082078&alt=media
This data is being collected automatically from the crypto exchange Binance.
The dataset contains the following attributes:
Date: The date of the corresponding price observation
Open: The opening price for the day
High: The maximum price reached during the day
Low: The minimum price reached during the day
Close: The closing price for the day
percent_change_24h: Percentage change over the last 24 hours
Volume: Volume of Bitcoin traded on that date
Market Cap: Market value of traded Bitcoin
https://creativecommons.org/publicdomain/zero/1.0/
Each file contains klines for a one-month period at 1-minute intervals. File names follow the format mm-yyyy-SMB1SMB2 (e.g. 11-2017-XRPBTC).
This dataset currently contains only the XRP/BTC and ETH/USDT symbol pairs, but it will be expanded soon.
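The file naming scheme described above can be parsed with a small helper. This is a sketch under assumptions: it expects the name without any file extension, it accepts only upper-case letters and digits in the symbol part, and it does not try to split SMB1 from SMB2, because the name alone does not say where one symbol ends. The `KlineFile` type and `parse_kline_name` function are hypothetical names.

```python
import re
from typing import NamedTuple


class KlineFile(NamedTuple):
    month: int
    year: int
    pair: str  # concatenated symbols, e.g. "XRPBTC"


# mm-yyyy-SMB1SMB2, with no file extension (an assumption).
_NAME_PATTERN = re.compile(r"(\d{2})-(\d{4})-([A-Z0-9]+)")


def parse_kline_name(stem: str) -> KlineFile:
    """Split a file name stem into month, year and symbol pair."""
    match = _NAME_PATTERN.fullmatch(stem)
    if match is None:
        raise ValueError(f"unexpected file name: {stem!r}")
    month = int(match.group(1))
    year = int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in file name: {stem!r}")
    return KlineFile(month, year, match.group(3))


parsed = parse_kline_name("11-2017-XRPBTC")
```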
This dataset was collected from the Binance exchange.
This dataset could inspire you to build more efficient trading algorithms.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The authors collected this database through the Twitter API over eight months. The data are tweets by more than 50 experts on the market analysis of 40 cryptocurrencies. These experts are known as influencers on social networks such as Twitter. The theory of behavioral economics shows that the opinions of people, especially experts, can affect market trends (here, those of cryptocurrencies). Existing databases often cover tweets related to one or more cryptocurrencies. They also pay no attention to the user's expertise, and most of their data is extracted using hashtags. Ignoring the user's expertise considerably increases both the volume of irrelevant tweets and the share of neutral polarity. This database has a main table named "Tweets1" with 11 columns and 40 tables that separate the comments related to each cryptocurrency. The columns of the main table and the cryptocurrency tables are explained in the attached document. Researchers can use this dataset in various machine learning tasks, such as sentiment analysis and deep transfer learning with sentiment analysis. The data can also be used to check the impact of influencers' opinions on the cryptocurrency market trend. The database may be used provided the source is cited. This version also adds the Excel version of the database and Python code to extract the names of influencers and tweets. Version 3: three datasets of historical prices and sentiments for Bitcoin, Ethereum, and Binance have been added as Excel files, covering January 1, 2023, to June 12, 2023. Two datasets of 52 influential tweets in cryptocurrencies have also been published, along with the score and polarity of sentiments regarding more than 300 cryptocurrencies from February 2021 to June 2023. In addition, two Python codes implementing the sentiment analysis algorithm for tweets have been published.
This algorithm combines RoBERTa pre-trained deep neural network and BiGRU deep neural network with an attention layer (see code Preprocessing_and_sentiment_analysis with python).
This is a submission for Challenge #24 by Desights User
Note: This submission is in REVIEW state and is only accessible by Challenge Reviewers. So you might get errors when you try to download this asset directly from Ocean Market.
Submission Description
This report explores the relationship between Google Trends data and cryptocurrency price trends, focusing on Bitcoin. We found a significant correlation between search interest and Bitcoin's price, suggesting that heightened public interest often precedes price movements. A machine learning approach using Random Forest yielded the most accurate predictions for Bitcoin's search interest, indicating that this algorithm can effectively forecast short-term trends. The ideal lag time for predicting Bitcoin's search interest was zero, reinforcing the notion that current public interest influences market behavior. Additionally, Ethereum's weekly volume data provided insights into broader market trends, showing fluctuations that could reflect general cryptocurrency market activity. These findings offer valuable insights for traders and investors and highlight the potential for machine learning models to enhance market analysis. All code and datasets used in this study are available on GitHub, providing a resource for further exploration and validation of the results.
The code and datasets used in this report are publicly available on GitHub, providing transparency and allowing others to replicate or build upon this work. The repository can be found at https://github.com/mawutory/crypto-trends. It contains all the scripts and data used in the analyses, allowing researchers, developers, and enthusiasts to explore and extend the findings.
Within the repository, two primary subfolders organize the resources: trends and prices. The trends folder contains datasets related to Google Trends data, including the search interest for Bitcoin and other cryptocurrencies. The prices folder holds price trend data for various cryptocurrencies, including Bitcoin, Ethereum, and others, with detailed information on price movements, volumes, and other related metrics.
To explore the code, you can navigate to the respective subfolders to find Python scripts used to preprocess data, train machine learning models, and predict trends. The datasets are also available for download, providing a basis for additional analyses or custom implementations.
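The lag analysis mentioned in the report can be illustrated with a lagged Pearson correlation between a search-interest series and a price series. This sketch is not taken from the repository: the function names and the synthetic series are assumptions, and the synthetic data is built so that the best lag is 1, whereas the report found an ideal lag of zero on its own data.

```python
import math
from typing import Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_a * var_b)


def lagged_correlation(trend: Sequence[float], price: Sequence[float], lag: int) -> float:
    """Correlate trend[t] with price[t + lag] for a non-negative lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(trend, price)
    return pearson(trend[:-lag], price[lag:])


# Synthetic series: price repeats the trend values one step later.
trend = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
price = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0]
best_lag = max(range(3), key=lambda lag: lagged_correlation(trend, price, lag))
```

A high correlation at some lag shows co-movement only; it does not by itself establish that search interest drives price.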
https://www.kappasignal.com/p/legal-disclaimer.html
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.
Historical daily stock prices (open, high, low, close, volume)
Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)
Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)
Feature engineering based on financial data and technical indicators
Sentiment analysis data from social media and news articles
Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)
Stock price prediction
Portfolio optimization
Algorithmic trading
Market sentiment analysis
Risk management
Researchers investigating the effectiveness of machine learning in stock market prediction
Analysts developing quantitative trading Buy/Sell strategies
Individuals interested in building their own stock market prediction models
Students learning about machine learning and financial applications
The dataset may include different levels of granularity (e.g., daily, hourly)
Data cleaning and preprocessing are essential before model training
Regular updates are recommended to maintain the accuracy and relevance of the data