https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains historical daily prices for all tickers currently trading on NASDAQ. The up-to-date list is available from nasdaqtrader.com. The historical data is retrieved from Yahoo Finance via the yfinance Python package.
It contains prices up to April 1, 2020. If you need more recent data, fork and re-run the data collection script, which is also available on Kaggle.
The data for every symbol is saved in CSV format with common fields:
All ticker data is stored in either the ETFs or the stocks folder, depending on the type, and each filename is the corresponding ticker symbol. Finally, symbols_valid_meta.csv contains some additional metadata for each ticker, such as its full name.
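The folder layout described above can be sketched as a small loader. This is a minimal illustration, not the dataset author's code: the folder names `etfs` and `stocks`, the helper names, and the assumption that the first CSV column holds the date are all hypothetical and may need adjusting to the actual archive.

```python
from pathlib import Path

import pandas as pd


def ticker_path(root, symbol, is_etf=False):
    """Build the CSV path for a ticker; folder names are assumed, not confirmed."""
    folder = "etfs" if is_etf else "stocks"
    return Path(root) / folder / f"{symbol}.csv"


def load_ticker(root, symbol, is_etf=False):
    """Load one ticker file, assuming the first column contains the trading date."""
    return pd.read_csv(ticker_path(root, symbol, is_etf), index_col=0, parse_dates=True)
```

For example, `load_ticker("data", "AAPL")` would read `data/stocks/AAPL.csv` under these assumptions.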
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Material published at "https://opencodecom.net/post/2021-07-22-como-baixar-e-zipar-csv-utilizando-python/"
This dataset includes daily historical price data for Bitcoin (BTC-USD) from 2014 to 2025, obtained through web scraping from the Yahoo Finance page using Selenium. The primary data source can be accessed at Yahoo Finance - Bitcoin Historical Data. The dataset contains daily information such as opening price (Open), highest price (High), lowest price (Low), closing price (Close), adjusted closing price (Adj Close), and trading volume (Volume).
About Bitcoin: Bitcoin (BTC) is the world's first decentralized digital currency, introduced in 2009 by an anonymous creator known as Satoshi Nakamoto. It operates on a peer-to-peer network powered by blockchain technology, enabling secure, transparent, and trustless transactions without the need for intermediaries like banks. Bitcoin's limited supply of 21 million coins and its growing adoption have made it a popular asset for investment, trading, and as a hedge against inflation.
We are excited to share this dataset and look forward to seeing the insights it can provide. We hope it will inspire collaboration and innovation within the community. By leveraging this daily data, we can explore trends, develop predictive models, and design innovative trading strategies that deepen our understanding of Bitcoin's market behavior. Together, we can unlock new opportunities and contribute to the collective advancement of cryptocurrency research and analysis.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset presents an extensive record of daily historical stock prices for Tesla, Inc. (TSLA), one of the world’s most innovative and closely watched electric vehicle and clean energy companies. The data was sourced from Yahoo Finance, a widely used and trusted provider of financial market data, and covers a significant period spanning from Tesla’s initial public offering (IPO) to the most recent date available at the time of extraction.
The dataset includes critical trading metrics for each market day, such as the opening price, highest and lowest prices of the day, closing price, adjusted closing price (accounting for dividends and splits), and total trading volume. This rich dataset supports a variety of use cases, including financial market analysis, investment research, time series forecasting, development and backtesting of trading algorithms, and educational projects in data science and finance.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data files contain seven low-dimensional financial research data sets (in .txt format) and four high-dimensional daily stock price data sets (in .csv format). The low-dimensional data sets are provided by Lorenzo Garlappi on his website, while the high-dimensional data sets were downloaded from Yahoo! Finance by the contributor. The description of the low-dimensional data sets can be found in DeMiguel et al. (2009, RFS).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cryptocurrency historical datasets from January 2012 (if available) to October 2021 were obtained and integrated from various sources and Application Programming Interfaces (APIs), including Yahoo Finance, CryptoDataDownload, CoinMarketCap, and various Kaggle datasets. These datasets used various time granularities (e.g., minutes, hours, days), so a daily format was used to integrate them in this research study. The integrated historical datasets for 80 cryptocurrencies, including but not limited to Bitcoin (BTC), Ethereum (ETH), Binance Coin (BNB), Cardano (ADA), Tether (USDT), Ripple (XRP), Solana (SOL), Polkadot (DOT), USD Coin (USDC), Dogecoin (DOGE), Tron (TRX), Bitcoin Cash (BCH), Litecoin (LTC), EOS (EOS), Cosmos (ATOM), Stellar (XLM), Wrapped Bitcoin (WBTC), Uniswap (UNI), Terra (LUNA), SHIBA INU (SHIB), and 60 more, were uploaded to this online Mendeley data repository. Although the primary criterion for including these cryptocurrencies was market capitalization, a subject matter expert (a professional trader) also guided the initial selection by analyzing various indicators such as Relative Strength Index (RSI), Moving Average Convergence/Divergence (MACD), MYC Signals, Bollinger Bands, Fibonacci Retracement, Stochastic Oscillator, and Ichimoku Cloud. The primary features of this dataset that were used as the decision-making criteria of the CLUS-MCDA II approach are Timestamps, Open, High, Low, Closed, Volume (Currency), % Change (7 days and 24 hours), Market Cap, and Weighted Price values.
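The conversion of mixed-granularity sources to a daily format can be sketched with a pandas resample. This is only an illustration under stated assumptions, not the study's actual pipeline: the column names `Open`, `High`, `Low`, `Close`, and `Volume` and the function name are hypothetical, and the input is assumed to carry a `DatetimeIndex`.

```python
import pandas as pd


def to_daily_ohlcv(bars):
    """Aggregate intraday OHLCV bars (indexed by timestamp) into daily bars."""
    daily = bars.resample("D").agg(
        {
            "Open": "first",   # first trade of the day
            "High": "max",     # highest price seen that day
            "Low": "min",      # lowest price seen that day
            "Close": "last",   # last trade of the day
            "Volume": "sum",   # total volume over the day
        }
    )
    # Days without any bars come back with a missing Open; drop them.
    return daily.dropna(subset=["Open"])
```

Sources that are already daily pass through this function unchanged apart from index normalization, which makes it convenient to apply uniformly before concatenating sources.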
The Excel and CSV files available in this data set are only part of the integrated data. The other databases, datasets, and API references that were used in this study are as follows:
[1] https://finance.yahoo.com/
[2] https://coinmarketcap.com/historical/
[3] https://cryptodatadownload.com/
[4] https://kaggle.com/philmohun/cryptocurrency-financial-data
[5] https://kaggle.com/deepshah16/meme-cryptocurrency-historical-data
[6] https://kaggle.com/sudalairajkumar/cryptocurrencypricehistory
[7] https://min-api.cryptocompare.com/data/price?fsym=BTC&tsyms=USD
[8] https://min-api.cryptocompare.com/
[9] https://p.nomics.com/cryptocurrency-bitcoin-api
[10] https://www.coinapi.io/
[11] https://www.coingecko.com/en/api
[12] https://cryptowat.ch/
[13] https://www.alphavantage.co/
This dataset is part of the CLUS-MCDA (Cluster analysis for improving Multiple Criteria Decision Analysis) and CLUS-MCDA II project:
https://aimaghsoodi.github.io/CLUSMCDA-R-Package/
https://github.com/Aimaghsoodi/CLUS-MCDA-II
https://github.com/azadkavian/CLUS-MCDA
Stock Market Supplementary Data: Company Names and Ticker Symbols
Company Names and Ticker Symbols
nasdaq.csv
nyse.csv
sp500.csv
forbes2000.csv
yahootickers.xlsx
Data from:
- https://datahub.io/core/nasdaq-listings (License)
- https://datahub.io/core/s-and-p-500-companies (License)
- https://datahub.io/core/nyse-other-listings (License)
- https://investexcel.net/all-yahoo-finance-stock-tickers (open data)
- https://www.kaggle.com/ash316/forbes-top-2000-companies (open data)
Banner Photo: https://unsplash.com/photos/VP4WmibxvcY
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
https://creativecommons.org/publicdomain/zero/1.0/
Experience a decade of NASDAQ market dynamics with this comprehensive historical price dataset from 2014 to 2024.
The NASDAQ Composite is a benchmark index representing the performance of more than 2,500 stocks listed on the NASDAQ stock exchange, encompassing various sectors including technology, healthcare, and finance. This dataset, sourced meticulously from Yahoo Finance, offers daily insights into the index's opening, highest, lowest, and closing prices, along with adjusted close prices and daily volume.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The file provides daily trading data for Bitcoin (BTC) in USD, covering the period from January 1, 2015, to June 30, 2024. The dataset includes key indicators such as Open Price, High Price, Low Price, Close Price, Adjusted Close Price, and Volume. This data originates from Yahoo Finance and serves as a foundation for time series analysis, forecasting, and machine learning models, focusing on identifying price patterns, volatility trends, and trading behaviors within the cryptocurrency market. Raw data (CSV file). Source: Yahoo Finance.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains daily stock data for Meta Platforms, Inc. (META), formerly Facebook Inc., from May 19, 2012, to January 20, 2025. It offers a comprehensive view of Meta’s stock performance and market fluctuations during a period of significant growth, acquisitions, and technological advancements. This dataset is valuable for financial analysis, market prediction, machine learning projects, and evaluating the impact of Meta’s business decisions on its stock price.
The dataset includes the following key features:
Date: The date of the trading day, formatted as YYYY-MM-DD.
Open: The stock price at the start of the trading day.
High: The highest price reached by the stock during the trading day.
Low: The lowest price reached by the stock during the trading day.
Close: The stock price at the end of the trading day.
Adj Close: The adjusted closing price, which accounts for corporate actions such as stock splits, dividends, and other financial adjustments.
Volume: The total number of shares traded during the trading day.
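Because Adj Close reflects corporate actions while Open, High, and Low do not, a common convention is to rescale the other price columns by the ratio Adj Close / Close so that all prices sit on the same adjusted basis. The sketch below illustrates that convention; it is not something the dataset itself provides, and the function name and the `Adj Open`, `Adj High`, `Adj Low` output columns are hypothetical.

```python
import pandas as pd


def add_adjusted_ohl(df):
    """Add adjusted Open/High/Low columns using the Adj Close / Close ratio."""
    factor = df["Adj Close"] / df["Close"]  # per-day adjustment factor
    out = df.copy()
    for col in ("Open", "High", "Low"):
        out[f"Adj {col}"] = df[col] * factor
    return out
```

On days where Adj Close equals Close, the factor is 1 and the adjusted columns match the raw ones.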
This dataset was sourced from reliable public APIs such as Yahoo Finance or Alpha Vantage. It is provided for educational and research purposes and is not affiliated with Meta Platforms, Inc. Users are encouraged to adhere to the terms of use of the original data provider.
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset contains daily OHLCV data for roughly 2,000 Indian stocks listed on the National Stock Exchange, covering their full available history. The columns are multi-index columns, so this needs to be taken into account when reading and using the data.
Source: Yahoo Finance
Type: All files are in CSV format.
Currency: INR
All the tickers have been collected from here : https://www.nseindia.com/market-data/securities-available-for-trading
If using pandas, the following function is a utility to read any of the CSV files:
```
import pandas as pd


def read_ohlcv(filename):
    """Read an OHLCV data file downloaded from yfinance."""
    return pd.read_csv(
        filename,
        skiprows=[0, 1, 2],  # skip the multi-index header rows that cause trouble
        names=["Date", "Close", "High", "Low", "Open", "Volume"],
        index_col="Date",
        parse_dates=["Date"],
    )
```
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Actually, I prepared this dataset for students in my Deep Learning and NLP course.
But I am also very happy to see kagglers play around with it.
Have fun!
Description:
There are two channels of data provided in this dataset:
News data: I crawled historical news headlines from Reddit WorldNews Channel (/r/worldnews). They are ranked by reddit users' votes, and only the top 25 headlines are considered for a single date. (Range: 2008-06-08 to 2016-07-01)
Stock data: Dow Jones Industrial Average (DJIA) is used to "prove the concept". (Range: 2008-08-08 to 2016-07-01)
I provided three data files in .csv format:
RedditNews.csv: two columns. The first column is the "date", and the second column is the "news headlines". All news items are ranked from top to bottom based on how hot they are, so there are 25 lines for each date.
DJIA_table.csv: Downloaded directly from Yahoo Finance: check out the web page for more info.
Combined_News_DJIA.csv: To make things easier for my students, I provide this combined dataset with 27 columns. The first column is "Date", the second is "Label", and the following ones are news headlines ranging from "Top1" to "Top25".
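For text models it is often convenient to collapse the 25 headline columns of the combined file into one document per day. The following is a minimal sketch assuming the column layout described above ("Date", "Label", "Top1" to "Top25"); the function name and the output column `text` are hypothetical.

```python
import pandas as pd


def join_headlines(combined):
    """Collapse the Top1..Top25 headline columns into one text column per date."""
    top_cols = [f"Top{i}" for i in range(1, 26)]
    headlines = combined[top_cols].fillna("").astype(str)
    # Skip empty cells so missing headlines do not leave stray spaces.
    text = headlines.apply(lambda row: " ".join(h for h in row if h), axis=1)
    return pd.DataFrame(
        {"Date": combined["Date"], "Label": combined["Label"], "text": text}
    )
```

The result keeps one row per date, so it can be fed directly to a vectorizer or tokenizer alongside the label.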
https://creativecommons.org/publicdomain/zero/1.0/
Predicting the stock market is one of the most commonly performed projects when someone is learning about ML and Data Science. After all, who wouldn't want to delegate the task of picking stocks to a model and reap the rewards for themselves? However, one of the most difficult and tedious steps to predict what stocks to invest in is actually gathering the data to use. There are so many options and it is important to get sufficient information for each. But, what if you can skip this step and just download a dataset that has all that information easily available for you? Look no further as this is the answer to this problem.
This dataset contains information on 4447 stocks traded under Nasdaq across various exchanges. One file contains information for all 4447 stocks but also has several null fields, which is why I labeled it full_financial_stocks_raw.csv: the values inside its rows have had only minimal modifications. The second file, dividend_stocks_only.csv, is still a raw-ish dataset, but it only contains stocks that pay out dividends to their shareholders. Interestingly, dividend-paying stocks seem to have more information available about them, which explains why this file has significantly fewer rows with null values.
Update: In the next 24 hours, I will be uploading an optimized, feature-engineered dataset that has fewer columns overall and fewer rows with null values. This dataset is intended to be a fully cleaned option to directly feed into ML/DL models.
I would like to thank the sources where I obtained my data, which are the FTP Nasdaq Trader website and the Yahoo Finance API.
Analyzing the stock market is one of the most intriguing endeavors I could think of as the ways it can be influenced are so broad and distinct from one another. A news article can influence how investors view a particular company, social media can directly fluctuate a company's share price, and there are numerous calculations and formulas that can show what stocks are worth investing in.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
ETH-USD price CSV starting from 1 January 2021 through 24 October 2024. The information is shared as is, as provided by the data imported from Yahoo Finance.
Update Frequency: Since new stock market data is generated and made available every day, the dataset will be updated once a month to keep the information current and useful.
Acknowledgements: Yahoo Finance (https://finance.yahoo.com/), Teckgeekz
https://creativecommons.org/publicdomain/zero/1.0/
The dataset comprises the historical stock prices of PT Bank Central Asia Tbk (BBCA.JK) retrieved from the Yahoo Finance website, spanning from January 2019 to the present date. This data provides valuable information for various analytical purposes, such as forecasting future stock prices, implementing machine learning models, and conducting data analysis or visualization tasks. By making this dataset available to the Kaggle community, contributors can explore and utilize it for research, modeling, and educational purposes. The dataset is regularly updated through an automated process scheduled on Kaggle, ensuring its reliability and relevance for ongoing projects and analyses.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
All data compiled from Yahoo Finance
If you have questions, e-mail me: jiunyyen@gmail.com
Happy mining!
http://opendatacommons.org/licenses/dbcl/1.0/
The following information can also be found at https://www.kaggle.com/davidwallach/financial-tweets. Out of curiosity, I cleaned the .csv files to perform a sentiment analysis, so both .csv files in this dataset were created by me.
Everything in the description below was written by David Wallach; using this information, I performed my first-ever sentiment analysis.
"I have been interested in using public sentiment and journalism to gather sentiment profiles on publicly traded companies. I first developed a Python package (https://github.com/dwallach1/Stocker) that scrapes the web for articles written about companies, and then noticed the abundance of overlap with Twitter. I then developed a NodeJS project that I have been running on my RaspberryPi to monitor Twitter for all tweets coming from those mentioned in the content section. If one of them tweeted about a company in the stocks_cleaned.csv file, then it would write the tweet to the database. Currently, the file is only from earlier today, but after about a month or two, I plan to update the tweets.csv file (hopefully closer to 50,000 entries).
I am not quite sure how this dataset will be relevant, but I hope to use these tweets and try to generate some sense of public sentiment score."
This dataset has all the publicly traded companies (tickers and company names) that were used as input to fill the tweets.csv. The influencers whose tweets were monitored were: ['MarketWatch', 'business', 'YahooFinance', 'TechCrunch', 'WSJ', 'Forbes', 'FT', 'TheEconomist', 'nytimes', 'Reuters', 'GerberKawasaki', 'jimcramer', 'TheStreet', 'TheStalwart', 'TruthGundlach', 'Carl_C_Icahn', 'ReformedBroker', 'benbernanke', 'bespokeinvest', 'BespokeCrypto', 'stlouisfed', 'federalreserve', 'GoldmanSachs', 'ianbremmer', 'MorganStanley', 'AswathDamodaran', 'mcuban', 'muddywatersre', 'StockTwits', 'SeanaNSmith'
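The filtering step described above (keep a tweet only if it mentions a listed company) can be sketched as follows. This is an illustration in Python, not the original NodeJS project's logic: the matching rules (a `$TICKER` cashtag or a case-insensitive company-name substring), the function name, and the ticker-to-name dictionary are all assumptions.

```python
import re


def find_mentions(tweet, companies):
    """Return tickers whose cashtag or company name appears in the tweet.

    `companies` maps ticker symbols to company names (hypothetical structure).
    """
    lowered = tweet.lower()
    hits = []
    for ticker, name in companies.items():
        # Cashtag match such as "$TSLA", bounded so "$F" does not match "$FB".
        cashtag = re.search(rf"\${re.escape(ticker)}\b", tweet, flags=re.IGNORECASE)
        if cashtag or name.lower() in lowered:
            hits.append(ticker)
    return hits
```

Plain substring matching on names is crude and will produce false positives for short or common names, so a real pipeline would likely need stricter rules.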
The data used here is gathered from a project I developed : https://github.com/dwallach1/StockerBot
I hope to develop a financial sentiment text classifier that would be able to track Twitter's (and the entire public's) feelings about any publicly traded company (and cryptocurrency).
This dataset contains the price movements of the SOL cryptocurrency over the last four years. The data has been collected through the Yahoo Finance API. The dataset consists of the following columns:
DATE: Date and time the price information pertains to.
OPEN: Opening price on the specified date.
HIGH: Highest price reached on the specified date.
LOW: Lowest price reached on the specified date.
CLOSE: Closing price on the specified date.
VOLUME: Volume of transactions that occurred on the specified date.
This dataset can be utilized to analyze recent price movements of the SOL cryptocurrency, identify trends, and make future price predictions. It can be used for various purposes including financial analysis, training machine learning models, and understanding market trends.
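As a starting point for the trend analysis mentioned above, daily returns and a rolling volatility estimate can be derived from the CLOSE column. This is a minimal sketch: the `RETURN` and `VOLATILITY` column names, the function name, and the 30-day default window are assumptions chosen for illustration.

```python
import pandas as pd


def add_returns(df, window=30):
    """Add simple daily returns and their rolling standard deviation."""
    out = df.sort_values("DATE").copy()
    out["RETURN"] = out["CLOSE"].pct_change()              # day-over-day change
    out["VOLATILITY"] = out["RETURN"].rolling(window).std()  # rolling std of returns
    return out
```

The first row has no return, and the first `window` rows have no volatility value, because there is not yet enough history to compute them.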
https://creativecommons.org/publicdomain/zero/1.0/
The dataset contains the files required for training and testing, split accordingly.
There are two groups of features that you can use for prediction:
Files in the Fundamentals folder are a processed form of the files in the raw folder. Ratios and other values are stretched to match the length of the closing price column, so that, for example, each value in the pe_ratio column is the PE ratio from the most recent quarter; the same applies to every column.
Technical indicators are calculated with the default parameters used in Pandas_TA package.
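The stretching of quarterly fundamentals onto daily rows described above can be approximated with a backward as-of join. This is a sketch of one way to obtain that alignment, not necessarily how the dataset's notebooks do it; the `date` column and the function name are hypothetical, while `pe_ratio` is the column named in the description.

```python
import pandas as pd


def stretch_quarterly(daily, quarterly):
    """Attach the most recent quarterly values to every daily row."""
    return pd.merge_asof(
        daily.sort_values("date"),
        quarterly.sort_values("date"),
        on="date",
        direction="backward",  # use the latest quarter at or before each day
    )
```

Daily rows that precede the first available quarter receive missing values, which avoids leaking future fundamentals into earlier dates.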
Data is collected from finance.yahoo.com and macrotrends.net. The timeframe of the data differs from one ticker to another because some stocks are unavailable for a given time frame on one or the other website.
All code required to collect the data and perform preprocessing and feature engineering to get the data in the given format can be found in the following notebooks:
Column names are meant to be self-explanatory, assuming you are familiar with the stock market. Some acronyms you may encounter: