https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains historical daily prices for all tickers currently trading on NASDAQ. The up-to-date list is available from nasdaqtrader.com. The historical data is retrieved from Yahoo Finance via the yfinance Python package.
It contains prices up to April 1, 2020. If you need more recent data, fork and re-run the data collection script, which is also available on Kaggle.
The data for every symbol is saved in CSV format with a common set of fields.
All ticker data is stored in either the ETFs or the stocks folder, depending on the type, and each filename is the corresponding ticker symbol. Finally, symbols_valid_meta.csv contains additional metadata for each ticker, such as its full name.
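The folder layout described above can be navigated with a few lines of Python. The following is only a minimal sketch: the lowercase folder names `stocks` and `etfs` and the helper names `find_ticker_file` and `load_prices` are assumptions made for illustration, not part of the dataset or its collection script.

```python
import csv
from pathlib import Path

# Assumed folder names; the description only says "ETFs or stocks folder".
FOLDERS = ("stocks", "etfs")


def find_ticker_file(root, symbol):
    """Return <root>/<folder>/<symbol>.csv for the first folder that contains it."""
    for folder in FOLDERS:
        candidate = Path(root) / folder / f"{symbol}.csv"
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no CSV found for ticker {symbol!r} under {root}")


def load_prices(root, symbol):
    """Read one ticker file into a list of dicts keyed by the CSV header row."""
    with open(find_ticker_file(root, symbol), newline="") as handle:
        return list(csv.DictReader(handle))
```

Calling `load_prices("path/to/dataset", "AAPL")` would return one dict per trading day. All values come back as strings, so numeric fields need converting before analysis.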
http://www.gnu.org/licenses/lgpl-3.0.html
The National Stock Exchange of India Ltd. (NSE) is an Indian stock exchange located in Mumbai, Maharashtra, India. It was established in 1992 as a demutualized electronic exchange, promoted by leading financial institutions at the request of the Government of India. It is India’s largest exchange by turnover. In 1994, it launched electronic screen-based trading. It went on to launch index futures and internet trading in 2000, which were the first of their kind in the country.
With the help of NSE, you can trade in the following segments:
Equities
Indices
Mutual Funds
Exchange Traded Funds
Initial Public Offerings
Security Lending and Borrowing Scheme
Companies that complete a successful IPO have their stocks traded on various stock exchange platforms, and NSE is one important platform in India. Thousands of companies trade their stocks on NSE, but I have chosen two popular, highly rated Indian IT service companies, TCS and INFOSYS. The third series is the benchmark for Indian IT companies, NIFTY_IT_INDEX.
The dataset contains three CSV files, corresponding to INFOSYS, NIFTY_IT_INDEX, and TCS, respectively. Each can be identified by the name of its CSV file.
Timeline of data recording: 1-1-2015 to 31-12-2015.
Source of data: official NSE website.
Method: I used the NSEpy API to fetch the data from the NSE site. I have also described my approach in the kernel "WebScraper to download data for NSE". Please go through it to better understand the nature of this dataset.
Shape of the files (rows x columns): INFOSYS - 248 x 15 || NIFTY_IT_INDEX - 248 x 7 || TCS - 248 x 15
Column Descriptors:
Date: date on which the data is recorded
Symbol: NSE symbol of the stock
Series: series of the stock (EQ - Equity). The series used on NSE are:
EQ: It stands for Equity. In this series intraday trading is possible in addition to delivery.
BE: It stands for Book Entry. Shares falling in the Trade-to-Trade or T-segment are traded in this series and no intraday trading is allowed. This means trades can only be settled by accepting or giving delivery of shares.
BL: This series is for facilitating block deals. A block deal is a trade with a minimum quantity of 5 lakh shares or a minimum value of Rs. 5 crore, executed through a single transaction on the special “Block Deal window”. The window is open for only 35 minutes in the morning, from 9:15 to 9:50 AM.
BT: This series provides an exit route to small investors holding shares in physical form, with a cap of 500 shares.
GC: This series allows Government Securities and Treasury Bills to be traded.
IL: This series allows only FIIs to trade among themselves. It is permissible only in those securities where the maximum permissible limit for FIIs is not breached.
Prev Close: previous day's closing value
Open: opening value for the current day
High: highest value for the current day
Low: lowest value for the current day
Last: the final quoted trading price for a particular stock, or stock-market index, during the most recent day of trading
Close: closing value for the current day
VWAP: volume-weighted average price, the ratio of the value traded to the total volume traded over a particular time horizon
Volume: the amount of a security that was traded during a given period of time. For every buyer there is a seller, and each transaction contributes to the count of total volume.
Turnover: total turnover of the stock for that day
Trades: number of trades (buys or sells) of the stock
Deliverable Volume: the quantity of shares that actually move from one set of people (who had those shares in their demat account before today and are selling today) to another set of people (who have purchased those shares and will get them in their demat account by T+2 days)
%Deliverble: percentage of the stock's traded quantity that was deliverable
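As a rough illustration of the VWAP and deliverable-percentage definitions above, the sketch below computes both from raw numbers. The function names and the `(price, quantity)` input shape are hypothetical choices for this example and are not taken from NSEpy or from the CSV files.

```python
def vwap(trades):
    """Volume-weighted average price: total value traded / total volume traded.

    `trades` is an iterable of (price, quantity) pairs. This input shape is a
    hypothetical choice for the example, not the dataset's layout.
    """
    trades = list(trades)
    total_volume = sum(quantity for _, quantity in trades)
    if total_volume == 0:
        raise ValueError("VWAP is undefined when no volume was traded")
    total_value = sum(price * quantity for price, quantity in trades)
    return total_value / total_volume


def deliverable_percentage(deliverable_volume, volume):
    """Share of the traded volume that was actually delivered, in percent."""
    if volume == 0:
        raise ValueError("percentage is undefined when no volume was traded")
    return 100.0 * deliverable_volume / volume
```

For example, `vwap([(100.0, 10), (110.0, 30)])` weights the second price three times as heavily as the first, because three times as many shares changed hands at that price.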
I would like to express my sincere thanks to the people behind the NSEpy API, and in particular SWAPNIL JARIWALA, who also maintains an excellent open-source GitHub repo for this API.
I have also built a starter kernel for this dataset.
I am so excited to see your magical approaches for the same dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Hascol
Unfortunately, the API this dataset used to pull the stock data isn't free anymore. Instead of having this auto-updating, I dropped the last version of the data files in here, so at least the historic data is still usable.
This dataset provides free end-of-day data for all stocks currently in the Dow Jones Industrial Average. For each of the 30 components of the index, there is one CSV file named after the stock's symbol (e.g. AAPL for Apple). Each file provides historically adjusted market-wide data (daily, max. 5 years back). For a description of the columns, see: https://iextrading.com/developer/docs/#chart
Because this dataset originally used remote URLs as files, it was automatically updated daily by the Kaggle platform and always represented the latest data; as noted above, that is no longer the case.
List of stocks and symbols as per https://en.wikipedia.org/wiki/Dow_Jones_Industrial_Average
Thanks to https://iextrading.com for providing this data for free!
Data provided for free by IEX. View IEX’s Terms of Use.
https://fred.stlouisfed.org/legal/#copyright-pre-approval
View data of the S&P 500, an index of the stocks of 500 leading companies in the US economy, which provides a gauge of the U.S. equity market.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset presents an extensive record of daily historical stock prices for Tesla, Inc. (TSLA), one of the world’s most innovative and closely watched electric vehicle and clean energy companies. The data was sourced from Yahoo Finance, a widely used and trusted provider of financial market data, and covers a significant period spanning from Tesla’s initial public offering (IPO) to the most recent date available at the time of extraction.
The dataset includes critical trading metrics for each market day, such as the opening price, highest and lowest prices of the day, closing price, adjusted closing price (accounting for dividends and splits), and total trading volume. This rich dataset supports a variety of use cases, including financial market analysis, investment research, time series forecasting, development and backtesting of trading algorithms, and educational projects in data science and finance.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The dataset contains historical technical data from the Dhaka Stock Exchange (DSE). The data was collected from different publicly available sources on the internet. It is provided for information and research purposes; to the best of our knowledge it does not contain any mistakes, but some may remain. Using this dataset for portfolio management is not encouraged, and any use is at your own discretion. The contributors do not accept any liability for any use of the data.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Tracking United HealthCare Stock Performance Since IPO
This dataset provides historical stock data for UnitedHealth Group (UHG), one of the largest healthcare and insurance companies in the world. It covers stock prices, market capitalization, and trading volumes from the company's IPO to the present. As a Fortune 500 company with a significant market presence, analyzing UHG's stock performance can provide valuable insights into healthcare market trends, investment opportunities, and economic indicators.
This dataset is useful for:
CC0 (Public Domain) – This dataset is freely available for public and commercial use.
Get Nasdaq real-time and historical data with support for fast market replay at over 19 million book updates per second. Test our data for free with only 4 lines of code.
Nasdaq TotalView-ITCH is a proprietary data feed that disseminates full order book depth and last sale data from the Nasdaq stock market (XNAS). It delivers every quote and order at each price level, along with any event that updates the order book after an order is placed, such as trade executions, modifications, or cancellations. Nasdaq is the most active US equity exchange by volume and represented 13.03% of the average daily volume (ADV) as of January 2025.
With its L3 granularity, Nasdaq TotalView-ITCH captures information beyond the L1, top-of-book data available through SIP feeds and enables more accurate modeling of book imbalances, trade directionality, quote lifetimes, and more. This includes explicit trade aggressor side, odd lots, auction imbalance data, and the Net Order Imbalance Indicator (NOII) for the Nasdaq Opening and Closing Crosses and Nasdaq IPO/Halt Cross—the best predictor of Nasdaq opening and closing prices available. Other key advantages of Nasdaq TotalView-ITCH over SIP data include faster real-time dissemination and precise exchange-side timestamping directly from Nasdaq.
Real-time Nasdaq TotalView-ITCH data is included with a Plus or Unlimited subscription through our Databento US Equities service. Historical data is available for usage-based rates or with any subscription. Visit our pricing page for more details or to upgrade your plan.
Breadth of coverage: 20,329 products
Asset class(es): Equities
Origin: Directly captured at Equinix NY4 (Secaucus, NJ) with an FPGA-based network card and hardware timestamping. Synchronized to UTC with PTP.
Supported data encodings: DBN, CSV, JSON
Supported market data schemas: MBO, MBP-1, MBP-10, BBO-1s, BBO-1m, TBBO, Trades, OHLCV-1s, OHLCV-1m, OHLCV-1h, OHLCV-1d, Definition, Statistics, Status, Imbalance
Resolution: Immediate publication, nanosecond-resolution timestamps
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
https://creativecommons.org/publicdomain/zero/1.0/
Experience a decade of NASDAQ market dynamics with this comprehensive historical price dataset from 2014 to 2024.
The NASDAQ Composite is a benchmark index representing the performance of more than 2,500 stocks listed on the NASDAQ stock exchange, encompassing various sectors including technology, healthcare, and finance. This dataset, sourced meticulously from Yahoo Finance, offers daily insights into the index's opening, highest, lowest, and closing prices, along with adjusted close prices and daily volume.
Enhancing Financial Market Predictions: Causality-Driven Feature Selection
This paper introduces the FinSen dataset, which revolutionizes financial market analysis by integrating economic and financial news articles from 197 countries with stock market data. The dataset’s extensive coverage spans 15 years from 2007 to 2023 with temporal information, offering a rich, global perspective with 160,000 records of financial market news. Our study leverages causally validated sentiment scores and LSTM models to enhance market forecast accuracy and reliability.
Our FinSen Dataset
This repository contains the dataset for Enhancing Financial Market Predictions: Causality-Driven Feature Selection, which has been accepted at ADMA 2024.
If the dataset or the paper has been useful in your research, please add a citation to our work:
@article{liang2024enhancing,
  title={Enhancing Financial Market Predictions: Causality-Driven Feature Selection},
  author={Liang, Wenhao and Li, Zhengyang and Chen, Weitong},
  journal={arXiv e-prints},
  pages={arXiv--2408},
  year={2024}
}
The FinSen dataset can be downloaded manually from the repository as a CSV file. Sentiment labels and scores are generated by the FinBERT model from the Hugging Face Transformers library under the identifier "ProsusAI/finbert" (Araci, Dogu. "Finbert: Financial sentiment analysis with pre-trained language models." arXiv preprint arXiv:1908.10063 (2019)).
We provide only the US data for research use; please contact w.liang@adelaide.edu.au for the other countries (197 in total) if necessary.
We also provide other NLP datasets for text classification tasks; if you use them in your research, please cite them accordingly.
20Newsgroups. Joachims, T., et al.: A probabilistic analysis of the rocchio algorithm with tfidf for text categorization. In: ICML. vol. 97, pp. 143–151. Citeseer (1997)
AG News. Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text classification. Advances in neural information processing systems 28 (2015)
Financial PhraseBank. Malo, P., Sinha, A., Korhonen, P., Wallenius, J., Takala, P.: Good debt or bad debt: Detecting semantic orientations in economic texts. Journal of the Association for Information Science and Technology 65(4), 782–796 (2014)
Dataloader for FinSen
We provide the preprocessing file finsen.py for our FinSen dataset under the dataloaders directory for more convenient usage.
Models - Text Classification
DAN-3.
Global Pooling CNN.
Models - Regression Prediction
LSTM
Using Sentiment Score from FinSen Predict Result on S&P500
Dependencies
The code is based on PyTorch and uses the code framework of https://github.com/torrvision/focal_calibration; please cite their work if you find it useful.
☺ Happy Research!
This data was downloaded from the official Bombay Stock Exchange (BSE) website. The file contains the last 10 years of historical stock prices (by security and period).
Security Name: Nestle India Ltd.
Period: Daily
Start Date: 2nd January 2012
End Date: 21st April 2022
This dataset is well suited to supervised regression tasks in machine learning. You can perform simple as well as multiple linear regression on it.
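To make the simple linear regression suggestion concrete, here is a minimal ordinary least squares sketch in plain Python. It is only an illustration: the function names are hypothetical, and which columns to use as `xs` and `ys` (for instance the opening and closing price) is an assumption left to the reader, since the file's column names are not listed here.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept
```

Multiple linear regression follows the same least squares idea with several input columns, but is usually done with a numerical library rather than by hand.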
https://fred.stlouisfed.org/legal/#copyright-citation-required
Graph and download economic data for CBOE Volatility Index: VIX (VIXCLS) from 1990-01-02 to 2025-07-10 about VIX, volatility, stock market, and USA.
Browse Advanced Micro Devices Inc (AMD) market data. Get instant pricing estimates and make batch downloads of binary, CSV, and JSON flat files.
Consolidated last sale, exchange BBO and national BBO across all US equity options exchanges. Includes single name stock options (e.g. TSLA), options on ETFs (e.g. SPY, QQQ), index options (e.g. VIX), and some indices (e.g. SPIKE and VSPKE). This dataset is based on the newer, binary OPRA feed after the migration to SIAC's OPRA Pillar SIP in 2021. OPRA is notable for the size of its data, and we recommend that users anticipate several TBs of data per day for the full dataset at its highest granularity (MBP-1).
Origin: Options Price Reporting Authority
Supported data encodings: DBN, JSON, CSV
Supported market data schemas: MBP-1, OHLCV-1s, OHLCV-1m, OHLCV-1h, OHLCV-1d, TBBO, Trades, Statistics, Definition
Resolution: Immediate publication, nanosecond-resolution timestamps
https://crawlfeeds.com/privacy_policy
Unlock the power of online marketplace analytics with our comprehensive eBay products dataset. This premium collection contains 1.29 million products from eBay's global marketplace, providing extensive insights into one of the world's largest e-commerce platforms. Perfect for competitive analysis, pricing strategies, market research, and machine learning applications in e-commerce.
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
S&P 500 index data including level, dividend, earnings and P/E ratio on a monthly basis since 1870. The S&P 500 (Standard and Poor's 500) is a free-float, capitalization-weighted index of the top ...
https://creativecommons.org/publicdomain/zero/1.0/
Predicting the stock market is one of the most commonly performed projects when someone is learning about ML and Data Science. After all, who wouldn't want to delegate the task of picking stocks to a model and reap the rewards for themselves? However, one of the most difficult and tedious steps to predict what stocks to invest in is actually gathering the data to use. There are so many options and it is important to get sufficient information for each. But, what if you can skip this step and just download a dataset that has all that information easily available for you? Look no further as this is the answer to this problem.
This dataset contains information of 4447 stocks traded under Nasdaq across various exchanges. There is a file that contains information for all 4447 stocks but also has several null fields, which is why I labeled it as full_financial_stocks_raw.csv --it has minimal modifications to the values inside the rows. The second file, dividend_stocks_only.csv, is still a raw-ish style dataset but it only contains stocks that pay out dividends to its shareholders. Interestingly, it seems dividend-paying stocks have more information about them, which explains why this file has significantly fewer rows with null values.
Update: In the next 24 hours, I will be uploading an optimized, feature-engineered dataset that has fewer columns overall and fewer rows with null values. This dataset is intended to be a fully cleaned option to directly feed into ML/DL models.
I would like to thank the sources where I obtained my data, which are the FTP Nasdaq Trader website and the Yahoo Finance API.
Analyzing the stock market is one of the most intriguing endeavors I could think of as the ways it can be influenced are so broad and distinct from one another. A news article can influence how investors view a particular company, social media can directly fluctuate a company's share price, and there are numerous calculations and formulas that can show what stocks are worth investing in.
https://crawlfeeds.com/privacy_policy
Gain access to a structured dataset featuring thousands of products listed on Amazon India. This dataset is ideal for e-commerce analytics, competitor research, pricing strategies, and market trend analysis.
Product Details: Name, Brand, Category, and Unique ID
Pricing Information: Current Price, Discounted Price, and Currency
Availability & Ratings: Stock Status, Customer Ratings, and Reviews
Seller Information: Seller Name and Fulfillment Details
Additional Attributes: Product Description, Specifications, and Images
Format: CSV
Number of Records: 50,000+
Delivery Time: 3 Days
Price: $149.00
Availability: Immediate
This dataset provides structured and actionable insights to support e-commerce businesses, pricing strategies, and product optimization. If you're looking for more datasets for e-commerce analysis, explore our E-commerce datasets for a broader selection.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
This dataset contains historical data for stocks listed on the IHSG at minute, hourly, and daily intervals. The data is taken from Yahoo Finance's public data and the IDX website, as listed in the metadata tab. This dataset was created for academic research purposes and is not to be commercialized. If you have questions about the dataset, please ask in the discussion tab. Code snippet: https://github.com/muamkh/IHSGstockscraper
Minute data covers 1 November 2021 to 6 January 2023. Hourly data covers 16 April 2020 to 6 January 2023. Daily data covers 16 April 2001 to 6 January 2023. All of the data is in CSV format. The stock data is not adjusted for dividends, stock splits, or other corporate actions.
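Because the prices are unadjusted, a stock split shows up as an abrupt jump in the series. The sketch below shows one common way to back-adjust for a split; it is an illustration under stated assumptions, not part of the linked scraper. The row keys `date`, `close` and `volume`, the ISO date format, and the function name are all hypothetical.

```python
def adjust_for_split(rows, split_date, ratio):
    """Back-adjust bars before `split_date` for a `ratio`-for-1 stock split.

    `rows` is a list of dicts with hypothetical keys "date" (ISO string such
    as "2022-05-17"), "close" and "volume". Prices before the split are
    divided by `ratio` and volumes multiplied by it, so the series is
    comparable across the split date. The input rows are not modified.
    """
    if ratio <= 0:
        raise ValueError("split ratio must be positive")
    adjusted = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's data stays untouched
        if row["date"] < split_date:  # ISO dates sort correctly as strings
            new_row["close"] = row["close"] / ratio
            new_row["volume"] = row["volume"] * ratio
        adjusted.append(new_row)
    return adjusted
```

Dividend adjustments work on a similar principle but use a price factor derived from the dividend amount, which this sketch does not cover.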