MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
This dataset captures historical financial market data and macroeconomic indicators spanning over three decades, from 1990 onwards. It is designed for financial analysis, time series forecasting, and exploring relationships between market volatility, stock indices, and macroeconomic factors. This dataset is particularly relevant for researchers, data scientists, and enthusiasts interested in studying: - Volatility forecasting (VIX) - Stock market trends (S&P 500, DJIA, HSI) - Macroeconomic influences on markets (joblessness, interest rates, etc.) - The effect of geopolitical and economic uncertainty (EPU, GPRD)
The data has been aggregated from a mix of historical financial records and publicly available macroeconomic datasets: - VIX (Volatility Index): Chicago Board Options Exchange (CBOE). - Stock Indices (S&P 500, DJIA, HSI): Yahoo Finance and historical financial databases. - Volume Data: Extracted from official exchange reports. - Macroeconomic Indicators: Bureau of Economic Analysis (BEA), Federal Reserve, and other public records. - Uncertainty Metrics (EPU, GPRD): Economic Policy Uncertainty Index and Global Policy Uncertainty Database.
dt
: Date of observation in YYYY-MM-DD format.vix
: VIX (Volatility Index), a measure of expected market volatility.sp500
: S&P 500 index value, a benchmark of the U.S. stock market.sp500_volume
: Daily trading volume for the S&P 500.djia
: Dow Jones Industrial Average (DJIA), another key U.S. market index.djia_volume
: Daily trading volume for the DJIA.hsi
: Hang Seng Index, representing the Hong Kong stock market.ads
: Aruoba-Diebold-Scotti (ADS) Business Conditions Index, reflecting U.S. economic activity.us3m
: U.S. Treasury 3-month bond yield, a short-term interest rate proxy.joblessness
: U.S. unemployment rate, reported as quartiles (1 represents lowest quartile and so on).epu
: Economic Policy Uncertainty Index, quantifying policy-related economic uncertainty.GPRD
: Geopolitical Risk Index (Daily), measuring geopolitical risk levels.prev_day
: Previous day’s S&P 500 closing value, added for lag-based time series analysis.Feel free to use this dataset for academic, research, or personal projects.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The main stock market index of United States, the US500, fell to 6738 points on October 7, 2025, losing 0.04% from the previous session. Over the past month, the index has climbed 3.73% and is up 17.15% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index from United States. United States Stock Market Index - values, historical data, forecasts and news - updated on October of 2025.
Unfortunately, the API this dataset used to pull the stock data isn't free anymore. Instead of having this auto-updating, I dropped the last version of the data files in here, so at least the historic data is still usable.
This dataset provides free end of day data for all stocks currently in the Dow Jones Industrial Average. For each of the 30 components of the index, there is one CSV file named by the stock's symbol (e.g. AAPL for Apple). Each file provides historically adjusted market-wide data (daily, max. 5 years back). See here for description of the columns: https://iextrading.com/developer/docs/#chart
Since this dataset uses remote URLs as files, it is automatically updated daily by the Kaggle platform and automatically represents the latest data.
List of stocks and symbols as per https://en.wikipedia.org/wiki/Dow_Jones_Industrial_Average
Thanks to https://iextrading.com for providing this data for free!
Data provided for free by IEX. View IEX’s Terms of Use.
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains historical daily prices for all tickers currently trading on NASDAQ. The up to date list is available from nasdaqtrader.com. The historic data is retrieved from Yahoo finance via yfinance python package.
It contains prices for up to 01 of April 2020. If you need more up to date data, just fork and re-run data collection script also available from Kaggle.
The date for every symbol is saved in CSV format with common fields:
All that ticker data is then stored in either ETFs or stocks folder, depending on a type. Moreover, each filename is the corresponding ticker symbol. At last, symbols_valid_meta.csv
contains some additional metadata for each ticker such as full name.
https://fred.stlouisfed.org/legal/#copyright-pre-approvalhttps://fred.stlouisfed.org/legal/#copyright-pre-approval
View data of the S&P 500, an index of the stocks of 500 leading companies in the US economy, which provides a gauge of the U.S. equity market.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The main stock market index of United States, the US500, rose to 6745 points on October 6, 2025, gaining 0.43% from the previous session. Over the past month, the index has climbed 3.85% and is up 18.42% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index from United States. United States Stock Market Index - values, historical data, forecasts and news - updated on October of 2025.
https://fred.stlouisfed.org/legal/#copyright-pre-approvalhttps://fred.stlouisfed.org/legal/#copyright-pre-approval
Graph and download economic data for Dow Jones Industrial Average (DJIA) from 2015-10-05 to 2025-10-03 about stock market, average, industry, and USA.
https://dataful.in/terms-and-conditionshttps://dataful.in/terms-and-conditions
The dataset contains All India Daily Foreign Equity Market Index Values – Dow Jones Industrial Average and NASDAQ-100
https://www.kappasignal.com/p/legal-disclaimer.htmlhttps://www.kappasignal.com/p/legal-disclaimer.html
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.
Historical daily stock prices (open, high, low, close, volume)
Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)
Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)
Feature engineering based on financial data and technical indicators
Sentiment analysis data from social media and news articles
Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)
Stock price prediction
Portfolio optimization
Algorithmic trading
Market sentiment analysis
Risk management
Researchers investigating the effectiveness of machine learning in stock market prediction
Analysts developing quantitative trading Buy/Sell strategies
Individuals interested in building their own stock market prediction models
Students learning about machine learning and financial applications
The dataset may include different levels of granularity (e.g., daily, hourly)
Data cleaning and preprocessing are essential before model training
Regular updates are recommended to maintain the accuracy and relevance of the data
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset consists of daily data from US Dow Jones for 501 large companies, over the time span 2010-2016, while monthly publicly available indexes are also used.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Actually, I prepare this dataset for students on my Deep Learning and NLP course.
But I am also very happy to see kagglers play around with it.
Have fun!
Description:
There are two channels of data provided in this dataset:
News data: I crawled historical news headlines from Reddit WorldNews Channel (/r/worldnews). They are ranked by reddit users' votes, and only the top 25 headlines are considered for a single date. (Range: 2008-06-08 to 2016-07-01)
Stock data: Dow Jones Industrial Average (DJIA) is used to "prove the concept". (Range: 2008-08-08 to 2016-07-01)
I provided three data files in .csv format:
RedditNews.csv: two columns The first column is the "date", and second column is the "news headlines". All news are ranked from top to bottom based on how hot they are. Hence, there are 25 lines for each date.
DJIA_table.csv: Downloaded directly from Yahoo Finance: check out the web page for more info.
Combined_News_DJIA.csv: To make things easier for my students, I provide this combined dataset with 27 columns. The first column is "Date", the second is "Label", and the following ones are news headlines ranging from "Top1" to "Top25".
=========================================
To my students:
I made this a binary classification task. Hence, there are only two labels:
"1" when DJIA Adj Close value rose or stayed as the same;
"0" when DJIA Adj Close value decreased.
For task evaluation, please use data from 2008-08-08 to 2014-12-31 as Training Set, and Test Set is then the following two years data (from 2015-01-02 to 2016-07-01). This is roughly a 80%/20% split.
And, of course, use AUC as the evaluation metric.
=========================================
+++++++++++++++++++++++++++++++++++++++++
To all kagglers:
Please upvote this dataset if you like this idea for market prediction.
If you think you coded an amazing trading algorithm,
friendly advice
do play safe with your own money :)
+++++++++++++++++++++++++++++++++++++++++
Feel free to contact me if there is any question~
And, remember me when you become a millionaire :P
Note: If you'd like to cite this dataset in your publications, please use:
Sun, J. (2016, August). Daily News for Stock Market Prediction, Version 1. Retrieved [Date You Retrieved This Data] from https://www.kaggle.com/aaron7sun/stocknews.
https://www.kappasignal.com/p/legal-disclaimer.htmlhttps://www.kappasignal.com/p/legal-disclaimer.html
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.
Historical daily stock prices (open, high, low, close, volume)
Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)
Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)
Feature engineering based on financial data and technical indicators
Sentiment analysis data from social media and news articles
Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)
Stock price prediction
Portfolio optimization
Algorithmic trading
Market sentiment analysis
Risk management
Researchers investigating the effectiveness of machine learning in stock market prediction
Analysts developing quantitative trading Buy/Sell strategies
Individuals interested in building their own stock market prediction models
Students learning about machine learning and financial applications
The dataset may include different levels of granularity (e.g., daily, hourly)
Data cleaning and preprocessing are essential before model training
Regular updates are recommended to maintain the accuracy and relevance of the data
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Experimental studies in the area of Psychology and Behavioral Economics have suggested that people change their search pattern in response to positive and negative events. Using Internet search data provided by Google, we investigated the relationship between stock-specific events and related Google searches. We studied daily data from 13 stocks from the Dow-Jones and NASDAQ100 indices, over a period of 4 trading years. Focusing on periods in which stocks were extensively searched (Intensive Search Periods), we found a correlation between the magnitude of stock returns at the beginning of the period and the volume, peak, and duration of search generated during the period. This relation between magnitudes of stock returns and subsequent searches was considerably magnified in periods following negative stock returns. Yet, we did not find that intensive search periods following losses were associated with more Google searches than periods following gains. Thus, rather than increasing search, losses improved the fit between people’s search behavior and the extent of real-world events triggering the search. The findings demonstrate the robustness of the attentional effect of losses.
There's a story behind every dataset and here's your opportunity to share yours.
This is a collection of daily (OHLC, Adj Close and Volume) data for the Dow Jones Companies from 1st January 2000 to Dec 6th 2017.
The Dow Jones is an index that shows how 30 large publicly owned companies based in the United States have traded during a standard trading session in the stock market. It is the second-oldest U.S. market index after the Dow Jones Transportation Average, which was also created by Dow.
Yahoo Finance
Possible exploration: Stock Correlation over the last 17 years
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Index Time Series for Dow Jones Islamic Market China A 100 Index. The frequency of the observation is daily. Moving average series are also typically included.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
China's main stock market index, the SHANGHAI, rose to 3883 points on September 30, 2025, gaining 0.52% from the previous session. Over the past month, the index has climbed 0.19% and is up 11.26% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index from China. China Shanghai Composite Stock Market Index - values, historical data, forecasts and news - updated on October of 2025.
https://www.kappasignal.com/p/legal-disclaimer.htmlhttps://www.kappasignal.com/p/legal-disclaimer.html
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.
Historical daily stock prices (open, high, low, close, volume)
Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)
Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)
Feature engineering based on financial data and technical indicators
Sentiment analysis data from social media and news articles
Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)
Stock price prediction
Portfolio optimization
Algorithmic trading
Market sentiment analysis
Risk management
Researchers investigating the effectiveness of machine learning in stock market prediction
Analysts developing quantitative trading Buy/Sell strategies
Individuals interested in building their own stock market prediction models
Students learning about machine learning and financial applications
The dataset may include different levels of granularity (e.g., daily, hourly)
Data cleaning and preprocessing are essential before model training
Regular updates are recommended to maintain the accuracy and relevance of the data
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The main stock market index of United States, the US500, fell to 6737 points on October 7, 2025, losing 0.05% from the previous session. Over the past month, the index has climbed 3.72% and is up 17.14% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index from United States. United States Stock Market Index - values, historical data, forecasts and news - updated on October of 2025.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains several daily features of S&P 500, NASDAQ Composite, Dow Jones Industrial Average, RUSSELL 2000, and NYSE Composite from 2010 to 2017. It covers features from various categories of technical indicators, futures contracts, price of commodities, important indices of markets around the world, price of major companies in the U.S. market, and treasury bill rates. Sources and thorough description of features have been mentioned in the paper of "CNNpred: CNN-based stock market prediction using a diverse set of variables".
https://www.kappasignal.com/p/legal-disclaimer.htmlhttps://www.kappasignal.com/p/legal-disclaimer.html
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.
Historical daily stock prices (open, high, low, close, volume)
Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)
Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)
Feature engineering based on financial data and technical indicators
Sentiment analysis data from social media and news articles
Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)
Stock price prediction
Portfolio optimization
Algorithmic trading
Market sentiment analysis
Risk management
Researchers investigating the effectiveness of machine learning in stock market prediction
Analysts developing quantitative trading Buy/Sell strategies
Individuals interested in building their own stock market prediction models
Students learning about machine learning and financial applications
The dataset may include different levels of granularity (e.g., daily, hourly)
Data cleaning and preprocessing are essential before model training
Regular updates are recommended to maintain the accuracy and relevance of the data
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
This dataset captures historical financial market data and macroeconomic indicators spanning over three decades, from 1990 onwards. It is designed for financial analysis, time series forecasting, and exploring relationships between market volatility, stock indices, and macroeconomic factors. This dataset is particularly relevant for researchers, data scientists, and enthusiasts interested in studying: - Volatility forecasting (VIX) - Stock market trends (S&P 500, DJIA, HSI) - Macroeconomic influences on markets (joblessness, interest rates, etc.) - The effect of geopolitical and economic uncertainty (EPU, GPRD)
The data has been aggregated from a mix of historical financial records and publicly available macroeconomic datasets: - VIX (Volatility Index): Chicago Board Options Exchange (CBOE). - Stock Indices (S&P 500, DJIA, HSI): Yahoo Finance and historical financial databases. - Volume Data: Extracted from official exchange reports. - Macroeconomic Indicators: Bureau of Economic Analysis (BEA), Federal Reserve, and other public records. - Uncertainty Metrics (EPU, GPRD): Economic Policy Uncertainty Index and Global Policy Uncertainty Database.
dt
: Date of observation in YYYY-MM-DD format.vix
: VIX (Volatility Index), a measure of expected market volatility.sp500
: S&P 500 index value, a benchmark of the U.S. stock market.sp500_volume
: Daily trading volume for the S&P 500.djia
: Dow Jones Industrial Average (DJIA), another key U.S. market index.djia_volume
: Daily trading volume for the DJIA.hsi
: Hang Seng Index, representing the Hong Kong stock market.ads
: Aruoba-Diebold-Scotti (ADS) Business Conditions Index, reflecting U.S. economic activity.us3m
: U.S. Treasury 3-month bond yield, a short-term interest rate proxy.joblessness
: U.S. unemployment rate, reported as quartiles (1 represents lowest quartile and so on).epu
: Economic Policy Uncertainty Index, quantifying policy-related economic uncertainty.GPRD
: Geopolitical Risk Index (Daily), measuring geopolitical risk levels.prev_day
: Previous day’s S&P 500 closing value, added for lag-based time series analysis.Feel free to use this dataset for academic, research, or personal projects.