MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset captures historical financial market data and macroeconomic indicators spanning over three decades, from 1990 onwards. It is designed for financial analysis, time series forecasting, and exploring relationships between market volatility, stock indices, and macroeconomic factors. This dataset is particularly relevant for researchers, data scientists, and enthusiasts interested in studying:
- Volatility forecasting (VIX)
- Stock market trends (S&P 500, DJIA, HSI)
- Macroeconomic influences on markets (joblessness, interest rates, etc.)
- The effect of geopolitical and economic uncertainty (EPU, GPRD)

The data has been aggregated from a mix of historical financial records and publicly available macroeconomic datasets:
- VIX (Volatility Index): Chicago Board Options Exchange (CBOE).
- Stock Indices (S&P 500, DJIA, HSI): Yahoo Finance and historical financial databases.
- Volume Data: Extracted from official exchange reports.
- Macroeconomic Indicators: Bureau of Economic Analysis (BEA), Federal Reserve, and other public records.
- Uncertainty Metrics (EPU, GPRD): Economic Policy Uncertainty Index and Global Policy Uncertainty Database.
- dt: Date of observation in YYYY-MM-DD format.
- vix: VIX (Volatility Index), a measure of expected market volatility.
- sp500: S&P 500 index value, a benchmark of the U.S. stock market.
- sp500_volume: Daily trading volume for the S&P 500.
- djia: Dow Jones Industrial Average (DJIA), another key U.S. market index.
- djia_volume: Daily trading volume for the DJIA.
- hsi: Hang Seng Index, representing the Hong Kong stock market.
- ads: Aruoba-Diebold-Scotti (ADS) Business Conditions Index, reflecting U.S. economic activity.
- us3m: U.S. Treasury 3-month bond yield, a short-term interest rate proxy.
- joblessness: U.S. unemployment rate, reported as quartiles (1 represents lowest quartile and so on).
- epu: Economic Policy Uncertainty Index, quantifying policy-related economic uncertainty.
- GPRD: Geopolitical Risk Index (Daily), measuring geopolitical risk levels.
- prev_day: Previous day’s S&P 500 closing value, added for lag-based time series analysis.

Feel free to use this dataset for academic, research, or personal projects.
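A minimal sketch of how a lag column such as prev_day could be derived from the sp500 column, assuming the rows are already sorted by dt in ascending order. The function name `add_prev_day` and the sample values are illustrative assumptions and are not taken from the dataset.

```python
def add_prev_day(rows, value_key="sp500", lag_key="prev_day"):
    """Return copies of rows with the previous row's value stored under lag_key.

    Assumes rows are sorted by date, oldest first. The first row has no
    predecessor, so its lag value is None.
    """
    lagged = []
    previous = None
    for row in rows:
        new_row = dict(row)          # copy so the input rows are not mutated
        new_row[lag_key] = previous
        previous = row[value_key]
        lagged.append(new_row)
    return lagged


# Made-up values for illustration only; they are not real index levels.
sample = [
    {"dt": "2020-01-02", "sp500": 100.0},
    {"dt": "2020-01-03", "sp500": 101.5},
    {"dt": "2020-01-06", "sp500": 99.0},
]
lagged_sample = add_prev_day(sample)
```

Because the lag follows row order rather than calendar days, a Monday row would carry the preceding Friday's value under this sketch.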
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
Sentinel Hub NBR description: To detect burned areas, the NBR-RAW index is the most appropriate choice. Using bands 8 and 12 it highlights burnt areas in large fire zones greater than 500 acres. To observe burn severity, you may subtract the post-fire NBR image from the pre-fire NBR image. Darker pixels indicate burned areas.
NBR = (NIR – SWIR) / (NIR + SWIR)
Sentinel-2 NBR = (B08 - B12) / (B08 + B12)
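The two formulas above, together with the pre-fire minus post-fire difference described earlier, can be sketched for a single pixel as follows. This is an illustration under stated assumptions: the function names `nbr` and `burn_severity` and the reflectance values are hypothetical, and a real workflow would apply the same arithmetic to whole raster bands.

```python
import math


def nbr(nir, swir):
    """Normalised Burn Ratio for one pixel: (NIR - SWIR) / (NIR + SWIR)."""
    total = nir + swir
    if total == 0:
        return math.nan  # the ratio is undefined when both bands are zero
    return (nir - swir) / total


def burn_severity(pre_nir, pre_swir, post_nir, post_swir):
    """Pre-fire NBR minus post-fire NBR; larger values suggest more severe burn."""
    return nbr(pre_nir, pre_swir) - nbr(post_nir, post_swir)


# Hypothetical reflectances (B08 as NIR, B12 as SWIR), for illustration only.
pre_fire = nbr(0.4, 0.1)
post_fire = nbr(0.2, 0.3)
severity = burn_severity(0.4, 0.1, 0.2, 0.3)
```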
These data have been created by the Joint Nature Conservation Committee (JNCC) as part of a Defra Natural Capital & Ecosystem Assessment (NCEA) project to produce a regional, and ultimately national, system for detecting a change in habitat condition at a land parcel level. The first stage of the project is focused on Yorkshire, UK, and therefore the dataset includes granules and scenes covering Yorkshire and surrounding areas only. The dataset contains the following indices derived from Defra and JNCC Sentinel-2 Analysis Ready Data.
NDVI, NDMI, NDWI, NBR, and EVI files are generated for the following Sentinel-2 granules: • T30UWE • T30UXF • T30UWF • T30UXE • T31UCV • T30UYE • T31UCA
As the project continues, JNCC will expand the geographical coverage of this dataset and will provide continuous updates as ARD becomes available.
https://www.kappasignal.com/p/legal-disclaimer.html
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.
Historical daily stock prices (open, high, low, close, volume)
Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)
Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)
Feature engineering based on financial data and technical indicators
Sentiment analysis data from social media and news articles
Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)
Stock price prediction
Portfolio optimization
Algorithmic trading
Market sentiment analysis
Risk management
Researchers investigating the effectiveness of machine learning in stock market prediction
Analysts developing quantitative trading Buy/Sell strategies
Individuals interested in building their own stock market prediction models
Students learning about machine learning and financial applications
The dataset may include different levels of granularity (e.g., daily, hourly)
Data cleaning and preprocessing are essential before model training
Regular updates are recommended to maintain the accuracy and relevance of the data
https://creativecommons.org/publicdomain/zero/1.0/
Disclaimer!!! Data uploaded here are collected from the internet. The sole purposes of uploading these data are to provide this Kaggle community with a good source of data for analysis and research. I don't own these datasets and am also not responsible for them legally by any means. I am not charging anything (either monetary or any favor) for this dataset.
For the first time, Nifty 50 stocks data and two indices data, along with 55 technical indicators used by Market experts are calculated and made available. Kindly download the data and make sure to share your code in public and if you like this data, do upvote. Thank you.
The NIFTY 50 index is a well-diversified 50 companies index reflecting overall market conditions. NIFTY 50 Index is computed using the free float market capitalization method.
NIFTY 50 can be used for a variety of purposes such as benchmarking fund portfolios, launching of index funds, ETFs and structured products.
This dataset contains historical daily prices for Nifty 100 stocks and indices currently trading on the Indian Stock Market.
- Data samples are at 5-minute intervals, and data is available from Jan 2015 to Feb 2022.
- Along with OHLCV (Open, High, Low, Close, and Volume) data, we have created 55 technical indicators.
- More details about these technical indicators are provided in the Data description file.

The whole dataset is around 33 GB (compressed here to 13 GB), and 100 stocks (Nifty 100 stocks) and 2 indices (Nifty 50 and Nifty Bank indices) are present in this dataset. Each file contains:
- OHLCV (Open, High, Low, Close, and Volume) data
- 55 technical indicator values
Stock Names
| ACC | ADANIENT | ADANIGREEN | ADANIPORTS | AMBUJACEM |
| -- | -- | -- | -- | -- |
| APOLLOHOSP | ASIANPAINT | AUROPHARMA | AXISBANK | BAJAJ-AUTO |
| BAJAJFINSV | BAJAJHLDNG | BAJFINANCE | BANDHANBNK | BANKBARODA |
| BERGEPAINT | BHARTIARTL | BIOCON | BOSCHLTD | BPCL |
| BRITANNIA | CADILAHC | CHOLAFIN | CIPLA | COALINDIA |
| COLPAL | DABUR | DIVISLAB | DLF | DMART |
| DRREDDY | EICHERMOT | GAIL | GLAND | GODREJCP |
| GRASIM | HAVELLS | HCLTECH | HDFC | HDFCAMC |
| HDFCBANK | HDFCLIFE | HEROMOTOCO | HINDALCO | HINDPETRO |
| HINDUNILVR | ICICIBANK | ICICIGI | ICICIPRULI | IGL |
| INDIGO | INDUSINDBK | INDUSTOWER | INFY | IOC |
| ITC | JINDALSTEL | JSWSTEEL | JUBLFOOD | KOTAKBANK |
| LICI | LT | LTI | LUPIN | M&M |
| MARICO | MARUTI | MCDOWELL-N | MUTHOOTFIN | NAUKRI |
| NESTLEIND | NIFTY 50 | NIFTY BANK | NMDC | NTPC |
| ONGC | PEL | PGHH | PIDILITIND | PIIND |
| PNB | POWERGRID | RELIANCE | SAIL | SBICARD |
| SBILIFE | SBIN | SHREECEM | SIEMENS | SUNPHARMA |
| TATACONSUM | TATAMOTORS | TATASTEEL | TCS | TECHM |
| TITAN | TORNTPHARM | ULTRACEMCO | UPL | VEDL |
| WIPRO | YESBANK | | | |
Cbonds collects and normalizes indices data, offering daily updated and historical data on over 40,000 indices, including macroeconomic indicators, yield curves and spreads, currency markets, stock and funds markets, and commodities. Using the Indices API, you can access an index's holdings, such as its assets, sectors, and weight, as well as basic data on the asset. You can obtain end-of-day, and historical API indicator prices in CSV, XLS, and JSON formats. Cbonds provides a free Indices API for a limited test period of two weeks or for a longer period with a limited number of instruments.
https://www.lseg.com/en/policies/website-disclaimer
Browse LSEG's Global Equity Indices, discover our range of data, indices & benchmarks. Our Data Catalogue offers unrivaled data and delivery mechanisms.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
NDWI is used to identify water bodies and detect changes in their extent. It is calculated as the normalised difference of the green and near-infrared (NIR) bands. Positive index values generally indicate the presence of water, and higher values correspond more strongly to water bodies.
NDWI = (GREEN – NIR) / (GREEN + NIR)
Sentinel-2 NDWI (Defra/JNCC ARD bands) = (B02 – B07) / (B02 + B07)
Equivalent ESA Sentinel-2 bands: B03, B08
Data are provided in EPSG: 27700 OSGB36 / British National Grid, with a pixel size of 10m, and data is pixel-aligned to the source ARD file. No-data pixels are set to a value of -9999.
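As a rough sketch of how the no-data value matters in practice, the snippet below estimates the share of valid pixels with a positive NDWI in a small grid while skipping cells set to -9999. The function name `water_fraction`, the zero threshold, and the tiny example grid are assumptions for illustration; actual files would be read with a raster library rather than typed in as lists.

```python
NODATA = -9999  # no-data value stated for these index files


def water_fraction(ndwi_rows, nodata=NODATA, threshold=0.0):
    """Share of valid pixels whose NDWI exceeds threshold.

    Returns None when every pixel is no-data, so callers can tell
    "no water" apart from "nothing to measure".
    """
    valid = 0
    water = 0
    for row in ndwi_rows:
        for value in row:
            if value == nodata:
                continue  # skip no-data cells entirely
            valid += 1
            if value > threshold:
                water += 1
    return water / valid if valid else None


# Hypothetical 2x2 tile: three valid pixels, two of them positive.
tile = [[0.3, -0.2], [NODATA, 0.1]]
fraction = water_fraction(tile)
```

Leaving -9999 in the statistics would drag any mean or threshold count far off, which is why the sketch filters it before counting.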
These data have been created by the Joint Nature Conservation Committee (JNCC) as part of the “Earth observation-based habitat change detection” project. This project is funded by the Department for Environment, Food and Rural Affairs (Defra) as part of the Natural Capital and Ecosystem Assessment (NCEA) programme. The project seeks to facilitate the effective uptake and use of Earth Observation data by producing data and tools for investigating and detecting parcel-level change in habitats and habitat condition.
The dataset contains NDVI, NDMI, NDWI, NBR and EVI2 indices derived from Defra and JNCC Sentinel-2 ARD. Index files have been generated for Sentinel-2 granules covering England and Scotland for the period from 2015 to 2025. Note that new unmasked index files (v2) have superseded the previous masked index files (v1). Masked files will no longer be produced. Users can mask the new index files if required using the cloud and topographic shadow masks provided with the ARD, or masks of their choice.
Contains modified Copernicus Sentinel data 2015-2025
https://www.lseg.com/en/policies/website-disclaimer
Explore LSEG's Indices, Constituents and Weightings (ICW), and find real time content for all major indices from exchange to vendor offerings.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
This dataset supports research on inflation hedging, that is, finding appropriate hedging assets against inflation, using a VAR or VECM model. We have collected data encompassing housing price indices, stock indices, price indices, and money supply from five countries: the United States, Hong Kong, South Korea, Singapore, and Taiwan. The housing price index uses the transaction prices of listed residential houses in the metropolitan area as the benchmark, the stock price index is the ordinary stock market index of each country, the price index is the consumer price index (CPI), and the money supply is the M2 aggregate. The time periods covered by the housing price index data and the stock price index data are not the same.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset, titled "Cryptocurrency Market Sentiment & Prediction," is a synthetic collection of real-time crypto market data designed for advanced analysis and predictive modeling. It captures a comprehensive range of features including price movements, social sentiment, news impact, and trading patterns for 10 major cryptocurrencies. Tailored for data scientists and analysts, this dataset is ideal for exploring market volatility, sentiment analysis, and price prediction, particularly in the context of significant events like the Bitcoin halving in 2024 and increasing institutional adoption.
Key Features Overview:
- Price Movements: Tracks current prices and 24-hour price change percentages to reflect market dynamics.
- Social Sentiment: Measures sentiment scores from social media platforms, ranging from -1 (negative) to 1 (positive), to gauge public perception.
- News Sentiment and Impact: Evaluates sentiment from news sources and quantifies their potential impact on market behavior.
- Trading Patterns: Includes data on 24-hour trading volumes and market capitalization, crucial for understanding market activity.
- Technical Indicators: Features metrics like the Relative Strength Index (RSI), volatility index, and fear/greed index for in-depth technical analysis.
- Prediction Confidence: Provides a confidence score for predictive models, aiding in assessing forecast reliability.

Purpose and Applications:
- Perfect for machine learning tasks such as price prediction, sentiment-price correlation studies, and volatility classification.
- Supports time series analysis for forecasting price movements and identifying volatility clusters.
- Valuable for research into the influence of social media and news on cryptocurrency markets, especially during high-impact events.

Dataset Scope:
- Covers a simulated 30-day period, offering a snapshot of market behavior under varying conditions.
- Focuses on major cryptocurrencies including Bitcoin, Ethereum, Cardano, Solana, and others, ensuring relevance to current market trends.
Dataset Structure Table:
Column Name | Description | Data Type | Range/Value Example |
---|---|---|---|
timestamp | Date and time of data record | datetime | Last 30 days (e.g., 2025-06-04 20:36:49) |
cryptocurrency | Name of the cryptocurrency | string | 10 major cryptos (e.g., Bitcoin) |
current_price_usd | Current trading price in USD | float | Market-realistic (e.g., 47418.4096) |
price_change_24h_percent | 24-hour price change percentage | float | -25% to +27% (e.g., 1.05) |
trading_volume_24h | 24-hour trading volume | float | Variable (e.g., 1800434.38) |
market_cap_usd | Market capitalization in USD | float | Calculated (e.g., 343755257516049.1) |
social_sentiment_score | Sentiment score from social media | float | -1 to 1 (e.g., -0.728) |
news_sentiment_score | Sentiment score from news sources | float | -1 to 1 (e.g., -0.274) |
news_impact_score | Quantified impact of news on market | float | 0 to 10 (e.g., 2.73) |
social_mentions_count | Number of mentions on social media | integer | Variable (e.g., 707) |
fear_greed_index | Market fear and greed index | float | 0 to 100 (e.g., 35.3) |
volatility_index | Price volatility index | float | 0 to 100 (e.g., 36.0) |
rsi_technical_indicator | Relative Strength Index | float | 0 to 100 (e.g., 58.3) |
prediction_confidence | Confidence level of predictive models | float | 0 to 100 (e.g., 88.7) |
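
The bounded ranges in the table above lend themselves to a simple sanity check before modeling. The sketch below is one possible way to flag rows that fall outside the documented ranges; the names `DOCUMENTED_RANGES` and `out_of_range_columns` are hypothetical, and only columns with an explicit numeric range in the table are checked.

```python
# Ranges copied from the structure table; columns listed as "Variable",
# "Calculated" or "Market-realistic" have no stated bounds and are skipped.
DOCUMENTED_RANGES = {
    "price_change_24h_percent": (-25.0, 27.0),
    "social_sentiment_score": (-1.0, 1.0),
    "news_sentiment_score": (-1.0, 1.0),
    "news_impact_score": (0.0, 10.0),
    "fear_greed_index": (0.0, 100.0),
    "volatility_index": (0.0, 100.0),
    "rsi_technical_indicator": (0.0, 100.0),
    "prediction_confidence": (0.0, 100.0),
}


def out_of_range_columns(row, ranges=DOCUMENTED_RANGES):
    """Return the names of columns that are missing or outside their range."""
    problems = []
    for column, (low, high) in ranges.items():
        value = row.get(column)
        if value is None or not (low <= value <= high):
            problems.append(column)
    return problems


# Row assembled from the example values shown in the table.
example_row = {
    "price_change_24h_percent": 1.05,
    "social_sentiment_score": -0.728,
    "news_sentiment_score": -0.274,
    "news_impact_score": 2.73,
    "fear_greed_index": 35.3,
    "volatility_index": 36.0,
    "rsi_technical_indicator": 58.3,
    "prediction_confidence": 88.7,
}
problems = out_of_range_columns(example_row)
```
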
Dataset Statistics Table:
Statistic | Value |
---|---|
Total Rows | 2,063 |
Total Columns | 14 |
Cryptocurrencies | 10 major tokens |
Time Range | Last 30 days |
File Format | CSV |
Data Quality | Realistic correlations between features |
This dataset is a powerful resource for machine learning projects, sentiment analysis, and crypto market research, providing a robust foundation for AI/ML model development and testing.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
Sentinel-Hub NDWI description: The NDWI is used to monitor changes related to water content in water bodies. Because water bodies strongly absorb light in the visible to infrared part of the electromagnetic spectrum, NDWI uses the green and near-infrared bands to highlight them. The index is sensitive to built-up land, which can lead to over-estimation of water bodies. Index values greater than 0.5 usually correspond to water bodies. Vegetation usually corresponds to much smaller values, and built-up areas to values between zero and 0.2.
NDWI = (GREEN – NIR) / (GREEN + NIR)
Sentinel-2 NDWI = (B03 - B08) / (B03 + B08)
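The formula and the rule-of-thumb thresholds from the description above can be sketched per pixel as follows. The function names and class labels are hypothetical, the reflectance inputs are invented, and values outside the two stated ranges are left as "unclassified" because the description gives no numeric cut-off for them.

```python
def ndwi(green, nir):
    """NDWI for one pixel: (GREEN - NIR) / (GREEN + NIR); None if undefined."""
    total = green + nir
    if total == 0:
        return None
    return (green - nir) / total


def classify_ndwi(value):
    """Map an NDWI value to a coarse label using the thresholds quoted above."""
    if value is None:
        return "undefined"
    if value > 0.5:
        return "water"
    if 0.0 <= value <= 0.2:
        return "built-up"
    return "unclassified"  # e.g. vegetation, or the 0.2 to 0.5 gap


# Hypothetical reflectances (B03 as green, B08 as NIR), for illustration only.
labels = [
    classify_ndwi(ndwi(0.8, 0.2)),    # strongly positive index
    classify_ndwi(ndwi(0.55, 0.45)),  # slightly positive index
    classify_ndwi(ndwi(0.1, 0.5)),    # negative index
]
```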
These data have been created by the Joint Nature Conservation Committee (JNCC) as part of a Defra Natural Capital and Ecosystem Assessment (NCEA) project to produce a regional, and ultimately national, system for detecting a change in habitat conditions at a land parcel level. The first stage of the project is focused on Yorkshire, UK, and therefore the dataset includes granules and scenes covering Yorkshire and surrounding areas only. The dataset contains Normalised Difference Water Index (NDWI) data derived from Defra and JNCC Sentinel-2 Analysis Ready Data.
NDWI files are generated for the following Sentinel-2 granules: • T30UWE • T30UXF • T30UWF • T30UXE • T31UCV • T30UYE • T31UCA
As the project continues, JNCC will expand the geographical coverage of this dataset and will provide continuous updates as ARD becomes available.
Trend Departure Index (TDI) is computed as: (number of quantiles for which the trend result departs from the result for the associated reference site) / (total number of quantiles). In a Quantile-Kendall analysis for the annual time frame, the denominator, or total number of quantiles, is 365. TDI varies from 0 to 1; a TDI of 0 indicates the trend results for the site are identical to the reference site and any value larger than 0 indicates some departure compared to the reference condition. The larger the number (the closer to 1) the more departure or deviation relative to the reference site in trend across all the quantiles for the site.
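A minimal sketch of the TDI ratio defined above, assuming the per-quantile trend results are available as two equal-length sequences of labels (one for the site, one for its reference site). The function name, the label values, and the rule that "departs" means "label differs" are assumptions made for illustration; the source does not specify how a departure is decided.

```python
def trend_departure_index(site_trends, reference_trends):
    """Fraction of quantiles whose trend result differs from the reference site."""
    if len(site_trends) != len(reference_trends):
        raise ValueError("site and reference must cover the same quantiles")
    if not site_trends:
        raise ValueError("at least one quantile is required")
    departures = sum(
        1 for site, ref in zip(site_trends, reference_trends) if site != ref
    )
    return departures / len(site_trends)


# Annual time frame: 365 quantiles. Labels here are invented for illustration.
reference = ["up"] * 365
site = ["down"] * 73 + ["up"] * 292
tdi = trend_departure_index(site, reference)
```

In this made-up case 73 of the 365 quantiles depart from the reference, so the index is 73 / 365.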
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Observed mean values of p + q are very close to 1 for the Nikkei and DJIA and, despite non-stationarity, are also close to 1 for the Nasdaq and IPC markets. When the analysis is restricted to dates after 1999, the p + q values are even closer to 1 for all markets.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
Sentinel-Hub NDVI description: NDVI is a simple, but effective index for quantifying green vegetation. It normalizes green leaf scattering in Near Infra-red wavelengths with chlorophyll absorption in red wavelengths.
The value range of the NDVI is -1 to 1. Negative values of NDVI (values approaching -1) correspond to water. Values close to zero (-0.1 to 0.1) generally correspond to barren areas of rock, sand, or snow. Low, positive values represent shrub and grassland (approximately 0.2 to 0.4), while high values indicate temperate and tropical rainforests (values approaching 1). It is a good proxy for live green vegetation.
NDVI = (NIR – RED) / (NIR + RED)
Sentinel-2 NDVI = (B08 - B04) / (B08 + B04)
These data have been created by the Joint Nature Conservation Committee (JNCC) as part of a Defra Natural Capital and Ecosystem Assessment (NCEA) project to produce a regional, and ultimately national, system for detecting a change in habitat conditions at a land parcel level. The first stage of the project is focused on Yorkshire, UK, and therefore the dataset includes granules and scenes covering Yorkshire and surrounding areas only. The dataset contains Normalised Difference Vegetation Index (NDVI) data derived from Defra and JNCC Sentinel-2 Analysis Ready Data.
NDVI files are generated for the following Sentinel-2 granules: • T30UWE • T30UXF • T30UWF • T30UXE • T31UCV • T30UYE • T31UCA
As the project continues, JNCC will expand the geographical coverage of this dataset and will provide continuous updates as ARD becomes available.
Version 1 contains masked index files (using the Defra and JNCC ARD cloud and topographic shadow masks).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Context
The dataset tabulates the Index population over the last 20 plus years. It lists the population for each year, along with the year on year change in population, as well as the change in percentage terms for each year. The dataset can be utilized to understand the population change of Index across the last two decades. For example, using this dataset, we can identify if the population is declining or increasing. If there is a change, when the population peaked, or if it is still growing and has not reached its peak. We can also compare the trend with the overall trend of United States population over the same period of time.
Key observations
In 2023, the population of Index was 157, a 1.29% year-over-year increase from 2022. Previously, in 2022, the population of Index was 155, an increase of 1.31% compared to a population of 153 in 2021. Over the last 20-plus years, between 2000 and 2023, the population of Index decreased by 3. In this period, the peak population was 211, in 2019. The numbers suggest that the population has already reached its peak and is showing a trend of decline. Source: U.S. Census Bureau Population Estimates Program (PEP).
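The year-on-year change and percentage change quoted above can be reproduced with a few lines of arithmetic. The sketch below uses only the three population figures stated in the observations (2021 to 2023); the function name `yearly_changes` and the two-decimal rounding are illustrative assumptions.

```python
def yearly_changes(population_by_year):
    """Return (year, absolute change, percent change) for each consecutive year pair."""
    years = sorted(population_by_year)
    changes = []
    for prev_year, year in zip(years, years[1:]):
        previous = population_by_year[prev_year]
        current = population_by_year[year]
        diff = current - previous
        # Percent change is measured relative to the earlier year.
        pct = round(diff / previous * 100, 2)
        changes.append((year, diff, pct))
    return changes


# Figures taken from the key observations above.
index_population = {2021: 153, 2022: 155, 2023: 157}
changes = yearly_changes(index_population)
```

With these inputs the 2022 and 2023 percentages come out as 1.31 and 1.29, matching the figures in the text.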
When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).
Data Coverage:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Index Population by Year.
Data tables for Energy Price Indices and Discount Factors for Life-Cycle Cost Analysis - 2022: Annual Supplement to NIST Handbook 135
Starting with the 2022 Annual Supplement to Handbook 135, the data tables within the text document have been extracted and provided in a supplemental spreadsheet. The reasons for creating a separate data file are to (1) make the text document smaller and easier to navigate, (2) provide the data in a format that is more accessible to users, particularly those who want to incorporate the data tables into their own calculations or tools, and (3) streamline the process for the annual release of the data. Numerous data sources are used in developing these data tables, including EIA, OMB, and the Federal Reserve. The process is discussed in the Annual Supplement to Handbook 135.