https://creativecommons.org/publicdomain/zero/1.0/
The "Stock Market Dataset for AI-Driven Prediction and Trading Strategy Optimization" is designed to simulate real-world stock market data for training and evaluating machine learning models. This dataset includes a combination of technical indicators, market metrics, sentiment scores, and macroeconomic factors, providing a comprehensive foundation for developing and testing AI models for stock price prediction and trading strategy optimization.
Key Features

Market Metrics:
- Open, High, Low, Close Prices: Daily stock price movement.
- Volume: Represents the trading activity during the day.

Technical Indicators:
- RSI (Relative Strength Index): A momentum oscillator to measure the speed and change of price movements.
- MACD (Moving Average Convergence Divergence): An indicator to reveal changes in strength, direction, momentum, and duration of a trend.
- Bollinger Bands: Upper and lower bands around a stock price to measure volatility.

Sentiment Analysis:
- Sentiment Score: Simulated sentiment derived from financial news and social media, ranging from -1 (negative) to 1 (positive).

Macroeconomic Factors:
- GDP Growth: Indicates the overall health and growth of the economy.
- Inflation Rate: Reflects changes in purchasing power and economic stability.

Target Variable:
- Buy/Sell Signal: Binary classification (1 = Buy, 0 = Sell) based on price movement thresholds, simulating actionable trading decisions.

Use Cases
- AI Model Training: Ideal for building stock prediction models using LSTM, Gradient Boosting, Random Forest, etc.
- Trading Strategy Optimization: Enables testing of trading algorithms and strategies in a simulated environment.
- Sentiment Analysis Research: Useful for understanding how sentiment influences stock movements.
- Feature Engineering and Selection: Provides a diverse set of features for experimentation with advanced techniques like PCA and LDA.

Dataset Highlights
- Synthetic Yet Realistic: Carefully designed to mimic real-world financial data trends and relationships.
- Comprehensive Coverage: Includes key indicators and metrics used by traders and analysts.
- Scalable: Suitable for use in both small-scale academic projects and larger AI-driven trading platforms.
- Accessible for All Levels: The intuitive structure ensures that even beginners can utilize this dataset for financial machine learning applications.

File Format
The dataset is provided in CSV format, where:
- Rows represent individual trading days.
- Columns represent features (technical indicators, market metrics, etc.) and the target variable.

Acknowledgments
This dataset is synthetically generated and is intended for research and educational purposes. It is not based on real market data and should not be used for actual trading.
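For readers who want to recompute or sanity-check the technical indicators listed above, here is a minimal sketch using only the Python standard library. The description does not say how the dataset's RSI or Bollinger Bands were generated, so the 20-day window, 2-standard-deviation band width, 14-day RSI period, and the simple-average RSI variant (not Wilder's smoothed version) are all assumptions based on common defaults. The function names are hypothetical.

```python
from statistics import mean, pstdev


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (middle, upper, lower) lists; entries are None until the window fills.

    window and num_std are assumed defaults, not values taken from the dataset.
    """
    middle, upper, lower = [], [], []
    for i in range(len(closes)):
        if i + 1 < window:
            middle.append(None)
            upper.append(None)
            lower.append(None)
            continue
        chunk = closes[i + 1 - window : i + 1]
        m = mean(chunk)
        s = pstdev(chunk)  # population standard deviation of the window
        middle.append(m)
        upper.append(m + num_std * s)
        lower.append(m - num_std * s)
    return middle, upper, lower


def rsi(closes, period=14):
    """Simple-average RSI over the last `period` price changes.

    This is the unsmoothed variant; Wilder's smoothing would give different values.
    """
    out = [None] * len(closes)
    for i in range(period, len(closes)):
        changes = [closes[j] - closes[j - 1] for j in range(i - period + 1, i + 1)]
        gains = sum(c for c in changes if c > 0)
        losses = -sum(c for c in changes if c < 0)
        if losses == 0:
            # No down moves: 100 if there were gains, 50 for a flat window
            # (the flat-window value is a convention chosen here).
            out[i] = 100.0 if gains > 0 else 50.0
        else:
            rs = gains / losses
            out[i] = 100.0 - 100.0 / (1.0 + rs)
    return out
```

Both functions take a plain list of closing prices in chronological order, so they can be applied to the Close column after reading the CSV with any loader.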
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Title: Stock Prices of 500 Biggest Companies by Market Cap (Last 5 Years)
Description: This dataset comprises historical stock market data extracted from Yahoo Finance, spanning a period of five years. It includes daily records of stock performance metrics for the top 500 companies based on market capitalization.
Attributes:
1. Date: The date corresponding to the recorded stock market data.
2. Open: The opening price of the stock on a given date.
3. High: The highest price of the stock reached during the trading day.
4. Low: The lowest price of the stock observed during the trading day.
5. Close: The closing price of the stock on a specific date.
6. Volume: The volume of shares traded on the given date.
7. Dividends: Any dividend payments made by the company on that date (if applicable).
8. Stock Splits: Information regarding any stock splits occurring on that date.
9. Company: Ticker symbol or identifier representing the respective company.

Usefulness:
- Investors and analysts can leverage this dataset to conduct various analyses such as trend analysis, volatility assessment, and predictive modeling.
- Researchers can explore correlations between stock prices of different companies, sector-wise performance, and market trends over the specified duration.
- Machine learning enthusiasts can employ this dataset for developing predictive models for stock price forecasting or anomaly detection.
Note: Before using this dataset, it is recommended to clean the data, handle missing values, and verify the consistency of data across companies and time periods.
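One way to approach the cleaning step is a row-level check for missing values and internally inconsistent prices. The sketch below assumes each row is a dictionary keyed by the attribute names listed above (Open, High, Low, Close, Volume), as `csv.DictReader` would produce. The function name and the specific rules are illustrative assumptions, not part of the dataset.

```python
import math


def find_problem_rows(rows):
    """Return (index, reason) pairs for rows with missing or inconsistent values."""
    problems = []
    for i, row in enumerate(rows):
        try:
            o, h, l, c = (float(row[k]) for k in ("Open", "High", "Low", "Close"))
            volume = float(row["Volume"])
        except (KeyError, TypeError, ValueError):
            # Absent column, None, empty string, or non-numeric text.
            problems.append((i, "missing or non-numeric value"))
            continue
        if any(math.isnan(v) for v in (o, h, l, c, volume)):
            problems.append((i, "NaN value"))
        elif h < max(o, c, l) or l > min(o, c, h):
            # The day's high and low should bracket the open and close.
            problems.append((i, "high/low does not bracket open/close"))
        elif volume < 0:
            problems.append((i, "negative volume"))
    return problems
```

Running this per company before modeling gives a quick list of rows to inspect or drop; what to do with them (drop, forward-fill, re-download) depends on the analysis.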
License: The dataset is sourced from Yahoo Finance and is provided for analytical purposes. Refer to Yahoo Finance's terms of use for further details on data usage and licensing.
https://creativecommons.org/publicdomain/zero/1.0/
The dataset was extracted from Yahoo Finance using pandas and the Yahoo Finance library in Python. It covers stock market indices of the world's leading economies. The data runs from Jan 01, 2003 to Jun 30, 2023, a span of more than 20 years. There are 18 CSV files: 16 contain data for 16 different stock market indices across 7 countries, and the other two contain the annualized return and compound annual growth rate (CAGR) computed from the extracted data. Below is the list of countries along with the number of indices extracted for each.
[Image: Number of Index] https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15657145%2F90ce8a986761636e3edbb49464b304d8%2FNumber%20of%20Index.JPG?generation=1688490342207096&alt=media
This dataset is useful for research purposes, particularly for conducting comparative analyses involving capital market performance and could be used along with other economic indicators.
There are 18 distinct CSV files associated with this dataset. The first 16 CSV files each cover one index, and the last two cover the annualized return for each year and the CAGR of each index. A blank value in any column means the index was launched in a later year. For instance, Bse500 (India) was launched in 2007, so its earlier values are blank; similarly, the China_Top300 index was launched in 2021, so its earlier fields are blank too.
The extraction process applies different criteria: the 16 index CSV files include all columns, and Adj Close is used to calculate the annualized return. The algorithm extracts data by index name (the code assigned by Yahoo Finance) for the given start and end dates.
Annualized return and CAGR have been calculated and are illustrated in the image below, with a machine-readable file (CSV) attached as well.
To extract the data provided in the attachment, various criteria were applied:
Content Filtering: The data was filtered on several attributes, including the index name and the start and end dates. This filtering ensured that only relevant data meeting the specified criteria was extracted.
Collaborative Filtering: Another filtering technique used was collaborative filtering with Yahoo Finance, which relies on index similarity. This approach involves finding indices that are similar to another index, or extending the dataset's scope to other countries or economies. By leveraging this method, the algorithm identifies and extracts data based on similarities between indices.
Of the last two CSV files, one holds the annualized return, which was calculated from the Adj Close column and stored in a new DataFrame. Below is an image of the annualized returns of all indices (if it is unreadable, a machine-readable CSV file is attached to the dataset).
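The description does not spell out the exact formulas used, so the following is only a sketch of one common way to derive yearly returns and CAGR from an Adj Close series. It assumes the yearly return compares each year's last adjusted close with the previous year's last adjusted close (or with the first value of the series for the first year), and that CAGR is computed as (end / start) ** (1 / years) - 1 with years measured in calendar days divided by 365.25. The function names are hypothetical.

```python
from datetime import date


def yearly_returns(series):
    """series: list of (date, adj_close) pairs sorted by date. Returns {year: return}."""
    first_by_year, last_by_year = {}, {}
    for day, price in series:
        first_by_year.setdefault(day.year, price)  # keep the first price seen in a year
        last_by_year[day.year] = price             # overwrite until the last price remains
    returns = {}
    previous_close = None
    for year in sorted(last_by_year):
        base = previous_close if previous_close is not None else first_by_year[year]
        returns[year] = last_by_year[year] / base - 1.0
        previous_close = last_by_year[year]
    return returns


def cagr(series):
    """Compound annual growth rate between the first and last observation."""
    (start_day, start_price), (end_day, end_price) = series[0], series[-1]
    years = (end_day - start_day).days / 365.25
    return (end_price / start_price) ** (1.0 / years) - 1.0
```

Because blank early values indicate an index that launched later, rows with missing Adj Close should be dropped before calling either function, so that the series starts at the first real observation.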
In terms of annualized rate of return, Indian stock market indices lead most of the time, followed by the USA, Canada, and Japan stock market indices.
[Image: Annualized Return] https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15657145%2F37645bd90623ea79f3708a958013c098%2FAnnualized%20Return.JPG?generation=1688525901452892&alt=media
The best-performing index by compound growth is Sensex (India), which comprises the top 30 companies, at 15.60%, followed by Nifty500 (India) at 11.34% and Nasdaq (USA) at 10.60%.
The worst-performing index is China top300; however, it was launched in 2021 (post-pandemic), so it cannot be examined meaningfully at this stage because little data is available. Furthermore, UK and Russia indices are also among the five worst performers.
[Image: CAGR] https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15657145%2F58ae33f60a8800749f802b46ec1e07e7%2FCAGR.JPG?generation=1688490409606631&alt=media
Geography: Stock Market Index of the World Top Economies
Time period: Jan 01, 2003 – June 30, 2023
Variables: Stock Market Index Title, Open, High, Low, Close, Adj Close, Volume, Year, Month, Day, Yearly_Return and CAGR
File Type: CSV file
This is not financial advice; due diligence is required in each investment decision.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains historical daily prices for all tickers currently trading on NASDAQ. The up-to-date list is available from nasdaqtrader.com. The historical data is retrieved from Yahoo Finance via the yfinance Python package.
It contains prices up to April 1, 2020. If you need more recent data, fork and re-run the data collection script, which is also available from Kaggle.
The data for every symbol is saved in CSV format with common fields:
All of the ticker data is then stored in either the ETFs or the stocks folder, depending on its type. Each filename is the corresponding ticker symbol. Finally, symbols_valid_meta.csv contains some additional metadata for each ticker, such as its full name.
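Given the layout described above (one CSV per ticker, placed in a folder by type), a loader could look like the sketch below. The folder names `etfs` and `stocks`, the `.csv` extension on each ticker file, and both function names are assumptions made for illustration; check the actual archive before relying on them. The reader makes no assumption about column names and simply uses whatever header each file carries.

```python
import csv
from pathlib import Path


def ticker_path(root, symbol, is_etf):
    """Build the expected path of a ticker file under the dataset root."""
    folder = "etfs" if is_etf else "stocks"  # assumed folder names
    return Path(root) / folder / f"{symbol}.csv"


def load_rows(path):
    """Read one ticker file into a list of dicts keyed by the file's own header."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))
```

All values come back as strings, so numeric columns need to be converted (for example with `float`) before any calculation.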
The dataset contains a total of 25,161 rows, each row representing the stock market data for a specific company on a given date. The information collected through web scraping from www.nasdaq.com includes the stock prices and trading volumes for the companies listed, such as Apple, Starbucks, Microsoft, Cisco Systems, Qualcomm, Meta, Amazon.com, Tesla, Advanced Micro Devices, and Netflix.
Data Analysis Tasks:
1) Exploratory Data Analysis (EDA): Analyze the distribution of stock prices and volumes for each company over time. Visualize trends, seasonality, and patterns in the stock market data using line charts, bar plots, and heatmaps.
2) Correlation Analysis: Investigate the correlations between the closing prices of different companies to identify potential relationships. Calculate correlation coefficients and visualize correlation matrices.
3) Top Performers Identification: Identify the top-performing companies based on their stock price growth and trading volumes over a specific time period.
4) Market Sentiment Analysis: Perform sentiment analysis using Natural Language Processing (NLP) techniques on news headlines related to each company. Determine whether positive or negative news impacts the stock prices and volumes.
5) Volatility Analysis: Calculate the volatility of each company's stock prices using metrics like Standard Deviation or Bollinger Bands. Analyze how volatile stocks are in comparison to others.
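The correlation and volatility tasks above can be sketched with the standard library alone. The example below computes a Pearson correlation coefficient and a standard-deviation volatility on daily returns. Working on returns instead of raw closing prices is a choice made here, since price levels that trend in the same direction can look correlated for that reason alone; the same `pearson` function accepts raw closing prices if that is what you want to compare. All function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def daily_returns(closes):
    """Simple day-over-day returns from a chronological list of closing prices."""
    return [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]


def pearson(xs, ys):
    """Pearson correlation of two equal-length series.

    Raises ZeroDivisionError if either series is constant.
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def volatility(closes):
    """Sample standard deviation of daily returns (not annualized)."""
    return stdev(daily_returns(closes))
```

To build a correlation matrix, align the companies on common dates first and then call `pearson` on each pair of return series.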
Machine Learning Tasks:
1) Stock Price Prediction: Use time-series forecasting models like ARIMA, SARIMA, or Prophet to predict future stock prices for a particular company. Evaluate the models' performance using metrics like Mean Squared Error (MSE) or Root Mean Squared Error (RMSE).
2) Classification of Stock Movements: Create a binary classification model to predict whether a stock will rise or fall on the next trading day. Utilize features like historical price changes, volumes, and technical indicators for the predictions. Implement classifiers such as Logistic Regression, Random Forest, or Support Vector Machines (SVM).
3) Clustering Analysis: Cluster companies based on their historical stock performance using unsupervised learning algorithms like K-means clustering. Explore if companies with similar stock price patterns belong to specific industry sectors.
4) Anomaly Detection: Detect anomalies in stock prices or trading volumes that deviate significantly from the historical trends. Use techniques like Isolation Forest or One-Class SVM for anomaly detection.
5) Reinforcement Learning for Portfolio Optimization: Formulate the stock market data as a reinforcement learning problem to optimize a portfolio's performance. Apply algorithms like Q-Learning or Deep Q-Networks (DQN) to learn the optimal trading strategy.
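For the stock-movement classification task, the first step is turning a price series into labeled examples. The sketch below shows one possible construction, assuming a single feature (today's return) and a label of 1 when the next close is strictly higher than today's. The feature choice, the tie handling, and the function name are illustrative assumptions; a real model would add more features such as volume changes or technical indicators.

```python
def make_dataset(closes):
    """Pair each day's return with the next day's direction.

    Returns a list of (todays_return, label) where label is 1 if the next
    close is strictly higher than today's close and 0 otherwise. The first
    day has no return and the last day has no label, so both are skipped.
    """
    examples = []
    for i in range(1, len(closes) - 1):
        todays_return = closes[i] / closes[i - 1] - 1.0
        label = 1 if closes[i + 1] > closes[i] else 0
        examples.append((todays_return, label))
    return examples
```

Because each label looks one day ahead, the examples should be split chronologically (earlier rows for training, later rows for testing) instead of shuffled, so the model is never trained on information from after its test period.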
The dataset provided on Kaggle, titled "Stock Market Stars: Historical Data of Top 10 Companies," is intended for learning purposes only. The data has been gathered from public sources, specifically from web scraping www.nasdaq.com, and is presented in good faith to facilitate educational and research endeavors related to stock market analysis and data science.
It is essential to acknowledge that while we have taken reasonable measures to ensure the accuracy and reliability of the data, we do not guarantee its completeness or correctness. The information provided in this dataset may contain errors, inaccuracies, or omissions. Users are advised to use this dataset at their own risk and are responsible for verifying the data's integrity for their specific applications.
This dataset is not intended for any commercial or legal use, and any reliance on the data for financial or investment decisions is not recommended. We disclaim any responsibility or liability for any damages, losses, or consequences arising from the use of this dataset.
By accessing and utilizing this dataset on Kaggle, you agree to abide by these terms and conditions and understand that it is solely intended for educational and research purposes.
Please note that the dataset's contents, including the stock market data and company names, are subject to copyright and other proprietary rights of the respective sources. Users are advised to adhere to all applicable laws and regulations related to data usage, intellectual property, and any other relevant legal obligations.
In summary, this dataset is provided "as is" for learning purposes, without any warranties or guarantees, and users should exercise due diligence and judgment when using the data for any purpose.
https://creativecommons.org/publicdomain/zero/1.0/
Alphabet Inc. is a listed US holding company of the former Google LLC, which continues to exist as a subsidiary. Its headquarters is in Mountain View, in Silicon Valley. The company is led by Sundar Pichai as CEO.
With sales of $137 billion, a profit of $30.7 billion and a market value of $863.2 billion, Alphabet Inc. ranks 17th among the world's largest companies according to Forbes Global 2000 (as of 4th November 2019). The company had a market cap of $766.4 billion in early 2018. In 2019, Alphabet had annual sales of $161.9 billion and an annual profit of $34.3 billion.
Market capitalization of Alphabet (Google) (GOOG)
Market cap: $2.442 Trillion USD
As of August 2025 Alphabet (Google) has a market cap of $2.442 Trillion USD. This makes Alphabet (Google) the world's 4th most valuable company by market cap according to our data. The market capitalization, commonly called market cap, is the total market value of a publicly traded company's outstanding shares and is commonly used to measure how much a company is worth.
Geography: USA
Time period: August 2004 - August 2025
Unit of analysis: Google Stock Data 2025
| Variable | Description |
|---|---|
| date | date |
| open | The price at market open. |
| high | The highest price for that day. |
| low | The lowest price for that day. |
| close | The price at market close, adjusted for splits. |
| adj_close | The closing price after adjustments for all applicable splits and dividend distributions. Data is adjusted using appropriate split and dividend multipliers, adhering to Center for Research in Security Prices (CRSP) standards. |
| volume | The number of shares traded on that day. |
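To illustrate what "adjusted for splits" means in the table above, here is a deliberately simplified sketch: prices recorded before a split are divided by the split ratio so the whole series is on the post-split scale. This is not the CRSP methodology referenced in the table, it ignores dividends entirely, and the function name, index convention, and example ratio are assumptions made for illustration.

```python
def split_adjust(prices, split_index, ratio):
    """Scale prices before a split so they are comparable with later prices.

    prices: chronological list of unadjusted closing prices.
    split_index: position of the first post-split price.
    ratio: new shares received per old share (e.g. 20 for a 20-for-1 split).
    """
    return [p / ratio if i < split_index else p for i, p in enumerate(prices)]
```

For example, `split_adjust([2000.0, 2100.0, 105.0], split_index=2, ratio=20)` rescales the first two prices to 100.0 and 105.0 and leaves the third unchanged, which removes the artificial drop a split would otherwise show.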
This dataset belongs to me. I’m sharing it here for free. You may do with it as you wish.
[Image: screenshot] https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F18335022%2F84937d0d9ac664fa6c705c0da59564e0%2FScreenshot%202024-12-18%20153807.png?generation=1734532695847825&alt=media
[Image: screenshot] https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F18335022%2Fa927d7f9ef11a23685bbb86a25b44d8d%2FScreenshot%202024-12-18%20153822.png?generation=1734532715073647&alt=media
Here is the first dataset of the Pakistan Stock Exchange (PSX) for the KSE 100 Index gathered from the archives of the Pakistan Stock Exchange (PSX) and Karachi Stock Exchange (KSE) 100 Index.
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset encapsulates a detailed examination of market dynamics over a five-year period, focusing on the fluctuation of prices and trading volumes across a diversified portfolio. It covers various sectors including energy commodities like natural gas and crude oil, metals such as copper, platinum, silver, and gold, cryptocurrencies including Bitcoin and Ethereum, and key stock indices and companies like the S&P 500, Nasdaq 100, Apple, Tesla, Microsoft, Google, Nvidia, Berkshire Hathaway, Netflix, Amazon, and Meta Platforms. This dataset serves as a valuable resource for analyzing trends and patterns in global markets.
- Date: The date of the recorded data, formatted as DD-MM-YYYY.
- Natural_Gas_Price: Price of natural gas in USD per million British thermal units (MMBtu).
- Natural_Gas_Vol.: Trading volume of natural gas
- Crude_oil_Price: Price of crude oil in USD per barrel.
- Crude_oil_Vol.: Trading volume of crude oil
- Copper_Price: Price of copper in USD per pound.
- Copper_Vol.: Trading volume of copper
- Bitcoin_Price: Price of Bitcoin in USD.
- Bitcoin_Vol.: Trading volume of Bitcoin
- Platinum_Price: Price of platinum in USD per troy ounce.
- Platinum_Vol.: Trading volume of platinum
- Ethereum_Price: Price of Ethereum in USD.
- Ethereum_Vol.: Trading volume of Ethereum
- S&P_500_Price: Price index of the S&P 500.
- Nasdaq_100_Price: Price index of the Nasdaq 100.
- Nasdaq_100_Vol.: Trading volume for the Nasdaq 100 index
- Apple_Price: Stock price of Apple Inc. in USD.
- Apple_Vol.: Trading volume of Apple Inc. stock
- Tesla_Price: Stock price of Tesla Inc. in USD.
- Tesla_Vol.: Trading volume of Tesla Inc. stock
- Microsoft_Price: Stock price of Microsoft Corporation in USD.
- Microsoft_Vol.: Trading volume of Microsoft Corporation stock
- Silver_Price: Price of silver in USD per troy ounce.
- Silver_Vol.: Trading volume of silver
- Google_Price: Stock price of Alphabet Inc. (Google) in USD.
- Google_Vol.: Trading volume of Alphabet Inc. stock
- Nvidia_Price: Stock price of Nvidia Corporation in USD.
- Nvidia_Vol.: Trading volume of Nvidia Corporation stock
- Berkshire_Price: Stock price of Berkshire Hathaway Inc. in USD.
- Berkshire_Vol.: Trading volume of Berkshire Hathaway Inc. stock
- Netflix_Price: Stock price of Netflix Inc. in USD.
- Netflix_Vol.: Trading volume of Netflix Inc. stock
- Amazon_Price: Stock price of Amazon.com Inc. in USD.
- Amazon_Vol.: Trading volume of Amazon.com Inc. stock
- Meta_Price: Stock price of Meta Platforms, Inc. (formerly Facebook) in USD.
- Meta_Vol.: Trading volume of Meta Platforms, Inc. stock
- Gold_Price: Price of gold in USD per troy ounce.
- Gold_Vol.: Trading volume of gold
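Since the Date column is day-first (DD-MM-YYYY), it is worth parsing it explicitly: a value such as 02-01-2024 means 2 January, and sorting the raw strings does not give chronological order. A minimal sketch with the standard library follows; the function name is hypothetical.

```python
from datetime import datetime


def parse_date(text):
    """Parse a DD-MM-YYYY string from the Date column into a date object."""
    return datetime.strptime(text.strip(), "%d-%m-%Y").date()
```

Sorting rows with `sorted(rows, key=lambda row: parse_date(row["Date"]))` then puts them in chronological order regardless of how the file is ordered.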
Image attribution: Image by Freepik
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset of the United States Stock Market Index (S&P 500), covering values from 1928-01-01 to 2025-11-28, with the latest releases and long-term trends. Available for free download in CSV format.
https://fred.stlouisfed.org/legal/#copyright-pre-approval
View data of the S&P 500, an index of the stocks of 500 leading companies in the US economy, which provides a gauge of the U.S. equity market.
https://creativecommons.org/publicdomain/zero/1.0/
About the Google Stock Price Dataset
The Google Stock Price Dataset consists of two CSV (Comma Separated Values) files containing historical stock price data for training and evaluation. Each row in the dataset represents a trading day, and the columns provide various information related to Google's stock for that day.
Columns:
Date: The date of the trading day in the format "YYYY-MM-DD."
Open: The opening price of Google's stock on that trading day.
High: The highest price reached during the trading day.
Low: The lowest price reached during the trading day.
Close: The closing price of Google's stock on that trading day.
Adj Close: The adjusted closing price, accounting for any corporate actions (e.g., stock splits, dividends) that may affect the stock's value.
Volume: The trading volume, representing the number of shares traded on that trading day.
Time Period: The train dataset spans from January 1, 2010, to December 31, 2022, providing thirteen years of daily stock price information for model training. The test dataset spans from January 1, 2023, to July 30, 2023, providing seven months of daily stock price data for model evaluation.
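The dataset already ships as separate train and test files, but if the rows are ever combined or re-downloaded, the same boundaries can be reproduced with a date-based split. The sketch below assumes each row is a dictionary with a `Date` string in the YYYY-MM-DD format described above; the constant and function names are hypothetical.

```python
from datetime import date

TRAIN_END = date(2022, 12, 31)  # last day of the training period, per the description
TEST_END = date(2023, 7, 30)    # last day of the test period, per the description


def split_by_date(rows):
    """Split rows into (train, test) by their ISO-formatted 'Date' value.

    Rows dated after TEST_END are left out of both sets.
    """
    train, test = [], []
    for row in rows:
        day = date.fromisoformat(row["Date"])
        if day <= TRAIN_END:
            train.append(row)
        elif day <= TEST_END:
            test.append(row)
    return train, test
```

Splitting on the calendar boundary keeps every test row strictly later than every training row, which is what a forecasting evaluation needs.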
Data Source:
The dataset was collected from Yahoo Finance (finance.yahoo.com), a reputable and widely-used financial data platform.
Use Case:
The Google Stock Price Dataset can be utilized for various purposes, such as predicting future stock prices, analyzing historical stock trends, and building machine learning models for financial forecasting.
Potential Applications:
- Time Series Analysis: Explore stock price patterns and seasonality.
- Financial Modeling: Develop predictive models to forecast stock prices.
- Algorithmic Trading: Create trading strategies based on historical stock data.
- Risk Management: Assess potential risks and volatilities in the stock market.
Citation:
If you use this dataset in your research or analysis, please provide proper attribution and citation to acknowledge the source.
License: This dataset is provided under the Creative Commons CC0 1.0 Universal (CC0 1.0) Public Domain Dedication, making it freely available for use without any restrictions or attribution requirements.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset of the France Stock Market Index (CAC 40), covering values from 1987-07-01 to 2025-11-27, with the latest releases and long-term trends. Available for free download in CSV format.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset of the Thailand Stock Market Index (SET 50), covering values from 1995-08-01 to 2025-12-01, with the latest releases and long-term trends. Available for free download in CSV format.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset of the Israel Stock Market Index (TA-125), covering values from 1992-10-01 to 2025-12-03, with the latest releases and long-term trends. Available for free download in CSV format.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset of the Sweden Stock Market Index (OMX Stockholm 30), covering values from 1986-10-01 to 2025-11-27, with the latest releases and long-term trends. Available for free download in CSV format.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset of the Mexico Stock Market Index (Mexico IPC), covering values from 1987-01-01 to 2025-12-02, with the latest releases and long-term trends. Available for free download in CSV format.
https://fred.stlouisfed.org/legal/#copyright-citation-required
Graph and download economic data for CBOE Volatility Index: VIX (VIXCLS) from 1990-01-02 to 2025-12-01 about VIX, volatility, stock market, and USA.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The US_Stock_Data.csv dataset offers a comprehensive view of the US stock market and related financial instruments, spanning from January 2, 2020, to February 2, 2024. This dataset includes 39 columns, covering a broad spectrum of financial data points such as prices and volumes of major stocks, indices, commodities, and cryptocurrencies. The data is presented in a structured CSV file format, making it easily accessible and usable for various financial analyses, market research, and predictive modeling. This dataset is ideal for anyone looking to gain insights into the trends and movements within the US financial markets during this period, including the impact of major global events.
The dataset captures daily financial data across multiple assets, providing a well-rounded perspective of market dynamics. Key features include:
The dataset’s structure is designed for straightforward integration into various analytical tools and platforms. Each column is dedicated to a specific asset's daily price or volume, enabling users to perform a wide range of analyses, from simple trend observations to complex predictive models. The inclusion of intraday data for Bitcoin provides a detailed view of market movements.
This dataset is highly versatile and can be utilized for various financial research purposes:
The dataset’s daily updates ensure that users have access to the most current data, which is crucial for real-time analysis and decision-making. Whether for academic research, market analysis, or financial modeling, the US_Stock_Data.csv dataset provides a valuable foundation for exploring the complexities of financial markets over the specified period.
This dataset would not be possible without the contributions of Dhaval Patel, who initially curated the US stock market data spanning from 2020 to 2024. Full credit goes to Dhaval Patel for creating and maintaining the dataset. You can find the original dataset here: US Stock Market 2020 to 2024.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset of the Hong Kong Stock Market Index (Hang Seng), covering values from 1964-08-01 to 2025-11-24, with the latest releases and long-term trends. Available for free download in CSV format.
This dataset offers comprehensive historical stock market data covering over 9,000 tickers from 1962 to the present day. It includes essential daily trading information, making it suitable for various financial analyses, trend studies, and algorithmic trading model development.
This dataset is ideal for:
- Time-Series Analysis: Track stock price trends over time, examining daily, monthly, and yearly patterns across sectors.
- Algorithmic Trading: Develop and backtest trading strategies using historical price movements and volume data.
- Machine Learning Applications: Train models for stock price prediction, volatility forecasting, or portfolio optimization.
- Quantitative Research: Perform event studies, analyze the impact of dividends and stock splits, and assess long-term investment strategies.
- Comparative Analysis: Evaluate performance across industries or against broader market trends by analyzing multiple tickers in one dataset.
This dataset serves as a robust resource for academic research, quantitative finance studies, and financial technology development.