92 datasets found

Dataset for Stock Market Prediction
ieee-dataport.org
Updated Jul 8, 2024
Cite
Umara Umar (2024). Dataset for Stock Market Prediction [Dataset]. https://ieee-dataport.org/documents/dataset-stock-market-prediction
Dataset updated
Jul 8, 2024
Authors
Umara Umar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Hascol
Stock Market Dataset for Predictive Analysis
kaggle.com
Updated Feb 24, 2025
Cite
WARNER (2025). Stock Market Dataset for Predictive Analysis [Dataset]. https://www.kaggle.com/datasets/s3programmer/stock-market-dataset-for-predictive-analysis
Dataset updated
Feb 24, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
WARNER
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This Stock Market Dataset is designed for predictive analysis and machine learning applications in financial markets. It includes 13,647 records of simulated stock trading data with features commonly used in stock price forecasting.

🔹 Key Features

Date – Trading day timestamps (business days only)

Open, High, Low, Close – Simulated stock prices

Volume – Trading volume per day

RSI (Relative Strength Index) – Measures market momentum

MACD (Moving Average Convergence Divergence) – Trend-following momentum indicator

Sentiment Score – Simulated market sentiment from financial news & social media

Target – Binary label (1: Price goes up, 0: Price goes down) for next-day prediction

This dataset is useful for training hybrid deep learning models such as LSTM, CNN, and Attention-based networks for stock market forecasting. It enables financial analysts, traders, and AI researchers to experiment with market trends, technical analysis, and sentiment-based predictions.
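
The Target column described above is a next-day up/down label. A minimal sketch of how such a label could be derived from closing prices follows. It assumes an unchanged price counts as 0, since the description does not say how ties are labeled; the function name and the sample prices are made up for illustration.

```python
def next_day_target(closes):
    """Label each day 1 if the next close is higher than its own close, else 0.

    The final day has no following close, so it receives no label.
    Assumption: an unchanged price is labeled 0.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


# Made-up closing prices, for illustration only.
closes = [100.0, 101.5, 101.0, 101.0, 103.2]
print(next_day_target(closes))  # [1, 0, 0, 1]
```

The output has one fewer entry than the input, so the last row of a price table would need to be dropped before training on such a label.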
datasets of stock market indices.
ieee-dataport.org
Updated Apr 7, 2024
Cite
Enrique Gonzalez Nunez (2024). datasets of stock market indices. [Dataset]. https://ieee-dataport.org/documents/datasets-stock-market-indices
Dataset updated
Apr 7, 2024
Authors
Enrique Gonzalez Nunez
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
DAX
Machine Learning stock prediction: HD Stock Prediction (Forecast)
kappasignal.com
Updated Oct 13, 2022
Cite
KappaSignal (2022). Machine Learning stock prediction: HD Stock Prediction (Forecast) [Dataset]. https://www.kappasignal.com/2022/10/machine-learning-stock-prediction-hd.html
Dataset updated
Oct 13, 2022
Dataset authored and provided by
KappaSignal
License
https://www.kappasignal.com/p/legal-disclaimer.html
Description
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

Machine Learning stock prediction: HD Stock Prediction

Financial data:

Historical daily stock prices (open, high, low, close, volume)

Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)

Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)

Machine learning features:

Feature engineering based on financial data and technical indicators

Sentiment analysis data from social media and news articles

Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

Potential Applications:

Stock price prediction

Portfolio optimization

Algorithmic trading

Market sentiment analysis

Risk management

Use Cases:

Researchers investigating the effectiveness of machine learning in stock market prediction

Analysts developing quantitative trading Buy/Sell strategies

Individuals interested in building their own stock market prediction models

Students learning about machine learning and financial applications

Additional Notes:

The dataset may include different levels of granularity (e.g., daily, hourly)

Data cleaning and preprocessing are essential before model training

Regular updates are recommended to maintain the accuracy and relevance of the data
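
Of the technical indicators this description lists, RSI is compact enough to sketch. The version below uses plain averages of gains and losses over the last `period` changes rather than Wilder's smoothing, so it is one simplified variant and not necessarily what the publisher computes. The function name, the default period of 14, and the sample prices are illustrative assumptions.

```python
def simple_rsi(closes, period=14):
    """Relative Strength Index using plain averages (no Wilder smoothing).

    Looks only at the most recent `period` price changes.
    """
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    changes = [cur - prev for prev, cur in zip(closes, closes[1:])][-period:]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        # No losses in the window: RSI is conventionally reported as 100.
        return 100.0
    relative_strength = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + relative_strength)


# Made-up prices that alternate up and down by the same amount.
print(simple_rsi([10.0, 11.0, 10.0, 11.0, 10.0], period=4))  # 50.0
```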
Yahoo Stocks Dataset
crawlfeeds.com
csv, zip
Updated Apr 27, 2025
Cite
Crawl Feeds (2025). Yahoo Stocks Dataset [Dataset]. https://crawlfeeds.com/datasets/yahoo-stocks-dataset
Available download formats: zip, csv
Dataset updated
Apr 27, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
The Yahoo Stocks Dataset is an invaluable resource for analysts, traders, and developers looking to enhance their financial data models or trading strategies. Sourced from Yahoo Finance, this dataset includes historical stock prices, market trends, and financial indicators. With its accurate and comprehensive data, it empowers users to analyze patterns, forecast trends, and build robust machine learning models.

Whether you're a seasoned stock market analyst or a beginner in financial data science, this dataset is tailored to meet diverse needs. It features details like stock prices, trading volume, and market capitalization, enabling a deep dive into investment opportunities and market dynamics.

For machine learning and AI enthusiasts, the Yahoo Stocks Dataset is a goldmine. It’s perfect for developing predictive models, such as stock price forecasting and sentiment analysis. The dataset's structured format ensures seamless integration into Python, R, and other analytics platforms, making data visualization and reporting effortless.

Additionally, this dataset supports long-term trend analysis, helping investors make informed decisions. It’s also an essential resource for those conducting research in algorithmic trading and portfolio management.

Key benefits include:

Historical Stock Data: Access years of trading data to analyze market behaviors.

Versatile Applications: Use it for financial modeling, data analytics, or academic research.

SEO Benefits for Finance Websites: Boost your content with insights derived from this dataset.

Download the Yahoo Stocks Dataset today and harness the power of financial data for your projects. Whether for AI, financial reporting, or trend analysis, this dataset equips you with the tools to succeed in the dynamic world of stock markets.
Cloudflare (NET) Navigates the Web of Growth (Forecast)
kappasignal.com
Updated Sep 26, 2024
Cite
KappaSignal (2024). Cloudflare (NET) Navigates the Web of Growth (Forecast) [Dataset]. https://www.kappasignal.com/2024/09/cloudflare-net-navigates-web-of-growth.html
Dataset updated
Sep 26, 2024
Dataset authored and provided by
KappaSignal
License
https://www.kappasignal.com/p/legal-disclaimer.html
Description
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

Cloudflare (NET) Navigates the Web of Growth

Financial data:

Historical daily stock prices (open, high, low, close, volume)

Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)

Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)

Machine learning features:

Feature engineering based on financial data and technical indicators

Sentiment analysis data from social media and news articles

Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

Potential Applications:

Stock price prediction

Portfolio optimization

Algorithmic trading

Market sentiment analysis

Risk management

Use Cases:

Researchers investigating the effectiveness of machine learning in stock market prediction

Analysts developing quantitative trading Buy/Sell strategies

Individuals interested in building their own stock market prediction models

Students learning about machine learning and financial applications

Additional Notes:

The dataset may include different levels of granularity (e.g., daily, hourly)

Data cleaning and preprocessing are essential before model training

Regular updates are recommended to maintain the accuracy and relevance of the data
Enhancing Stock Market Forecasting with Machine Learning A PineScript-Driven...
dataverse.harvard.edu
Updated Nov 19, 2024
Cite
Gautam Narla (2024). Enhancing Stock Market Forecasting with Machine Learning A PineScript-Driven Approach [Dataset]. http://doi.org/10.7910/DVN/HF0PFX
Unique identifier
https://doi.org/10.7910/DVN/HF0PFX
Dataset updated
Nov 19, 2024
Dataset provided by
Harvard Dataverse
Authors
Gautam Narla
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This study investigates the application of machine learning (ML) models in stock market forecasting, with a focus on their integration using PineScript, a domain-specific language for algorithmic trading. Leveraging diverse datasets, including historical stock prices and market sentiment data, we developed and tested various ML models such as neural networks, decision trees, and linear regression. Rigorous backtesting over multiple timeframes and market conditions allowed us to evaluate their predictive accuracy and financial performance. The neural network model demonstrated the highest accuracy, achieving a 75% success rate, significantly outperforming traditional models. Additionally, trading strategies derived from these ML models yielded a return on investment (ROI) of up to 12%, compared to an 8% benchmark index ROI. These findings underscore the transformative potential of ML in refining trading strategies, providing critical insights for financial analysts, investors, and developers. The study draws on insights from 15 peer-reviewed articles, financial datasets, and industry reports, establishing a robust foundation for future exploration of ML-driven financial forecasting.

Tools and Technologies Used

PineScript

PineScript, a scripting language integrated within the TradingView platform, was the primary tool used to develop and implement the machine learning models. Its robust features allowed for custom indicator creation, strategy backtesting, and real-time market data analysis.

Python

Python was utilized for data preprocessing, model training, and performance evaluation. Key libraries included: Pandas
Stock market predictions
kaggle.com
Updated Feb 18, 2024
Cite
Tanishq dublish (2024). Stock market predictions [Dataset]. https://www.kaggle.com/datasets/tanishqdublish/stock-market-predictions
Dataset updated
Feb 18, 2024
Dataset provided by
Kaggle
Authors
Tanishq dublish
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
I prepared this dataset for students in my Deep Learning and NLP course.

But I am also very happy to see kagglers play around with it.

Have fun!

Description:

There are two channels of data provided in this dataset:

News data: I crawled historical news headlines from Reddit WorldNews Channel (/r/worldnews). They are ranked by reddit users' votes, and only the top 25 headlines are considered for a single date. (Range: 2008-06-08 to 2016-07-01)

Stock data: Dow Jones Industrial Average (DJIA) is used to "prove the concept". (Range: 2008-08-08 to 2016-07-01)

I provided three data files in .csv format:

RedditNews.csv: two columns. The first column is the "date", and the second column is the "news headlines". All headlines are ranked from top to bottom based on how hot they are. Hence, there are 25 lines for each date.

DJIA_table.csv: Downloaded directly from Yahoo Finance: check out the web page for more info.

Combined_News_DJIA.csv: To make things easier for my students, I provide this combined dataset with 27 columns. The first column is "Date", the second is "Label", and the following ones are news headlines ranging from "Top1" to "Top25".
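
As a rough illustration of the Combined_News_DJIA.csv layout described above (a "Date" column, a "Label" column, then "Top1" to "Top25"), the sketch below reads rows with the standard `csv` module and joins each day's headlines into one string. The helper name and the in-memory sample are made up, and the sample carries only three headline columns to stay short. The description does not define what the label values mean, so the code treats them as opaque integers.

```python
import csv
import io

# Column names as given in the description: Top1 ... Top25.
HEADLINE_COLUMNS = [f"Top{i}" for i in range(1, 26)]


def load_rows(handle):
    """Return (date, label, joined_headlines) tuples from a CSV file handle."""
    rows = []
    for record in csv.DictReader(handle):
        # Skip headline columns that are absent or empty in this row.
        headlines = [record[col] for col in HEADLINE_COLUMNS if record.get(col)]
        rows.append((record["Date"], int(record["Label"]), " ".join(headlines)))
    return rows


# Made-up sample with three headline columns instead of 25.
sample = io.StringIO(
    "Date,Label,Top1,Top2,Top3\n"
    "2008-08-08,0,headline a,headline b,headline c\n"
    "2008-08-11,1,headline d,headline e,\n"
)
for row in load_rows(sample):
    print(row)
```

With the real file, the same function could be called on `open("Combined_News_DJIA.csv", newline="", encoding="utf-8")`, assuming the file is UTF-8 encoded.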
Pakistan Stock Exchange Datasets
ieee-dataport.org
Updated May 12, 2020
Cite
Yosha Jawad (2020). Pakistan Stock Exchange Datasets [Dataset]. https://ieee-dataport.org/documents/pakistan-stock-exchange-datasets
Dataset updated
May 12, 2020
Authors
Yosha Jawad
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is a preprocessed dataset of 2 companies from Pakistan Stock Exchange.
Tweet Sentiment's Impact on Stock Returns
kaggle.com
Updated Jan 16, 2023
Cite
The Devastator (2023). Tweet Sentiment's Impact on Stock Returns [Dataset]. https://www.kaggle.com/datasets/thedevastator/tweet-sentiment-s-impact-on-stock-returns
Dataset updated
Jan 16, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Tweet Sentiment's Impact on Stock Returns

862,231 Labeled Instances

About this dataset

This dataset contains 862,231 labeled tweets and associated stock returns, providing a comprehensive look into the impact of social media on company-level stock market performance. For each tweet, researchers have extracted data such as the date of the tweet and its associated stock symbol, along with metrics such as last price and various returns (1-day return, 2-day return, 3-day return, 7-day return). Also recorded are volatility scores for both 10-day and 30-day intervals. Finally, sentiment scores from both Long Short-Term Memory (LSTM) and TextBlob models have been included to quantify the overall tone in which these messages were delivered. With this dataset you will be able to explore how tweets can affect a company's share price in both the short term and the long term by leveraging all of these data points for analysis!

How to use the dataset

To use this dataset, apply descriptive statistics (such as histograms) or regression techniques to relate tweet content and sentiment to the corresponding stock returns, such as the 1-day and 7-day returns.

The primary fields used for analysis include tweet text (TWEET), stock symbol (STOCK), date (DATE), and closing price at the time of the tweet (LAST_PRICE), along with two volatility measures for each stock, 10-day volatility (VOLATILITY_10D) and 30-day volatility (VOLATILITY_30D), which capture changes in market fluctuation over different periods around the time the tweets were posted. In addition, sentiment polarity scores from two models, LSTM polarity (LSTM_POLARITY) and TextBlob polarity, indicate whether people are expressing positive or negative sentiment about each company at a given time, which could in turn influence stock prices over shorter horizons such as 1-day returns (1_DAY_RETURN) and 2-day returns (2_DAY_RETURN), or over a longer horizon such as 7-day returns (7DAY RETURNS). Finally, the MENTION field indicates whether names or acronyms associated with a company were specifically mentioned in each tweet, which gives extra insight into whether company-specific context was present (“company relevancy”).
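
One simple way to start relating sentiment to returns, as suggested above, is a Pearson correlation between a polarity column and a return column. The sketch below is written with the standard library only. The two short lists stand in for values taken from LSTM_POLARITY and 1_DAY_RETURN and are made-up numbers, not data from the file.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# Made-up stand-ins for LSTM_POLARITY and 1_DAY_RETURN values.
polarity = [0.8, -0.4, 0.1, 0.6, -0.7]
one_day_return = [0.012, -0.005, 0.001, 0.004, -0.009]
print(pearson(polarity, one_day_return))
```

A correlation of this kind only describes a linear association across tweets; it does not by itself show that sentiment moves prices.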

Research Ideas

Analyzing the degree to which tweets can influence stock prices. By analyzing relationships between variables such as tweet sentiment and stock returns, correlations can be identified that could be used to inform investment decisions.

Exploring natural language processing (NLP) models for predicting future market trends based on textual data such as tweets. Through testing and evaluating different text-based models using this dataset, better predictive models may emerge that can give investors advance warning of upcoming market shifts due to news or other events.

Investigating the impact of different types of tweets (positive/negative, factual/opinionated) on stock prices over specific time frames. By studying correlations between the sentiment or nature of a tweet and its effect on stocks, insights may be gained into what sort of news or events have a greater impact on markets in general

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: reduced_dataset-release.csv

| Column name | Description |
|:------------|:------------|
| TWEET | Text of the tweet. (String) |
| STOCK | Company's stock mentioned in the tweet. (String) |
| DATE | Date the tweet was posted. (Date) |
| LAST_PRICE | Company's last price at the time of tweeting. (Float) |

...
Probabilistic AI: A New Approach to Artificial Intelligence (Forecast)
kappasignal.com
Updated May 27, 2023
Cite
KappaSignal (2023). Probabilistic AI: A New Approach to Artificial Intelligence (Forecast) [Dataset]. https://www.kappasignal.com/2023/05/probabilistic-ai-new-approach-to.html
Dataset updated
May 27, 2023
Dataset authored and provided by
KappaSignal
License
https://www.kappasignal.com/p/legal-disclaimer.html
Description
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

Probabilistic AI: A New Approach to Artificial Intelligence

Financial data:

Historical daily stock prices (open, high, low, close, volume)

Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)

Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)

Machine learning features:

Feature engineering based on financial data and technical indicators

Sentiment analysis data from social media and news articles

Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

Potential Applications:

Stock price prediction

Portfolio optimization

Algorithmic trading

Market sentiment analysis

Risk management

Use Cases:

Researchers investigating the effectiveness of machine learning in stock market prediction

Analysts developing quantitative trading Buy/Sell strategies

Individuals interested in building their own stock market prediction models

Students learning about machine learning and financial applications

Additional Notes:

The dataset may include different levels of granularity (e.g., daily, hourly)

Data cleaning and preprocessing are essential before model training

Regular updates are recommended to maintain the accuracy and relevance of the data
Stock Market Next Day Forecast Data for
dataverse.harvard.edu
Updated May 15, 2025
Cite
Fairuz Maulana (2025). Stock Market Next Day Forecast Data for [Dataset]. http://doi.org/10.7910/DVN/UXVEZ3
Unique identifier
https://doi.org/10.7910/DVN/UXVEZ3
Dataset updated
May 15, 2025
Dataset provided by
Harvard Dataverse
Authors
Fairuz Maulana
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Stock market forecasting remains a complex and challenging task. Traditional technical analysis methods such as RSI, EMA, and candlestick patterns often fail to capture patterns in stock market time series, so many recent studies explore forecasting with machine learning or neural networks, and others have improved accuracy or reduced regression losses by adding technical indicators and sentiment analysis. This dataset is intended for analyzing how well machine learning models predict the next day's stock market trend when technical and sentiment-based features are combined. The technical indicators are derived from historical price data, focusing on swing trends in the market, and the sentiment features are extracted using FinBERT from Benzinga Pro as a reliable financial news source. The dataset has limitations, especially regarding financial news articles: the limited availability of news data remains a major challenge, and addressing it could improve the performance of a machine learning model.
Stock Market: Historical Data of Top 10 Companies
kaggle.com
Updated Jul 18, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Khushi Pitroda (2023). Stock Market: Historical Data of Top 10 Companies [Dataset]. https://www.kaggle.com/datasets/khushipitroda/stock-market-historical-data-of-top-10-companies/data
Dataset updated
Jul 18, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Khushi Pitroda
Description
The dataset contains a total of 25,161 rows, each row representing the stock market data for a specific company on a given date. The information collected through web scraping from www.nasdaq.com includes the stock prices and trading volumes for the companies listed, such as Apple, Starbucks, Microsoft, Cisco Systems, Qualcomm, Meta, Amazon.com, Tesla, Advanced Micro Devices, and Netflix.

Data Analysis Tasks:

1) Exploratory Data Analysis (EDA): Analyze the distribution of stock prices and volumes for each company over time. Visualize trends, seasonality, and patterns in the stock market data using line charts, bar plots, and heatmaps.

2) Correlation Analysis: Investigate the correlations between the closing prices of different companies to identify potential relationships. Calculate correlation coefficients and visualize correlation matrices.

3) Top Performers Identification: Identify the top-performing companies based on their stock price growth and trading volumes over a specific time period.

4) Market Sentiment Analysis: Perform sentiment analysis using Natural Language Processing (NLP) techniques on news headlines related to each company. Determine whether positive or negative news impacts the stock prices and volumes.

5) Volatility Analysis: Calculate the volatility of each company's stock prices using metrics like Standard Deviation or Bollinger Bands. Analyze how volatile stocks are in comparison to others.
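
The volatility task in the list above can be sketched with the standard library alone. The example computes the sample standard deviation of simple daily returns and a set of Bollinger Bands, taken here as a moving average plus and minus a multiple of the population standard deviation over the same window. The function names, the default window of 20, the width of 2, and the sample prices are illustrative assumptions, not values taken from the dataset.

```python
import statistics


def daily_returns(closes):
    """Simple day-over-day returns: (today - yesterday) / yesterday."""
    return [(cur - prev) / prev for prev, cur in zip(closes, closes[1:])]


def return_volatility(closes):
    """Sample standard deviation of the daily returns."""
    return statistics.stdev(daily_returns(closes))


def bollinger_bands(closes, window=20, width=2.0):
    """Return (lower, middle, upper) bands for the most recent window."""
    if len(closes) < window:
        raise ValueError("not enough closing prices for the chosen window")
    recent = closes[-window:]
    middle = statistics.fmean(recent)
    spread = width * statistics.pstdev(recent)
    return middle - spread, middle, middle + spread


# Made-up closing prices, for illustration only.
closes = [10.0, 10.5, 10.2, 10.8, 11.0, 10.7]
print(return_volatility(closes))
print(bollinger_bands(closes, window=5))
```

Running `return_volatility` per company and comparing the results is one straightforward way to rank stocks by how volatile they are.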

Machine Learning Tasks:

1) Stock Price Prediction: Use time-series forecasting models like ARIMA, SARIMA, or Prophet to predict future stock prices for a particular company. Evaluate the models' performance using metrics like Mean Squared Error (MSE) or Root Mean Squared Error (RMSE).

2) Classification of Stock Movements: Create a binary classification model to predict whether a stock will rise or fall on the next trading day. Utilize features like historical price changes, volumes, and technical indicators for the predictions. Implement classifiers such as Logistic Regression, Random Forest, or Support Vector Machines (SVM).

3) Clustering Analysis: Cluster companies based on their historical stock performance using unsupervised learning algorithms like K-means clustering. Explore if companies with similar stock price patterns belong to specific industry sectors.

4) Anomaly Detection: Detect anomalies in stock prices or trading volumes that deviate significantly from the historical trends. Use techniques like Isolation Forest or One-Class SVM for anomaly detection.

5) Reinforcement Learning for Portfolio Optimization: Formulate the stock market data as a reinforcement learning problem to optimize a portfolio's performance. Apply algorithms like Q-Learning or Deep Q-Networks (DQN) to learn the optimal trading strategy.
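
The MSE and RMSE metrics named in the first task above can be written in a few lines. This is a minimal sketch; the function names and the sample numbers are made up, and the forecasts could come from any of the models the list mentions.

```python
import math


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the prices."""
    return math.sqrt(mse(actual, predicted))


# Made-up actual prices and forecasts, for illustration only.
actual = [101.0, 102.0, 103.0]
forecast = [102.0, 103.0, 104.0]
print(mse(actual, forecast))   # 1.0
print(rmse(actual, forecast))  # 1.0
```

RMSE is often easier to interpret than MSE for price forecasts because it is expressed in price units rather than squared price units.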

The dataset provided on Kaggle, titled "Stock Market Stars: Historical Data of Top 10 Companies," is intended for learning purposes only. The data has been gathered from public sources, specifically from web scraping www.nasdaq.com, and is presented in good faith to facilitate educational and research endeavors related to stock market analysis and data science.

It is essential to acknowledge that while we have taken reasonable measures to ensure the accuracy and reliability of the data, we do not guarantee its completeness or correctness. The information provided in this dataset may contain errors, inaccuracies, or omissions. Users are advised to use this dataset at their own risk and are responsible for verifying the data's integrity for their specific applications.

This dataset is not intended for any commercial or legal use, and any reliance on the data for financial or investment decisions is not recommended. We disclaim any responsibility or liability for any damages, losses, or consequences arising from the use of this dataset.

By accessing and utilizing this dataset on Kaggle, you agree to abide by these terms and conditions and understand that it is solely intended for educational and research purposes.

Please note that the dataset's contents, including the stock market data and company names, are subject to copyright and other proprietary rights of the respective sources. Users are advised to adhere to all applicable laws and regulations related to data usage, intellectual property, and any other relevant legal obligations.

In summary, this dataset is provided "as is" for learning purposes, without any warranties or guarantees, and users should exercise due diligence and judgment when using the data for any purpose.
S1 Data -
figshare.com
zip
Updated Nov 27, 2023
Cite
Hongli Niu; Qiaoying Pan; Kunliang Xu (2023). S1 Data - [Dataset]. http://doi.org/10.1371/journal.pone.0294460.s001
Available download formats: zip
Unique identifier
https://doi.org/10.1371/journal.pone.0294460.s001
Dataset updated
Nov 27, 2023
Dataset provided by
PLOS ONE
Authors
Hongli Niu; Qiaoying Pan; Kunliang Xu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The prediction of stock prices has long been a captivating subject in academic research. This study aims to forecast the prices of prominent stocks in five key industries of the Chinese A-share market by leveraging the synergistic power of deep learning techniques and investor sentiment analysis. To achieve this, a sentiment multi-classification dataset is for the first time constructed for China’s stock market, based on four types of sentiments in modern psychology. The significant heterogeneity of sentiment changes in the sectors’ leading stock markets is trained and mined using the Bi-LSTM-ATT model. The impact of multi-classification investor sentiment on stock price prediction was analyzed using the CNN-Bi-LSTM-ATT model. It finds that integrating sentiment indicators into the prediction of industry leading stock prices can enhance the accuracy of the model. Drawing upon four fundamental sentiment types derived from modern psychology, our dataset provides a comprehensive framework for analyzing investor sentiment and its impact on forecasting the stock prices of China’s A-share market.
Machine Learning Predicts QQQ to Increase in Value by 5% in the Next 3...
kappasignal.com
Updated Jun 2, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
KappaSignal (2023). Machine Learning Predicts QQQ to Increase in Value by 5% in the Next 3 Months (Forecast) [Dataset]. https://www.kappasignal.com/2023/06/machine-learning-predicts-qqq-to.html
Dataset updated
Jun 2, 2023
Dataset authored and provided by
KappaSignal
License
https://www.kappasignal.com/p/legal-disclaimer.html
Description
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

Machine Learning Predicts QQQ to Increase in Value by 5% in the Next 3 Months

Financial data:

Historical daily stock prices (open, high, low, close, volume)

Fundamental data (e.g., market capitalization, price-to-earnings (P/E) ratio, dividend yield, earnings per share (EPS), price-to-earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price-to-sales ratio, credit rating)

Technical indicators (e.g., moving averages, RSI, MACD, average directional index, Aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution (A/D) line, parabolic SAR indicator, Bollinger Bands indicators, Fibonacci, Williams percent range, commodity channel index)

Machine learning features:

Feature engineering based on financial data and technical indicators

Sentiment analysis data from social media and news articles

Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)
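
As a rough illustration of the feature engineering mentioned above, the sketch below derives two of the listed technical indicators (a simple moving average and RSI) from a series of closing prices. It is not the provider's pipeline: the function names and sample prices are hypothetical, and the RSI here uses plain averages of gains and losses rather than Wilder's smoothing, which is a simplifying assumption.

```python
from typing import List, Optional


def simple_moving_average(closes: List[float], window: int) -> List[Optional[float]]:
    """Trailing mean of the last `window` closes; None until enough history exists."""
    out: List[Optional[float]] = []
    for i in range(len(closes)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(closes[i + 1 - window : i + 1]) / window)
    return out


def rsi(closes: List[float], period: int = 14) -> Optional[float]:
    """RSI over the last `period` price changes, using unsmoothed averages (assumption)."""
    if len(closes) < period + 1:
        return None
    # Day-over-day changes for the most recent `period` days.
    changes = [b - a for a, b in zip(closes[-period - 1 : -1], closes[-period:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losing days in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


# Hypothetical closing prices, for illustration only.
sample_closes = [10.0, 11.0, 12.0, 11.0, 13.0]
print(simple_moving_average(sample_closes, window=3))
print(rsi(sample_closes, period=4))
```

Returning None for positions without enough history keeps the feature column aligned with the price series, so rows can be dropped or imputed explicitly before training.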

Potential Applications:

Stock price prediction

Portfolio optimization

Algorithmic trading

Market sentiment analysis

Risk management

Use Cases:

Researchers investigating the effectiveness of machine learning in stock market prediction

Analysts developing quantitative trading Buy/Sell strategies

Individuals interested in building their own stock market prediction models

Students learning about machine learning and financial applications

Additional Notes:

The dataset may include different levels of granularity (e.g., daily, hourly)

Data cleaning and preprocessing are essential before model training

Regular updates are recommended to maintain the accuracy and relevance of the data
Massive Stock Sentiment Dataset
opendatabay.com
Updated Jul 3, 2025
Cite
Datasimple (2025). Massive Stock Sentiment Dataset [Dataset]. https://www.opendatabay.com/data/ai-ml/d0828f81-ab19-4e17-9195-b32bad95268c
Dataset updated
Jul 3, 2025
Dataset authored and provided by
Datasimple
Area covered
Finance & Banking Analytics
Description
This dataset provides a substantial collection of news sentences paired with their corresponding sentiment, primarily intended for financial analysis and stock prediction. With over 100,000 rows, each entry indicates whether the news is positive (represented by '1') or negative/neutral (represented by '0'), offering insights into potential stock movement. A positive sentiment suggests a likely increase in stock value, while a negative or neutral sentiment indicates a likely decrease [1, 2]. It is noted that the data within this dataset is not shuffled [2].

Columns

Sentiment: A numerical label indicating the sentiment of the news sentence. A value of 0 denotes negative or neutral sentiment, suggesting a stock price might go down. A value of 1 denotes positive sentiment, suggesting a stock price might go up [1, 2]. The source lists 53,026 instances of 0 and 55,725 instances of 1 and gives a total of 108,301 values in this column [3]; the two label counts themselves sum to 108,751, so the stated total does not match them.

Sentence: The actual text of the news article sentence [1, 2]. This column contains the textual data analysed for sentiment.

Distribution

The dataset typically comes in CSV format [4] and consists of over 100,000 rows of data [2]. It includes two primary columns: 'Sentiment' and 'Sentence' [1]. The data is presented in an unshuffled order [2]. Specific numbers for records are available for each sentiment label: 53,026 rows for sentiment '0' and 55,725 rows for sentiment '1' [3].
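
Because the rows are stated to be unshuffled, a train/test split taken straight from file order could be skewed. The sketch below shows one way to read a CSV with the 'Sentiment' and 'Sentence' columns described above, count the labels, and shuffle before splitting. The inline sample rows, function names, seed, and 20% test share are assumptions for illustration, not part of the dataset.

```python
import csv
import io
import random
from collections import Counter
from typing import List, Tuple

Row = Tuple[int, str]


def load_rows(handle) -> List[Row]:
    """Read (sentiment, sentence) pairs from a CSV with 'Sentiment' and 'Sentence' columns."""
    reader = csv.DictReader(handle)
    return [(int(r["Sentiment"]), r["Sentence"]) for r in reader]


def shuffled_split(rows: List[Row], test_fraction: float = 0.2, seed: int = 0):
    """Shuffle a copy of the rows with a fixed seed, then split into train and test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


# Hypothetical stand-in for the real file; with the actual data, pass
# open(path, newline="", encoding="utf-8") to load_rows instead.
sample_csv = io.StringIO(
    "Sentiment,Sentence\n"
    "1,Shares rallied after strong earnings\n"
    "0,The company reported results on Tuesday\n"
    "1,Analysts raised their price target\n"
    "0,Revenue missed expectations\n"
    "1,The stock hit a record high\n"
)

rows = load_rows(sample_csv)
label_counts = Counter(label for label, _ in rows)
train_rows, test_rows = shuffled_split(rows)
print(label_counts, len(train_rows), len(test_rows))
```

Running the same label count on the full file is also a quick way to check the per-label row counts quoted in the description.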

Usage

This dataset is ideal for news sentiment analysis and stock prediction [1]. It can be employed to train machine learning models to forecast stock market movements based on news sentiment [1, 2]. Other use cases include developing financial analytics tools, performing large-scale text analysis on financial news, and researching the correlation between media sentiment and economic indicators [2].

Coverage

The dataset's regional scope is global [5]. The time range of the data is not specified in the provided information. No specific demographic scope is mentioned for the news sources or the subjects of the news.

License

CC-BY-NC

Who Can Use It

This dataset is particularly useful for:

Data Scientists and Machine Learning Engineers: For building and training Natural Language Processing (NLP) models to analyse sentiment in text and predict financial outcomes [2].

Financial Analysts and Researchers: To gain insights into how news sentiment impacts stock performance and for market forecasting [1].

Developers: To integrate sentiment analysis capabilities into financial applications or trading algorithms.

Academics: For research into financial economics, sentiment analysis, and predictive analytics.

Dataset Name Suggestions

Stock News Sentiment for Market Prediction

Financial News Sentiment Analysis Dataset

Massive Stock Sentiment Data

Market News Sentiment for Stock Forecasting

Attributes

Original Data Source: Stock News Sentiment Analysis (Massive Dataset)
Financial Datasets
brightdata.com
.json, .csv, .xlsx
Updated Dec 5, 2023
Cite
Bright Data (2023). Financial Datasets [Dataset]. https://brightdata.com/products/datasets/news/financial
Available download formats: .json, .csv, .xlsx
Dataset updated
Dec 5, 2023
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Stay informed with our comprehensive Financial News Dataset, designed for investors, analysts, and businesses to track market trends, monitor financial events, and make data-driven decisions.

Dataset Features

Financial News Articles: Access structured financial news data, including headlines, summaries, full articles, publication dates, and source details.

Market & Economic Indicators: Track financial reports, stock market updates, economic forecasts, and corporate earnings announcements.

Sentiment & Trend Analysis: Analyze news sentiment, categorize articles by financial topics, and monitor emerging trends in global markets.

Historical & Real-Time Data: Retrieve historical financial news archives or access continuously updated feeds for real-time insights.

Customizable Subsets for Specific Needs

Our Financial News Dataset is fully customizable, allowing you to filter data based on publication date, region, financial topics, sentiment, or specific news sources. Whether you need broad coverage for market research or focused data for investment analysis, we tailor the dataset to your needs.

Popular Use Cases

Investment Strategy & Risk Management: Monitor financial news to assess market risks, identify investment opportunities, and optimize trading strategies.

Market & Competitive Intelligence: Track industry trends, competitor financial performance, and economic developments.

AI & Machine Learning Training: Use structured financial news data to train AI models for sentiment analysis, stock prediction, and automated trading.

Regulatory & Compliance Monitoring: Stay updated on financial regulations, policy changes, and corporate governance news.

Economic Research & Forecasting: Analyze financial news trends to predict economic shifts and market movements.

Whether you're tracking stock market trends, analyzing financial sentiment, or training AI models, our Financial News Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
Beat US Stock market (2019 edition)
kaggle.com
Updated Jan 13, 2020
Cite
Nicolas Carbone (2020). Beat US Stock market (2019 edition) [Dataset]. https://www.kaggle.com/datasets/cnic92/beat-us-stock-market-data
Dataset updated
Jan 13, 2020
Dataset provided by
Kaggle
Authors
Nicolas Carbone
Description
Context

The algorithmic trading space is buzzing with new strategies. Companies have spent billions on infrastructure and R&D to jump ahead of the competition and beat the market. Still, it is widely acknowledged that the buy & hold strategy can outperform many algorithmic strategies, especially in the long run. However, finding value in stocks is an art that very few have mastered. Can a computer do that?

Content

This Data repo contains two datasets:

Example_2019_price_var.csv. I built this dataset using the Financial Modeling Prep API and pandas_datareader. Each row is a stock from the technology sector of the US stock market (that is available from the aforementioned API, which is free and highly recommended). The column contains the percent price variation of each stock for the year 2019. In other words, it collects the percent price variation of each stock from the first trading day of January 2019 to the last trading day of December 2019. To compute this price variation I decided to use the Adjusted Close Price.

Example_DATASET.csv. I built this dataset using the Financial Modeling Prep API. Each row is a stock from the technology sector of the US stock market (that is available from the aforementioned API). Each column is a financial indicator that can be found in the 2018 10-K filings of each company. There are no NaNs or empty cells. Furthermore, the last column is the CLASS of each stock, where:

class = 1 if the price of the stock increases during 2019

class = 0 if the price of the stock decreases during 2019

In other words, the last column classifies each stock as buy-worthy or not, and this relationship is what should allow a machine learning model to learn to distinguish stocks that will increase in value from those that won't.

NOTE: the number of stocks does not match between the two datasets because the API did not have all the required financial indicators for some stocks. It is possible to remove from Example_2019_price_var.csv those rows that do not appear in Example_DATASET.csv.
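
The relationship between the two files can be sketched as follows: compute the percent price variation from the first and last adjusted close of 2019, turn it into the binary class, and keep only the stocks present in both datasets. The tickers, prices, and function names below are hypothetical, and treating an unchanged price as class 0 is an assumption, since the description only covers increases and decreases.

```python
from typing import Dict, Set


def percent_variation(first_adj_close: float, last_adj_close: float) -> float:
    """Percent change from the first to the last adjusted close of the year."""
    return (last_adj_close - first_adj_close) / first_adj_close * 100.0


def class_label(pct_change: float) -> int:
    """1 if the price increased over the year, else 0 (zero change mapped to 0 by assumption)."""
    return 1 if pct_change > 0 else 0


def align(price_var: Dict[str, float], indicator_tickers: Set[str]) -> Dict[str, float]:
    """Drop price-variation rows whose ticker has no row in the indicators dataset."""
    return {t: v for t, v in price_var.items() if t in indicator_tickers}


# Hypothetical (first, last) adjusted closes for 2019, for illustration only.
adj_closes = {"AAA": (100.0, 125.0), "BBB": (50.0, 40.0), "CCC": (10.0, 12.0)}
price_var = {t: percent_variation(first, last) for t, (first, last) in adj_closes.items()}

# Suppose the indicators file lacks "CCC" because some indicators were missing.
indicator_tickers = {"AAA", "BBB"}
aligned = align(price_var, indicator_tickers)
labels = {t: class_label(v) for t, v in aligned.items()}
print(aligned, labels)
```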

Inspiration

I built this dataset during the 2019 winter holidays period, because I wanted to answer a simple question: is it possible to have a machine learning model learn the differences between stocks that perform well and those that don't, and then leverage this knowledge in order to predict which stock will be worth buying? Moreover, is it possible to achieve this simply by looking at financial indicators found in the 10-K filings?
Nairobi Securities Exchange Prices 2008-2012 for 6 selected stocks
explore.openaire.eu
data.mendeley.com
Updated Mar 10, 2020
Cite
Barack Wanjawa (2020). Nairobi Securities Exchange Prices 2008-2012 for 6 selected stocks [Dataset]. http://doi.org/10.17632/95fb84nzcd
Unique identifier
https://doi.org/10.17632/95fb84nzcd
Dataset updated
Mar 10, 2020
Authors
Barack Wanjawa
Description
Stock market prediction remains an active research area, in a quest to inform investors on how to trade (buy/sell) at the most opportune time. The prevalent methods used by stock market players to predict likely future trade prices are technical, fundamental or time series analysis. This research set out to try machine learning methods, in contrast to these prevalent methods. Artificial neural networks (ANNs) tend to be the preferred machine learning method for this type of application. However, ANNs require historical data to learn from in order to make predictions. The research used an ANN model to test the hypothesis that the next-day price (prediction) can be determined from the stock prices of the immediately preceding five days. The final ANN model used for the tests was a feedforward multi-layer perceptron (MLP) with error backpropagation, using a sigmoid activation function, with network configuration 5:21:21:1. The data covered a 5-year period (2008 to 2012), with 80% of the data (4 years) used for training and the remaining 20% (the last year) used for testing. The original raw data for the Nairobi Securities Exchange (NSE) was scraped from a publicly available and accessible website of a stock market analysis company in Kenya (Synergy, 2020). This daily price data was first exported to a spreadsheet, then cleaned of headers and other redundant information, leaving only the stock name, date of trade and related data such as volumes, low prices, high prices and adjusted prices. The data was further sorted by stock name and then by trading date. The data dimension was finally reduced to only what was needed for the research: the stock name, the date of trade and the adjusted price (average trade price). This final dataset is in CSV format, as presented here.
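
The five-day input window and the chronological 80/20 split described above can be sketched as below. This is only the data preparation step, not the MLP itself. The function names and the price series are hypothetical, and splitting by window count rather than by calendar year is a simplifying assumption.

```python
from typing import List, Tuple


def make_windows(prices: List[float], lookback: int = 5) -> Tuple[List[List[float]], List[float]]:
    """Pair each run of `lookback` consecutive prices with the price that follows it."""
    inputs: List[List[float]] = []
    targets: List[float] = []
    for i in range(lookback, len(prices)):
        inputs.append(prices[i - lookback : i])
        targets.append(prices[i])
    return inputs, targets


def chronological_split(inputs, targets, train_percent: int = 80):
    """Keep time order: the earliest windows train the model, the latest ones test it."""
    cut = len(inputs) * train_percent // 100
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])


# Hypothetical adjusted prices for one stock, already sorted by trading date.
prices = [float(p) for p in range(1, 16)]

inputs, targets = make_windows(prices, lookback=5)
(train_x, train_y), (test_x, test_y) = chronological_split(inputs, targets)
print(len(train_x), len(test_x), test_x[0], test_y[0])
```

Each input row has five values, matching the five input units in the 5:21:21:1 configuration, and the split is never shuffled so that no test-period prices leak into training.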
The research tested three NSE stocks, with the mean absolute percentage error (MAPE) ranging from 0.77% to 1.91% over the 3-month testing period, while the root mean squared error (RMSE) ranged between 1.83 and 3.07. This raw data can be used to train and test any machine learning model that requires training and testing data. The data can also be used to validate and reproduce the results already presented in this research. Reproduced results may vary slightly because of differences in the final weights that the trained ANN model reaches. However, these differences should not be significant.

List of data files on this dataset:

stock01_NSE_01jan2008_to_31dec2012_Kakuzi.csv

stock02_NSE_01jan2008_to_31dec2012_StandardBank.csv

stock03_NSE_01jan2008_to_31dec2012_KenyaAirways.csv

stock04_NSE_01jan2008_to_31dec2012_BamburiCement.csv

stock05_NSE_01jan2008_to_31dec2012_Kengen.csv

stock06_NSE_01jan2008_to_31dec2012_BAT.csv

References:

Synergy Systems Ltd. (2020). MyStocks. Retrieved March 9, 2020, from http://live.mystocks.co.ke/
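
For anyone reproducing the reported results, the two error metrics named in the description (MAPE and RMSE) can be computed as in the sketch below. The function names and the sample actual and predicted prices are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error, in the same units as the prices."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical test-period prices and model outputs, for illustration only.
actual_prices = [100.0, 200.0]
predicted_prices = [110.0, 190.0]
print(mape(actual_prices, predicted_prices), rmse(actual_prices, predicted_prices))
```

MAPE is scale-free, which makes it comparable across stocks with different price levels, while RMSE stays in price units and penalizes large misses more heavily.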
Can we predict stock market using machine learning? (QRVO Stock Forecast)...
kappasignal.com
Updated Sep 4, 2022
Cite
KappaSignal (2022). Can we predict stock market using machine learning? (QRVO Stock Forecast) (Forecast) [Dataset]. https://www.kappasignal.com/2022/09/can-we-predict-stock-market-using_18.html
Dataset updated
Sep 4, 2022
Dataset authored and provided by
KappaSignal
License
https://www.kappasignal.com/p/legal-disclaimer.html
Description
This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

Can we predict stock market using machine learning? (QRVO Stock Forecast)

Financial data:

Historical daily stock prices (open, high, low, close, volume)

Fundamental data (e.g., market capitalization, price-to-earnings (P/E) ratio, dividend yield, earnings per share (EPS), price-to-earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price-to-sales ratio, credit rating)

Technical indicators (e.g., moving averages, RSI, MACD, average directional index, Aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution (A/D) line, parabolic SAR indicator, Bollinger Bands indicators, Fibonacci, Williams percent range, commodity channel index)

Machine learning features:

Feature engineering based on financial data and technical indicators

Sentiment analysis data from social media and news articles

Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

Potential Applications:

Stock price prediction

Portfolio optimization

Algorithmic trading

Market sentiment analysis

Risk management

Use Cases:

Researchers investigating the effectiveness of machine learning in stock market prediction

Analysts developing quantitative trading Buy/Sell strategies

Individuals interested in building their own stock market prediction models

Students learning about machine learning and financial applications

Additional Notes:

The dataset may include different levels of granularity (e.g., daily, hourly)

Data cleaning and preprocessing are essential before model training

Regular updates are recommended to maintain the accuracy and relevance of the data

Dataset for Stock Market Prediction

10 scholarly articles cite this dataset (View in Google Scholar)
