92 datasets found
  1. i

    Dataset for Stock Market Prediction

    • ieee-dataport.org
    Updated Jul 8, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Umara Umar (2024). Dataset for Stock Market Prediction [Dataset]. https://ieee-dataport.org/documents/dataset-stock-market-prediction
    Explore at:
    Dataset updated
    Jul 8, 2024
    Authors
    Umara Umar
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Hascol

  2. Stock Market Dataset for Predictive Analysis

    • kaggle.com
    Updated Feb 24, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    WARNER (2025). Stock Market Dataset for Predictive Analysis [Dataset]. https://www.kaggle.com/datasets/s3programmer/stock-market-dataset-for-predictive-analysis
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 24, 2025
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    WARNER
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This Stock Market Dataset is designed for predictive analysis and machine learning applications in financial markets. It includes 13647 records of simulated stock trading data with features commonly used in stock price forecasting.

    🔹 Key Features Date – Trading day timestamps (business days only) Open, High, Low, Close – Simulated stock prices Volume – Trading volume per day RSI (Relative Strength Index) – Measures market momentum MACD (Moving Average Convergence Divergence) – Trend-following momentum indicator Sentiment Score – Simulated market sentiment from financial news & social media Target – Binary label (1: Price goes up, 0: Price goes down) for next-day prediction This dataset is useful for training hybrid deep learning models such as LSTM, CNN, and Attention-based networks for stock market forecasting. It enables financial analysts, traders, and AI researchers to experiment with market trends, technical analysis, and sentiment-based predictions.

  3. i

    datasets of stock market indices.

    • ieee-dataport.org
    Updated Apr 7, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Enrique Gonzalez Nunez (2024). datasets of stock market indices. [Dataset]. https://ieee-dataport.org/documents/datasets-stock-market-indices
    Explore at:
    Dataset updated
    Apr 7, 2024
    Authors
    Enrique Gonzalez Nunez
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    DAX

  4. Machine Learning stock prediction: HD Stock Prediction (Forecast)

    • kappasignal.com
    Updated Oct 13, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    KappaSignal (2022). Machine Learning stock prediction: HD Stock Prediction (Forecast) [Dataset]. https://www.kappasignal.com/2022/10/machine-learning-stock-prediction-hd.html
    Explore at:
    Dataset updated
    Oct 13, 2022
    Dataset authored and provided by
    KappaSignal
    License

    https://www.kappasignal.com/p/legal-disclaimer.htmlhttps://www.kappasignal.com/p/legal-disclaimer.html

    Description

    This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

    Machine Learning stock prediction: HD Stock Prediction

    Financial data:

    • Historical daily stock prices (open, high, low, close, volume)

    • Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)

    • Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)

    Machine learning features:

    • Feature engineering based on financial data and technical indicators

    • Sentiment analysis data from social media and news articles

    • Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

    Potential Applications:

    • Stock price prediction

    • Portfolio optimization

    • Algorithmic trading

    • Market sentiment analysis

    • Risk management

    Use Cases:

    • Researchers investigating the effectiveness of machine learning in stock market prediction

    • Analysts developing quantitative trading Buy/Sell strategies

    • Individuals interested in building their own stock market prediction models

    • Students learning about machine learning and financial applications

    Additional Notes:

    • The dataset may include different levels of granularity (e.g., daily, hourly)

    • Data cleaning and preprocessing are essential before model training

    • Regular updates are recommended to maintain the accuracy and relevance of the data

  5. c

    Yahoo Stocks Dataset

    • crawlfeeds.com
    csv, zip
    Updated Apr 27, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Crawl Feeds (2025). Yahoo Stocks Dataset [Dataset]. https://crawlfeeds.com/datasets/yahoo-stocks-dataset
    Explore at:
    zip, csvAvailable download formats
    Dataset updated
    Apr 27, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policyhttps://crawlfeeds.com/privacy_policy

    Description

    The Yahoo Stocks Dataset is an invaluable resource for analysts, traders, and developers looking to enhance their financial data models or trading strategies. Sourced from Yahoo Finance, this dataset includes historical stock prices, market trends, and financial indicators. With its accurate and comprehensive data, it empowers users to analyze patterns, forecast trends, and build robust machine learning models.

    Whether you're a seasoned stock market analyst or a beginner in financial data science, this dataset is tailored to meet diverse needs. It features details like stock prices, trading volume, and market capitalization, enabling a deep dive into investment opportunities and market dynamics.

    For machine learning and AI enthusiasts, the Yahoo Stocks Dataset is a goldmine. It’s perfect for developing predictive models, such as stock price forecasting and sentiment analysis. The dataset's structured format ensures seamless integration into Python, R, and other analytics platforms, making data visualization and reporting effortless.

    Additionally, this dataset supports long-term trend analysis, helping investors make informed decisions. It’s also an essential resource for those conducting research in algorithmic trading and portfolio management.

    Key benefits include:

    • Historical Stock Data: Access years of trading data to analyze market behaviors.
    • Versatile Applications: Use it for financial modeling, data analytics, or academic research.
    • SEO Benefits for Finance Websites: Boost your content with insights derived from this dataset.

    Download the Yahoo Stocks Dataset today and harness the power of financial data for your projects. Whether for AI, financial reporting, or trend analysis, this dataset equips you with the tools to succeed in the dynamic world of stock markets.

  6. Cloudflare (NET) Navigates the Web of Growth (Forecast)

    • kappasignal.com
    Updated Sep 26, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    KappaSignal (2024). Cloudflare (NET) Navigates the Web of Growth (Forecast) [Dataset]. https://www.kappasignal.com/2024/09/cloudflare-net-navigates-web-of-growth.html
    Explore at:
    Dataset updated
    Sep 26, 2024
    Dataset authored and provided by
    KappaSignal
    License

    https://www.kappasignal.com/p/legal-disclaimer.htmlhttps://www.kappasignal.com/p/legal-disclaimer.html

    Description

    This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

    Cloudflare (NET) Navigates the Web of Growth

    Financial data:

    • Historical daily stock prices (open, high, low, close, volume)

    • Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)

    • Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)

    Machine learning features:

    • Feature engineering based on financial data and technical indicators

    • Sentiment analysis data from social media and news articles

    • Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

    Potential Applications:

    • Stock price prediction

    • Portfolio optimization

    • Algorithmic trading

    • Market sentiment analysis

    • Risk management

    Use Cases:

    • Researchers investigating the effectiveness of machine learning in stock market prediction

    • Analysts developing quantitative trading Buy/Sell strategies

    • Individuals interested in building their own stock market prediction models

    • Students learning about machine learning and financial applications

    Additional Notes:

    • The dataset may include different levels of granularity (e.g., daily, hourly)

    • Data cleaning and preprocessing are essential before model training

    • Regular updates are recommended to maintain the accuracy and relevance of the data

  7. H

    Enhancing Stock Market Forecasting with Machine Learning A PineScript-Driven...

    • dataverse.harvard.edu
    Updated Nov 19, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Gautam Narla (2024). Enhancing Stock Market Forecasting with Machine Learning A PineScript-Driven Approach [Dataset]. http://doi.org/10.7910/DVN/HF0PFX
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 19, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Gautam Narla
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This study investigates the application of machine learning (ML) models in stock market forecasting, with a focus on their integration using PineScript, a domain-specific language for algorithmic trading. Leveraging diverse datasets, including historical stock prices and market sentiment data, we developed and tested various ML models such as neural networks, decision trees, and linear regression. Rigorous backtesting over multiple timeframes and market conditions allowed us to evaluate their predictive accuracy and financial performance. The neural network model demonstrated the highest accuracy, achieving a 75% success rate, significantly outperforming traditional models. Additionally, trading strategies derived from these ML models yielded a return on investment (ROI) of up to 12%, compared to an 8% benchmark index ROI. These findings underscore the transformative potential of ML in refining trading strategies, providing critical insights for financial analysts, investors, and developers. The study draws on insights from 15 peer-reviewed articles, financial datasets, and industry reports, establishing a robust foundation for future exploration of ML-driven financial forecasting. Tools and Technologies Used †PineScript PineScript, a scripting language integrated within the TradingView platform, was the primary tool used to develop and implement the machine learning models. Its robust features allowed for custom indicator creation, strategy backtesting, and real-time market data analysis. †Python Python was utilized for data preprocessing, model training, and performance evaluation. Key libraries included: Pandas

  8. Stock market predictions

    • kaggle.com
    Updated Feb 18, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Tanishq dublish (2024). Stock market predictions [Dataset]. https://www.kaggle.com/datasets/tanishqdublish/stock-market-predictions
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 18, 2024
    Dataset provided by
    Kaggle
    Authors
    Tanishq dublish
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Actually, I prepare this dataset for students on my Deep Learning and NLP course.

    But I am also very happy to see kagglers play around with it.

    Have fun!

    Description:

    There are two channels of data provided in this dataset:

    News data: I crawled historical news headlines from Reddit WorldNews Channel (/r/worldnews). They are ranked by reddit users' votes, and only the top 25 headlines are considered for a single date. (Range: 2008-06-08 to 2016-07-01)

    Stock data: Dow Jones Industrial Average (DJIA) is used to "prove the concept". (Range: 2008-08-08 to 2016-07-01)

    I provided three data files in .csv format:

    RedditNews.csv: two columns The first column is the "date", and second column is the "news headlines". All news are ranked from top to bottom based on how hot they are. Hence, there are 25 lines for each date.

    DJIA_table.csv: Downloaded directly from Yahoo Finance: check out the web page for more info.

    Combined_News_DJIA.csv: To make things easier for my students, I provide this combined dataset with 27 columns. The first column is "Date", the second is "Label", and the following ones are news headlines ranging from "Top1" to "Top25".

  9. i

    Pakistan Stock Exchange Datasets

    • ieee-dataport.org
    Updated May 12, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Yosha Jawad (2020). Pakistan Stock Exchange Datasets [Dataset]. https://ieee-dataport.org/documents/pakistan-stock-exchange-datasets
    Explore at:
    Dataset updated
    May 12, 2020
    Authors
    Yosha Jawad
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a preprocessed dataset of 2 companies from Pakistan Stock Exchange.

  10. Tweet Sentiment's Impact on Stock Returns

    • kaggle.com
    Updated Jan 16, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    The Devastator (2023). Tweet Sentiment's Impact on Stock Returns [Dataset]. https://www.kaggle.com/datasets/thedevastator/tweet-sentiment-s-impact-on-stock-returns
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 16, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Tweet Sentiment's Impact on Stock Returns

    862,231 Labeled Instances

    By [source]

    About this dataset

    This dataset contains 862,231 labeled tweets and associated stock returns, providing a comprehensive look into the impact of social media on company-level stock market performance. For each tweet, researchers have extracted data such as the date of the tweet and its associated stock symbol, along with metrics such as last price and various returns (1-day return, 2-day return, 3-day return, 7-day return). Also recorded are volatility scores for both 10 day intervals and 30 day intervals. Finally, sentiment scores from both Long Short - Term Memory (LSTM) and TextBlob models have been included to quantify the overall tone in which these messages were delivered. With this dataset you will be able to explore how tweets can affect a company's share prices both short term and long term by leveraging all of these data points for analysis!

    More Datasets

    For more datasets, click here.

    Featured Notebooks

    • 🚨 Your notebook can be here! 🚨!

    How to use the dataset

    In order to use this dataset, users can utilize descriptive statistics such as histograms or regression techniques to establish relationships between tweet content & sentiment with corresponding stock return data points such as 1-day & 7-day returns measurements.

    The primary fields used for analysis include Tweet Text (TWEET), Stock symbol (STOCK), Date (DATE), Closing Price at the time of Tweet (LAST_PRICE) a range of Volatility measures 10 day Volatility(VOLATILITY_10D)and 30 day Volatility(VOLATILITY_30D ) for each Stock which capture changes in market fluctuation during different periods around when Twitter reactions occur. Additionally Sentiment Polarity analysis undertaken via two Machine learning algorithms LSTM Polarity(LSTM_POLARITY)and Textblob polarity provide insight into whether people are expressing positive or negative sentiments about each company at given times which again could influence thereby potentially influence Stock Prices over shorter term periods like 1-Day Returns(1_DAY_RETURN),2-Day Returns(2_DAY_RETURN)or longer term horizon like 7 Day Returns*7DAY RETURNS*.Finally MENTION field indicates if names/acronyms associated with Companies were specifically mentioned in each Tweet or not which gives extra insight into whether company specific contexts were present within individual Tweets aka “Company Relevancy”

    Research Ideas

    • Analyzing the degree to which tweets can influence stock prices. By analyzing relationships between variables such as tweet sentiment and stock returns, correlations can be identified that could be used to inform investment decisions.
    • Exploring natural language processing (NLP) models for predicting future market trends based on textual data such as tweets. Through testing and evaluating different text-based models using this dataset, better predictive models may emerge that can give investors advance warning of upcoming market shifts due to news or other events.
    • Investigating the impact of different types of tweets (positive/negative, factual/opinionated) on stock prices over specific time frames. By studying correlations between the sentiment or nature of a tweet and its effect on stocks, insights may be gained into what sort of news or events have a greater impact on markets in general

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: reduced_dataset-release.csv | Column name | Description | |:----------------------|:-------------------------------------------------------------------------------------------------------| | TWEET | Text of the tweet. (String) | | STOCK | Company's stock mentioned in the tweet. (String) | | DATE | Date the tweet was posted. (Date) | | LAST_PRICE | Company's last price at the time of tweeting. (Float) ...

  11. Probabilistic AI: A New Approach to Artificial Intelligence (Forecast)

    • kappasignal.com
    Updated May 27, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    KappaSignal (2023). Probabilistic AI: A New Approach to Artificial Intelligence (Forecast) [Dataset]. https://www.kappasignal.com/2023/05/probabilistic-ai-new-approach-to.html
    Explore at:
    Dataset updated
    May 27, 2023
    Dataset authored and provided by
    KappaSignal
    License

    https://www.kappasignal.com/p/legal-disclaimer.htmlhttps://www.kappasignal.com/p/legal-disclaimer.html

    Description

    This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

    Probabilistic AI: A New Approach to Artificial Intelligence

    Financial data:

    • Historical daily stock prices (open, high, low, close, volume)

    • Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)

    • Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)

    Machine learning features:

    • Feature engineering based on financial data and technical indicators

    • Sentiment analysis data from social media and news articles

    • Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

    Potential Applications:

    • Stock price prediction

    • Portfolio optimization

    • Algorithmic trading

    • Market sentiment analysis

    • Risk management

    Use Cases:

    • Researchers investigating the effectiveness of machine learning in stock market prediction

    • Analysts developing quantitative trading Buy/Sell strategies

    • Individuals interested in building their own stock market prediction models

    • Students learning about machine learning and financial applications

    Additional Notes:

    • The dataset may include different levels of granularity (e.g., daily, hourly)

    • Data cleaning and preprocessing are essential before model training

    • Regular updates are recommended to maintain the accuracy and relevance of the data

  12. H

    Stock Market Next Day Forecast Data for

    • dataverse.harvard.edu
    Updated May 15, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Fairuz Maulana (2025). Stock Market Next Day Forecast Data for [Dataset]. http://doi.org/10.7910/DVN/UXVEZ3
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 15, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Fairuz Maulana
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Stock market forecasting remains a complex and challenging task to forecast, traditional technical analysis methods such as RSI, EMA, and Candlestick Patterns often fail to analyze the stock market time series pattern with many recent studies now exploring forecasting using machine learning or neural networks, other studies have improved accuracy or decreased regression losses by applying technical indicators and sentiment analysis. This dataset aims to be used to analyze the performance of machine learning models in predicting the next day's stock market trend by combining technical and sentiment-based features. The technical indicators are derived from historical price data focusing on swing trends in the market and sentiment features are extracted using FinBERT from Benzinga Pro as a reliable financial news source. There are limitations to the dataset especially financial news articles. Limitations such as the availability of news data remain a major challenge that has the potential to improve the performance of a machine learning model.

  13. Stock Market: Historical Data of Top 10 Companies

    • kaggle.com
    Updated Jul 18, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Khushi Pitroda (2023). Stock Market: Historical Data of Top 10 Companies [Dataset]. https://www.kaggle.com/datasets/khushipitroda/stock-market-historical-data-of-top-10-companies/data
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 18, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Khushi Pitroda
    Description

    The dataset contains a total of 25,161 rows, each row representing the stock market data for a specific company on a given date. The information collected through web scraping from www.nasdaq.com includes the stock prices and trading volumes for the companies listed, such as Apple, Starbucks, Microsoft, Cisco Systems, Qualcomm, Meta, Amazon.com, Tesla, Advanced Micro Devices, and Netflix.

    Data Analysis Tasks:

    1) Exploratory Data Analysis (EDA): Analyze the distribution of stock prices and volumes for each company over time. Visualize trends, seasonality, and patterns in the stock market data using line charts, bar plots, and heatmaps.

    2)Correlation Analysis: Investigate the correlations between the closing prices of different companies to identify potential relationships. Calculate correlation coefficients and visualize correlation matrices.

    3)Top Performers Identification: Identify the top-performing companies based on their stock price growth and trading volumes over a specific time period.

    4)Market Sentiment Analysis: Perform sentiment analysis using Natural Language Processing (NLP) techniques on news headlines related to each company. Determine whether positive or negative news impacts the stock prices and volumes.

    5)Volatility Analysis: Calculate the volatility of each company's stock prices using metrics like Standard Deviation or Bollinger Bands. Analyze how volatile stocks are in comparison to others.

    Machine Learning Tasks:

    1)Stock Price Prediction: Use time-series forecasting models like ARIMA, SARIMA, or Prophet to predict future stock prices for a particular company. Evaluate the models' performance using metrics like Mean Squared Error (MSE) or Root Mean Squared Error (RMSE).

    2)Classification of Stock Movements: Create a binary classification model to predict whether a stock will rise or fall on the next trading day. Utilize features like historical price changes, volumes, and technical indicators for the predictions. Implement classifiers such as Logistic Regression, Random Forest, or Support Vector Machines (SVM).

    3)Clustering Analysis: Cluster companies based on their historical stock performance using unsupervised learning algorithms like K-means clustering. Explore if companies with similar stock price patterns belong to specific industry sectors.

    4)Anomaly Detection: Detect anomalies in stock prices or trading volumes that deviate significantly from the historical trends. Use techniques like Isolation Forest or One-Class SVM for anomaly detection.

    5)Reinforcement Learning for Portfolio Optimization: Formulate the stock market data as a reinforcement learning problem to optimize a portfolio's performance. Apply algorithms like Q-Learning or Deep Q-Networks (DQN) to learn the optimal trading strategy.

    The dataset provided on Kaggle, titled "Stock Market Stars: Historical Data of Top 10 Companies," is intended for learning purposes only. The data has been gathered from public sources, specifically from web scraping www.nasdaq.com, and is presented in good faith to facilitate educational and research endeavors related to stock market analysis and data science.

    It is essential to acknowledge that while we have taken reasonable measures to ensure the accuracy and reliability of the data, we do not guarantee its completeness or correctness. The information provided in this dataset may contain errors, inaccuracies, or omissions. Users are advised to use this dataset at their own risk and are responsible for verifying the data's integrity for their specific applications.

    This dataset is not intended for any commercial or legal use, and any reliance on the data for financial or investment decisions is not recommended. We disclaim any responsibility or liability for any damages, losses, or consequences arising from the use of this dataset.

    By accessing and utilizing this dataset on Kaggle, you agree to abide by these terms and conditions and understand that it is solely intended for educational and research purposes.

    Please note that the dataset's contents, including the stock market data and company names, are subject to copyright and other proprietary rights of the respective sources. Users are advised to adhere to all applicable laws and regulations related to data usage, intellectual property, and any other relevant legal obligations.

    In summary, this dataset is provided "as is" for learning purposes, without any warranties or guarantees, and users should exercise due diligence and judgment when using the data for any purpose.

  14. f

    S1 Data -

    • figshare.com
    zip
    Updated Nov 27, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Hongli Niu; Qiaoying Pan; Kunliang Xu (2023). S1 Data - [Dataset]. http://doi.org/10.1371/journal.pone.0294460.s001
    Explore at:
    zipAvailable download formats
    Dataset updated
    Nov 27, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Hongli Niu; Qiaoying Pan; Kunliang Xu
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The prediction of stock prices has long been a captivating subject in academic research. This study aims to forecast the prices of prominent stocks in five key industries of the Chinese A-share market by leveraging the synergistic power of deep learning techniques and investor sentiment analysis. To achieve this, a sentiment multi-classification dataset is for the first time constructed for China’s stock market, based on four types of sentiments in modern psychology. The significant heterogeneity of sentiment changes in the sectors’ leading stock markets is trained and mined using the Bi-LSTM-ATT model. The impact of multi-classification investor sentiment on stock price prediction was analyzed using the CNN-Bi-LSTM-ATT model. It finds that integrating sentiment indicators into the prediction of industry leading stock prices can enhance the accuracy of the model. Drawing upon four fundamental sentiment types derived from modern psychology, our dataset provides a comprehensive framework for analyzing investor sentiment and its impact on forecasting the stock prices of China’s A-share market.

  15. k

    Machine Learning Predicts QQQ to Increase in Value by 5% in the Next 3...

    • kappasignal.com
    Updated Jun 2, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    KappaSignal (2023). Machine Learning Predicts QQQ to Increase in Value by 5% in the Next 3 Months (Forecast) [Dataset]. https://www.kappasignal.com/2023/06/machine-learning-predicts-qqq-to.html
    Explore at:
    Dataset updated
    Jun 2, 2023
    Dataset authored and provided by
    KappaSignal
    License

    https://www.kappasignal.com/p/legal-disclaimer.htmlhttps://www.kappasignal.com/p/legal-disclaimer.html

    Description

    This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

    Machine Learning Predicts QQQ to Increase in Value by 5% in the Next 3 Months

    Financial data:

    • Historical daily stock prices (open, high, low, close, volume)

    • Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)

    • Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)

    Machine learning features:

    • Feature engineering based on financial data and technical indicators

    • Sentiment analysis data from social media and news articles

    • Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

    Potential Applications:

    • Stock price prediction

    • Portfolio optimization

    • Algorithmic trading

    • Market sentiment analysis

    • Risk management

    Use Cases:

    • Researchers investigating the effectiveness of machine learning in stock market prediction

    • Analysts developing quantitative trading Buy/Sell strategies

    • Individuals interested in building their own stock market prediction models

    • Students learning about machine learning and financial applications

    Additional Notes:

    • The dataset may include different levels of granularity (e.g., daily, hourly)

    • Data cleaning and preprocessing are essential before model training

    • Regular updates are recommended to maintain the accuracy and relevance of the data

  16. o

    Massive Stock Sentiment Dataset

    • opendatabay.com
    .undefined
    Updated Jul 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Datasimple (2025). Massive Stock Sentiment Dataset [Dataset]. https://www.opendatabay.com/data/ai-ml/d0828f81-ab19-4e17-9195-b32bad95268c
    Explore at:
    .undefinedAvailable download formats
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Datasimple
    Area covered
    Finance & Banking Analytics
    Description

    This dataset provides a substantial collection of news sentences paired with their corresponding sentiment, primarily intended for financial analysis and stock prediction. With over 100,000 rows, each entry indicates whether the news is positive (represented by '1') or negative/neutral (represented by '0'), offering insights into potential stock movement. A positive sentiment suggests a likely increase in stock value, while a negative or neutral sentiment indicates a likely decrease [1, 2]. It is noted that the data within this dataset is not shuffled [2].

    Columns

    • Sentiment: A numerical label indicating the sentiment of the news sentence. A value of 0 denotes negative or neutral sentiment, suggesting a stock price might go down. A value of 1 denotes positive sentiment, suggesting a stock price might go up [1, 2]. There are 53,026 instances of 0 and 55,725 instances of 1, making a total of 108,301 unique values in this column [3].
    • Sentence: The actual text of the news article sentence [1, 2]. This column contains the textual data analysed for sentiment.

    Distribution

    The dataset typically comes in CSV format [4] and consists of over 100,000 rows of data [2]. It includes two primary columns: 'Sentiment' and 'Sentence' [1]. The data is presented in an unshuffled order [2]. Specific numbers for records are available for each sentiment label: 53,026 rows for sentiment '0' and 55,725 rows for sentiment '1' [3].

    Usage

    This dataset is ideal for news sentiment analysis and stock prediction [1]. It can be employed to train machine learning models to forecast stock market movements based on news sentiment [1, 2]. Other use cases include developing financial analytics tools, performing large-scale text analysis on financial news, and researching the correlation between media sentiment and economic indicators [2].

    Coverage

    The dataset's regional scope is global [5]. The time range of the data is not specified in the provided information. No specific demographic scope is mentioned for the news sources or the subjects of the news.

    License

    CC-BY-NC

    Who Can Use It

    This dataset is particularly useful for: * Data Scientists and Machine Learning Engineers: For building and training Natural Language Processing (NLP) models to analyse sentiment in text and predict financial outcomes [2]. * Financial Analysts and Researchers: To gain insights into how news sentiment impacts stock performance and for market forecasting [1]. * Developers: To integrate sentiment analysis capabilities into financial applications or trading algorithms. * Academics: For research into financial economics, sentiment analysis, and predictive analytics.

    Dataset Name Suggestions

    • Stock News Sentiment for Market Prediction
    • Financial News Sentiment Analysis Dataset
    • Massive Stock Sentiment Data
    • Market News Sentiment for Stock Forecasting

    Attributes

    Original Data Source: Stock News Sentiment Analysis(Massive Dataset)

  17. Financial Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Dec 5, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bright Data (2023). Financial Datasets [Dataset]. https://brightdata.com/products/datasets/news/financial
    Explore at:
    .json, .csv, .xlsxAvailable download formats
    Dataset updated
    Dec 5, 2023
    Dataset authored and provided by
    Bright Datahttps://brightdata.com/
    License

    https://brightdata.com/licensehttps://brightdata.com/license

    Area covered
    Worldwide
    Description

    Stay informed with our comprehensive Financial News Dataset, designed for investors, analysts, and businesses to track market trends, monitor financial events, and make data-driven decisions.

    Dataset Features

    Financial News Articles: Access structured financial news data, including headlines, summaries, full articles, publication dates, and source details. Market & Economic Indicators: Track financial reports, stock market updates, economic forecasts, and corporate earnings announcements. Sentiment & Trend Analysis: Analyze news sentiment, categorize articles by financial topics, and monitor emerging trends in global markets. Historical & Real-Time Data: Retrieve historical financial news archives or access continuously updated feeds for real-time insights.

    Customizable Subsets for Specific Needs Our Financial News Dataset is fully customizable, allowing you to filter data based on publication date, region, financial topics, sentiment, or specific news sources. Whether you need broad coverage for market research or focused data for investment analysis, we tailor the dataset to your needs.

    Popular Use Cases

    Investment Strategy & Risk Management: Monitor financial news to assess market risks, identify investment opportunities, and optimize trading strategies. Market & Competitive Intelligence: Track industry trends, competitor financial performance, and economic developments. AI & Machine Learning Training: Use structured financial news data to train AI models for sentiment analysis, stock prediction, and automated trading. Regulatory & Compliance Monitoring: Stay updated on financial regulations, policy changes, and corporate governance news. Economic Research & Forecasting: Analyze financial news trends to predict economic shifts and market movements.

    Whether you're tracking stock market trends, analyzing financial sentiment, or training AI models, our Financial News Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.

  18. Beat US Stock market (2019 edition)

    • kaggle.com
    Updated Jan 13, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Nicolas Carbone (2020). Beat US Stock market (2019 edition) [Dataset]. https://www.kaggle.com/datasets/cnic92/beat-us-stock-market-data
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 13, 2020
    Dataset provided by
    Kaggle
    Authors
    Nicolas Carbone
    Description

    Context

    The algorithmic trading space is buzzing with new strategies. Companies have spent billions in infrastructures and R&D to be able to jump ahead of the competition and beat the market. Still, it is well acknowledged that the buy & hold strategy is able to outperform many of the algorithmic strategies, especially in the long-run. However, finding value in stocks is an art that very few mastered, can a computer do that?

    Content

    This Data repo contains two datasets:

    1. Example_2019_price_var.csv. I built this dataset thanks to Financial Modeling Prep API and to pandas_datareader. Each row is a stock from the technology sector of the US stock market (that is available from the aforementioned API, which is free and highly recommended). The column contains the percent price variation of each stock for the year 2019. In other words, it collects the percent price variation of each stock from the first trading day on Jan 2019 to the last trading day of Dec 2019. To compute this price variation I decided to consider the Adjusted Close Price.

    2. Example_DATASET.csv. I built this dataset thanks to Financial Modeling Prep API. Each row is a stock from the technology sector of the US stock market (that is available from the aforementioned API). Each column is a financial indicator that can be found in the 2018 10-K filings of each company. There are no Nans or empty cells. Furthermore, the last column is the CLASS of each stock, where:

      1. class = 1 if the price of the stock increases during 2019
      2. class = 0 if the price of the stock decreases during 2019

    In other words, the last column is used to classify each stock in buy-worthy or not, and this relationship is what should allow a machine learning model to learn to recognize stocks that will increase their value from those that won't.

    NOTE: the number of stocks does not match between the two datasets because the API did not have all the required financial indicators for some stocks. It is possible to remove from Example_2019_price_var.csv those rows that do not appear in Example_DATASET.csv.

    Inspiration

    I built this dataset during the 2019 winter holidays period, because I wanted to answer a simple question: is it possible to have a machine learning model learn the differences between stocks that perform well and those that don't, and then leverage this knowledge in order to predict which stock will be worth buying? Moreover, is it possible to achieve this simply by looking at financial indicators found in the 10-K filings?

  19. o

    Nairobi Securities Exchange Prices 2008-2012 for 6 selected stocks

    • explore.openaire.eu
    • data.mendeley.com
    Updated Mar 10, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Barack Wanjawa (2020). Nairobi Securities Exchange Prices 2008-2012 for 6 selected stocks [Dataset]. http://doi.org/10.17632/95fb84nzcd
    Explore at:
    Dataset updated
    Mar 10, 2020
    Authors
    Barack Wanjawa
    Description

    Stock market prediction remains active research in a quest to inform investors on how to trade (buy/sell) at the most opportune time. The prevalent methods used by stock market players in trying to predict the likely future trade prices are either technical, fundamental or time series analysis. This research wanted to try out machine learning methods, in contrast to the existing prevalent methods. Artificial neural networks (ANNs) tend to be the preferred machine learning method for this type of application. However, ANNs require some historical data to learn from, in order to do predictions. The research used an ANN model to test the hypothesis that the next day price (prediction) can be determined from the stock prices of the immediate last five days. The final ANN model used for the tests was a feedforward multi-layer perceptron (MLP) with error backpropagation, using sigmoid activation function, with network configuration 5:21:21:1. The data period used was a 5-year dataset (2008 to 2012), with 80% of the data (4-year data) used for training and the balance 20% used for testing (last 1-year data). The original raw data for Nairobi Securities Exchange (NSE) was scrapped from a publicly available and accessible website of a stock market analysis company in Kenya (Synergy, 2020). This daily prices data was first exported to a spreadsheet, then cleaned off headers and other redundant information, leaving only the data with stock name, date of trade and the related data such as volumes, low prices, high prices and adjusted prices. The data was further sorted by the stock names and then the trading dates. The data dimension was finally reduced to only what was needed for the research, which was the stock name, the date of trade and the adjusted price (average trade price). This final dataset was in CSV format, as hereby presented. The research tested three NSE stocks with the mean absolute percentage error (MAPE) ranging between 0.77% to 1.91%, over the 3-month testing period, while the root mean squared error (RMSE) ranged between 1.83 and 3.07. This raw data can be used to train and test any machine learning model that requires training and testing data. The data can also be used to validate and reproduce the results already presented in this research. There could be slight variance between what is obtained when reproducing the results, due to the differences in the final exact weights that the trained ANN model eventually achieves. However, these differences should not be significant. List of data files on this dataset: stock01_NSE_01jan2008_to_31dec2012_Kakuzi.csv stock02_NSE_01jan2008_to_31dec2012_StandardBank.csv stock03_NSE_01jan2008_to_31dec2012_KenyaAirways.csv stock04_NSE_01jan2008_to_31dec2012_BamburiCement.csv stock05_NSE_01jan2008_to_31dec2012_Kengen.csv stock06_NSE_01jan2008_to_31dec2012_BAT.csv References: Synergy Systems Ltd. (2020). MyStocks. Retrieved March 9, 2020, from http://live.mystocks.co.ke/

  20. Can we predict stock market using machine learning? (QRVO Stock Forecast)...

    • kappasignal.com
    Updated Sep 4, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    KappaSignal (2022). Can we predict stock market using machine learning? (QRVO Stock Forecast) (Forecast) [Dataset]. https://www.kappasignal.com/2022/09/can-we-predict-stock-market-using_18.html
    Explore at:
    Dataset updated
    Sep 4, 2022
    Dataset authored and provided by
    KappaSignal
    License

    https://www.kappasignal.com/p/legal-disclaimer.htmlhttps://www.kappasignal.com/p/legal-disclaimer.html

    Description

    This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

    Can we predict stock market using machine learning? (QRVO Stock Forecast)

    Financial data:

    • Historical daily stock prices (open, high, low, close, volume)

    • Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)

    • Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)

    Machine learning features:

    • Feature engineering based on financial data and technical indicators

    • Sentiment analysis data from social media and news articles

    • Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

    Potential Applications:

    • Stock price prediction

    • Portfolio optimization

    • Algorithmic trading

    • Market sentiment analysis

    • Risk management

    Use Cases:

    • Researchers investigating the effectiveness of machine learning in stock market prediction

    • Analysts developing quantitative trading Buy/Sell strategies

    • Individuals interested in building their own stock market prediction models

    • Students learning about machine learning and financial applications

    Additional Notes:

    • The dataset may include different levels of granularity (e.g., daily, hourly)

    • Data cleaning and preprocessing are essential before model training

    • Regular updates are recommended to maintain the accuracy and relevance of the data

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Umara Umar (2024). Dataset for Stock Market Prediction [Dataset]. https://ieee-dataport.org/documents/dataset-stock-market-prediction

Dataset for Stock Market Prediction

Explore at:
10 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Jul 8, 2024
Authors
Umara Umar
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Hascol

Search
Clear search
Close search
Google apps
Main menu