https://creativecommons.org/publicdomain/zero/1.0/
This Stock Market Dataset is designed for predictive analysis and machine learning applications in financial markets. It includes 13647 records of simulated stock trading data with features commonly used in stock price forecasting.
🔹 Key Features
- Date: Trading day timestamps (business days only)
- Open, High, Low, Close: Simulated stock prices
- Volume: Trading volume per day
- RSI (Relative Strength Index): Measures market momentum
- MACD (Moving Average Convergence Divergence): Trend-following momentum indicator
- Sentiment Score: Simulated market sentiment from financial news & social media
- Target: Binary label (1: Price goes up, 0: Price goes down) for next-day prediction

This dataset is useful for training hybrid deep learning models such as LSTM, CNN, and Attention-based networks for stock market forecasting. It enables financial analysts, traders, and AI researchers to experiment with market trends, technical analysis, and sentiment-based predictions.
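The description does not say how the Target column was generated. As a rough sketch of one plausible reading, the label for a day could be derived by comparing its close with the next day's close. The function name `next_day_target` and the sample prices are made up for illustration, and treating an unchanged price as 0 is an assumption.

```python
def next_day_target(closes):
    """Return 1 where the next close is higher than the current one, else 0.

    The last day has no following close, so it gets no label.
    Assumption: an unchanged price is labeled 0.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


# Made-up closing prices, for illustration only.
closes = [100.0, 101.5, 101.0, 101.0, 103.2]
labels = next_day_target(closes)
print(labels)  # [1, 0, 0, 1]
```

Because the label looks one day ahead, the final row of any series has to be dropped before training.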
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains historical daily prices for all tickers currently trading on NASDAQ. The up-to-date list is available from nasdaqtrader.com. The historical data is retrieved from Yahoo Finance via the yfinance Python package.
It contains prices up to April 1, 2020. If you need more recent data, fork and re-run the data collection script, which is also available from Kaggle.
The data for every symbol is saved in CSV format with a common set of fields.
All ticker data is stored in either the ETFs or the stocks folder, depending on its type. Each filename is the corresponding ticker symbol. Finally, symbols_valid_meta.csv contains some additional metadata for each ticker, such as its full name.
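The folder layout described above can be sketched as a small path helper. The helper name `ticker_csv_path`, the root directory `data`, and the exact folder spellings `etfs` and `stocks` are assumptions for illustration; check them against the downloaded archive.

```python
from pathlib import Path


def ticker_csv_path(root, symbol, is_etf):
    """Build the expected CSV path for one ticker.

    Assumption: the two folders are named "etfs" and "stocks".
    """
    folder = "etfs" if is_etf else "stocks"
    return Path(root) / folder / f"{symbol}.csv"


# Hypothetical root directory; AAPL is treated as a stock, not an ETF.
print(ticker_csv_path("data", "AAPL", is_etf=False))
```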
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains 862,231 labeled tweets and associated stock returns, providing a comprehensive look into the impact of social media on company-level stock market performance. For each tweet, researchers have extracted data such as the date of the tweet and its associated stock symbol, along with metrics such as last price and various returns (1-day, 2-day, 3-day, and 7-day). Also recorded are volatility scores for both 10-day and 30-day intervals. Finally, sentiment scores from both Long Short-Term Memory (LSTM) and TextBlob models are included to quantify the overall tone of each message. With this dataset you can explore how tweets affect a company's share price in both the short term and the long term.
To use this dataset, apply descriptive statistics such as histograms, or regression techniques, to relate tweet content and sentiment to the corresponding stock returns, such as the 1-day and 7-day returns.
The primary fields used for analysis are the tweet text (TWEET), stock symbol (STOCK), date (DATE), and closing price at the time of the tweet (LAST_PRICE). Two volatility measures, 10-day volatility (VOLATILITY_10D) and 30-day volatility (VOLATILITY_30D), capture market fluctuation for each stock over different periods around the time of the tweet. Sentiment polarity scores from two models, LSTM polarity (LSTM_POLARITY) and TextBlob polarity, indicate whether people are expressing positive or negative sentiment about each company at a given time, which could in turn influence stock prices over shorter horizons such as 1-day returns (1_DAY_RETURN) and 2-day returns (2_DAY_RETURN), or over a longer horizon such as 7-day returns. Finally, the MENTION field indicates whether names or acronyms associated with a company were specifically mentioned in each tweet, which gives extra insight into whether company-specific context was present ("company relevancy").
- Analyzing the degree to which tweets can influence stock prices. By analyzing relationships between variables such as tweet sentiment and stock returns, correlations can be identified that could be used to inform investment decisions.
- Exploring natural language processing (NLP) models for predicting future market trends based on textual data such as tweets. Through testing and evaluating different text-based models using this dataset, better predictive models may emerge that can give investors advance warning of upcoming market shifts due to news or other events.
- Investigating the impact of different types of tweets (positive/negative, factual/opinionated) on stock prices over specific time frames. By studying correlations between the sentiment or nature of a tweet and its effect on stocks, insights may be gained into what sort of news or events have a greater impact on markets in general
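As a starting point for the first idea, the sketch below computes a Pearson correlation between sentiment polarity and 1-day return. The column names LSTM_POLARITY and 1_DAY_RETURN come from the description above; the five sample rows are made up for illustration and say nothing about the real data.

```python
import csv
import io
from math import sqrt

# Made-up rows for illustration; the real file has many more columns.
SAMPLE = """LSTM_POLARITY,1_DAY_RETURN
0.8,0.012
-0.5,-0.004
0.1,0.003
-0.9,-0.015
0.4,0.001
"""


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
polarity = [float(row["LSTM_POLARITY"]) for row in rows]
returns = [float(row["1_DAY_RETURN"]) for row in rows]
corr = pearson(polarity, returns)
print(f"Pearson correlation = {corr:.3f}")
```

A correlation on the full file would only describe association, not whether tweets cause price moves.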
If you use this dataset in your research, please credit the original authors.
License: CC0 1.0 Universal (CC0 1.0), Public Domain Dedication. No copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: reduced_dataset-release.csv
Column name | Description |
---|---|
TWEET | Text of the tweet. (String) |
STOCK | Company's stock mentioned in the tweet. (String) |
DATE | Date the tweet was posted. (Date) |
LAST_PRICE | Company's last price at the time of tweeting. (Float) |
...
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains the historical stock prices and related financial information for five major technology companies: Apple (AAPL), Microsoft (MSFT), Amazon (AMZN), Google (GOOGL), and Tesla (TSLA). The dataset spans a five-year period from January 1, 2019, to January 1, 2024. It includes key stock metrics such as Open, High, Low, Close, Adjusted Close, and Volume for each trading day.
The data was sourced using the yfinance library in Python, which provides convenient access to historical market data from Yahoo Finance.
The dataset contains the following columns:
- Date: The trading date.
- Open: The opening price of the stock on that date.
- High: The highest price of the stock on that date.
- Low: The lowest price of the stock on that date.
- Close: The closing price of the stock on that date.
- Adj Close: The adjusted closing price, accounting for dividends and splits.
- Volume: The number of shares traded on that date.
- Ticker: The stock ticker symbol representing each company.
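Because all five companies share one table and are distinguished by the Ticker column, most calculations need to be grouped per ticker first. A minimal sketch of per-ticker daily returns follows; the rows are made-up values, and `daily_returns` is a hypothetical helper, not part of the dataset.

```python
from collections import defaultdict

# Made-up (Date, Ticker, Close) rows, for illustration only.
rows = [
    ("2019-01-02", "AAPL", 100.0),
    ("2019-01-02", "MSFT", 50.0),
    ("2019-01-03", "AAPL", 110.0),
    ("2019-01-03", "MSFT", 49.0),
]


def daily_returns(rows):
    """Return {ticker: [(date, simple_return), ...]} from long-format rows."""
    by_ticker = defaultdict(list)
    for date, ticker, close in sorted(rows):  # ISO dates sort chronologically
        by_ticker[ticker].append((date, close))
    result = {}
    for ticker, series in by_ticker.items():
        result[ticker] = [
            (date, close / prev_close - 1.0)
            for (_, prev_close), (date, close) in zip(series, series[1:])
        ]
    return result


returns = daily_returns(rows)
print(returns)
```

Using Adj Close instead of Close would make the returns account for dividends and splits.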
https://creativecommons.org/publicdomain/zero/1.0/
I made this dataset after some rigorous SQL queries and Python coding. It covers all stocks of the Indian stock market, 2,435 stocks in total. The data spans one year: each row represents a stock name, each column represents a date, and the cells hold the closing price. Enjoy, and try some stock price predictions.
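Most time-series tools expect one row per (stock, date) pair rather than one column per date. A minimal sketch of that reshaping follows, assuming a layout like the one described; the header name `Stock`, the date format, and the sample values are assumptions for illustration.

```python
import csv
import io

# Made-up wide table: one row per stock, one column per date.
WIDE_CSV = """Stock,2021-01-01,2021-01-04
AAA,10.0,10.5
BBB,20.0,19.0
"""


def wide_to_long(text, name_column="Stock"):
    """Turn a wide closing-price table into (stock, date, close) tuples."""
    long_rows = []
    for row in csv.DictReader(io.StringIO(text)):
        name = row[name_column]
        for date, close in row.items():
            if date == name_column:
                continue  # skip the stock-name column itself
            long_rows.append((name, date, float(close)))
    return long_rows


long_rows = wide_to_long(WIDE_CSV)
print(long_rows[0])  # ('AAA', '2021-01-01', 10.0)
```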
https://creativecommons.org/publicdomain/zero/1.0/
Disclaimer: Educational Purposes Only
The financial and International Securities Identification Number (ISIN) data listed on this platform is provided solely for educational purposes. The information is intended to serve as general guidance and does not constitute financial advice, an endorsement, or a recommendation for the purchase or sale of any securities.
While we strive to ensure the accuracy and timeliness of the information presented, we make no representations or warranties, express or implied, regarding the completeness, accuracy, reliability, suitability, or availability of the provided data. Users are encouraged to independently verify any information obtained from this platform before making any investment decisions.
This platform and its operators are not responsible for any errors, omissions, or inaccuracies in the provided data, nor for any actions taken in reliance on such information. Users are strongly advised to conduct thorough research and seek the advice of qualified financial professionals before making any investment decisions.
The use of International Securities Identification Numbers (ISINs) and other financial data is subject to various regulations and licensing agreements. Users are responsible for complying with all applicable laws and respecting any terms and conditions associated with the use of such data.
By accessing and using this platform, users acknowledge and agree that they are doing so at their own risk and discretion. This educational content is not a substitute for professional financial advice, and users should consult with qualified professionals for specific guidance tailored to their individual circumstances.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
I originally prepared this dataset for students in my Deep Learning and NLP course.
But I am also very happy to see kagglers play around with it.
Have fun!
Description:
There are two channels of data provided in this dataset:
News data: I crawled historical news headlines from Reddit WorldNews Channel (/r/worldnews). They are ranked by reddit users' votes, and only the top 25 headlines are considered for a single date. (Range: 2008-06-08 to 2016-07-01)
Stock data: Dow Jones Industrial Average (DJIA) is used to "prove the concept". (Range: 2008-08-08 to 2016-07-01)
I provided three data files in .csv format:
RedditNews.csv: two columns. The first column is the "date", and the second column is the "news headlines". All news items are ranked from top to bottom based on how hot they are, so there are 25 lines for each date.
DJIA_table.csv: Downloaded directly from Yahoo Finance: check out the web page for more info.
Combined_News_DJIA.csv: To make things easier for my students, I provide this combined dataset with 27 columns. The first column is "Date", the second is "Label", and the following ones are news headlines ranging from "Top1" to "Top25".
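For text models it is often convenient to merge the 25 headline columns into one string per date. The sketch below does that on a tiny made-up sample with only three headline columns; `combine_headlines` is a hypothetical helper, and the description does not define what the Label values mean, so they are only parsed as integers here.

```python
import csv
import io

# Tiny made-up sample with 3 headline columns instead of 25.
SAMPLE = (
    "Date,Label,Top1,Top2,Top3\n"
    "2008-08-08,0,headline a,headline b,headline c\n"
)


def combine_headlines(row, n_headlines=25):
    """Join the Top1..TopN columns of one row into a single text string."""
    columns = [f"Top{i}" for i in range(1, n_headlines + 1)]
    # Missing or empty headline cells are skipped.
    return " ".join(row[c] for c in columns if row.get(c))


examples = [
    (row["Date"], int(row["Label"]), combine_headlines(row))
    for row in csv.DictReader(io.StringIO(SAMPLE))
]
print(examples[0])
```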
http://opendatacommons.org/licenses/dbcl/1.0/
The subject matter of this dataset explores Tesla's stock price from its initial public offering (IPO) to yesterday.
Within the dataset one will encounter the following:
The date - "Date"
The opening price of the stock - "Open"
The high price of that day - "High"
The low price of that day - "Low"
The closing price of that day - "Close"
The number of shares traded during that day - "Volume"
The stock's closing price, amended to include any distributions or corporate actions that occur before the next day's open - "Adj[usted] Close"
Through Python programming and checking Sentdex out, I acquired the data from Yahoo Finance. The time period represented runs from 06/29/2010 to 03/17/2017.
What happens when the volume of this stock trading increases/decreases in a short and long period of time? What happens when there is a discrepancy between the adjusted close and the next day's opening price?
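One way to start on the second question is to measure the gap between each day's open and the previous day's adjusted close. This is only a sketch: `overnight_gaps` is a hypothetical helper and the prices are made up, not taken from the Tesla data.

```python
def overnight_gaps(rows):
    """Return (date, open - previous adjusted close) for each day after the first.

    rows: list of dicts with "Date", "Open" and "Adj Close", sorted by date.
    """
    return [
        (cur["Date"], cur["Open"] - prev["Adj Close"])
        for prev, cur in zip(rows, rows[1:])
    ]


# Made-up prices, for illustration only.
rows = [
    {"Date": "2010-06-29", "Open": 19.0, "Adj Close": 20.0},
    {"Date": "2010-06-30", "Open": 21.5, "Adj Close": 21.0},
    {"Date": "2010-07-01", "Open": 20.5, "Adj Close": 22.0},
]
gaps = overnight_gaps(rows)
print(gaps)  # [('2010-06-30', 1.5), ('2010-07-01', -0.5)]
```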
This dataset was created by Abrar Ahmed
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Huzaifa Rashid
Released under Apache 2.0
https://creativecommons.org/publicdomain/zero/1.0/
Real, up-to-date market data for cryptocurrencies can be quite expensive and hard to get. However, historical financial data is the starting point for developing algorithms that analyze market trends and, why not, beat the market by predicting market movement.
This dataset provides historical data from the beginning of the OMG-ETH pair market on the Kraken exchange up to the present (December 2021). The data comes from real trades on one of the most popular cryptocurrency exchanges.
Historical market data, also known as trading history, time and sales or tick data, provides a detailed record of every trade that happens on Kraken exchange, and includes the following information:
- Timestamp - The exact date and time of each trade.
- Price - The price at which each trade occurred.
- Volume - The amount of volume that was traded.

In addition, OHLCVT data are provided for the most common period intervals: 1 min, 5 min, 15 min, 1 hour, 12 hours and 1 day. OHLCVT stands for Open, High, Low, Close, Volume and Trades and represents the following trading information for each time period:
- Open - The first traded price
- High - The highest traded price
- Low - The lowest traded price
- Close - The final traded price
- Volume - The total volume traded by all trades
- Trades - The number of individual trades
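The relationship between the tick data and the OHLCVT files can be sketched as a simple aggregation. This is an illustration of the definitions above, not the script used to build the dataset; `to_ohlcvt` is a hypothetical helper, timestamps are assumed to be Unix seconds, and the trades are made up.

```python
def to_ohlcvt(trades, interval_seconds=60):
    """Aggregate (timestamp, price, volume) trades into OHLCVT bars.

    trades must be sorted by timestamp (assumed Unix seconds).
    Returns {bar_start_timestamp: bar_dict}.
    """
    bars = {}
    for timestamp, price, volume in trades:
        start = timestamp - timestamp % interval_seconds
        bar = bars.get(start)
        if bar is None:
            # First trade of the interval sets open, high, low and close.
            bars[start] = {"open": price, "high": price, "low": price,
                           "close": price, "volume": volume, "trades": 1}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last trade seen so far
            bar["volume"] += volume
            bar["trades"] += 1
    return bars


# Made-up trades, for illustration only.
trades = [(0, 1.0, 2.0), (30, 1.5, 1.0), (59, 0.5, 1.0), (60, 0.75, 4.0)]
bars = to_ohlcvt(trades)
print(bars[0])
```

Intervals with no trades produce no bar in this sketch; whether the published files do the same is not stated.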
Don't hesitate to tell me if you need other period intervals 😉 ...
This dataset will be updated every quarter to add new, up-to-date market data. Let me know if you need more frequent updates.
Can you beat the market? Let's see what you can do with this data!
Fixed full version - https://www.kaggle.com/jacksoncrow/stock-market-dataset
This dataset was created by ka.na.o
https://creativecommons.org/publicdomain/zero/1.0/
Stock Market Dataset Columns
The dataset generated using the yfinance library typically contains two types of data:
- Historical Stock Prices
- Company Metadata
This data provides a time series of a stock's market performance. Below are the main columns and their explanations:
Column | Description |
---|---|
Date | The date for the recorded stock data. |
Open | The price at which the stock started trading on that day. |
High | The highest price reached during that day. |
Low | The lowest price reached during that day. |
Close | The price at which the stock closed trading on that day. |
Adj Close | The adjusted closing price accounting for corporate actions like stock splits and dividends. |
Volume | The total number of shares traded on that day. |
Date | Open | High | Low | Close | Adj Close | Volume |
---|---|---|---|---|---|---|
2022-01-03 | 170.0 | 172.5 | 169.2 | 172.0 | 171.2 | 1200000 |
This data provides descriptive information about the company associated with the stock. Columns and their meanings include:
Column | Description |
---|---|
Ticker | The stock ticker symbol (e.g., AAPL for Apple Inc.). |
Company | The full name of the company (e.g., Apple Inc.). |
Sector | The industry sector to which the company belongs (e.g., Technology). |
Industry | The specific industry within the sector (e.g., Consumer Electronics). |
Market Cap | The total market value of the company’s outstanding shares in USD. |
P/E Ratio | The company's Price-to-Earnings ratio, indicating how expensive the stock is relative to its earnings. |
Ticker | Company | Sector | Industry | Market Cap | P/E Ratio |
---|---|---|---|---|---|
AAPL | Apple Inc. | Technology | Consumer Hardware | $2.5 Trillion | 28.3 |
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
- Symbol: A unique identifier for a particular stock on a specific exchange. For example, AAPL represents Apple Inc. on the NASDAQ exchange.
- Name: The full name of the company that issued the stock.
- Currency: The currency in which the stock is traded. Examples include USD (US Dollar), EUR (Euro), and JPY (Japanese Yen).
- Exchange: The stock exchange where the stock is traded. NASDAQ and NYSE are some well-known exchanges.
- MIC Code: The Market Identifier Code, used to uniquely identify a specific exchange or trading venue.
- Country: The country of incorporation of the company that issued the stock.
- Type: The type of the stock.
This dataset was created by Farrukh Ali
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Vzu Boora
Released under CC0: Public Domain
Gold price dataset for stock market analysis. Reference: Quandl, https://www.quandl.com/
By Jon Loyens
This dataset brings together publicly available information from leading stock markets with extensive details about corporate board members. For each company, discover not only its board composition and background, but also the current market dynamics, trends and rule changes affecting it. Whether you're a teacher looking to add more detail to a class presentation or an investor seeking a competitive edge in the market, this dataset provides insights into the world of stocks and those who play an influential role in its direction. Use it to explore hypothetical investments and strategies, or the actual risks associated with established entities today.
Using this dataset, you can gain a better understanding of the relationship between corporate board members and stock market performance. You can analyze the data to determine the average performance of board members at different companies and compare it to the overall performance of other stocks. In addition, you can look into correlations between individual stocks, various industries, and different groups of companies with similar board membership profiles. This dataset provides an overview of all major stocks across multiple industries with detailed insights on each stock's current and past market performance as well as corporate boards
- Analyzing the performance of individual board members in relation to their company’s stock market performance.
- Determining if certain board members are better at making decisions that benefit the company’s stock market position across all companies they have a stake in.
- Identifying correlations between trends in different companies' stocks and external factors such as the influence of particular board members or other events associated with that company's sectors or markets
If you use this dataset in your research, please credit the original authors.
See the dataset description for more information.
File: boardmembers.csv
Column name | Description |
---|---|
BoardMemberName | Name of the board member. (String) |
CompanyName | Name of the company. (String) |
Source | Source of the data. (String) |
If you use this dataset in your research, please credit Jon Loyens.
https://creativecommons.org/publicdomain/zero/1.0/
Netflix, Inc. is an American media company engaged in paid streaming and the production of films and series.
Market capitalization of Netflix (NFLX)
Market cap: $517.08 Billion USD
As of June 2025 Netflix has a market cap of $517.08 Billion USD. This makes Netflix the world's 19th most valuable company by market cap according to our data. The market capitalization, commonly called market cap, is the total market value of a publicly traded company's outstanding shares and is commonly used to measure how much a company is worth.
Revenue for Netflix (NFLX)
Revenue in 2025: $40.17 Billion USD
According to Netflix's latest financial reports, the company's current revenue (TTM) is $40.17 Billion USD. In 2024 the company made a revenue of $39.00 Billion USD, an increase over its 2023 revenue of $33.72 Billion USD. Revenue is the total amount of income that a company generates by the sale of goods or services. Unlike earnings, no expenses are subtracted.
Earnings for Netflix (NFLX)
Earnings in 2025 (TTM): $11.31 Billion USD
According to Netflix's latest financial reports, the company's current earnings (TTM) are $11.31 Billion USD. In 2024 the company made earnings of $10.70 Billion USD, an increase over its 2023 earnings of $7.02 Billion USD. The earnings displayed on this page are the company's pretax income.
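The year-over-year increases mentioned above can be put into percentage terms with a short calculation. The figures are the 2023 and 2024 revenue and earnings quoted in this description (in billions of USD); `pct_change` is a hypothetical helper name.

```python
def pct_change(new, old):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


# Figures in billions of USD, as quoted in the description.
revenue_growth = pct_change(39.00, 33.72)   # 2023 -> 2024 revenue
earnings_growth = pct_change(10.70, 7.02)   # 2023 -> 2024 pretax income

print(f"Revenue growth: {revenue_growth:.1f}%")    # Revenue growth: 15.7%
print(f"Earnings growth: {earnings_growth:.1f}%")  # Earnings growth: 52.4%
```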
On Jun 12th, 2025 the market cap of Netflix was reported to be:
$517.08 Billion USD by Yahoo Finance
$517.08 Billion USD by CompaniesMarketCap
$517.21 Billion USD by Nasdaq
Geography: USA
Time period: May 2002 - June 2025
Unit of analysis: Netflix Stock Data 2025
Variable | Description |
---|---|
date | The trading date. |
open | The price at market open. |
high | The highest price for that day. |
low | The lowest price for that day. |
close | The price at market close, adjusted for splits. |
adj_close | The closing price after adjustments for all applicable splits and dividend distributions. Data is adjusted using appropriate split and dividend multipliers, adhering to Center for Research in Security Prices (CRSP) standards. |
volume | The number of shares traded on that day. |
This dataset belongs to me. I’m sharing it here for free. You may do with it as you wish.