The dataset contains 25,161 rows, each representing the stock market data for one company on a given date. The data, collected by web scraping www.nasdaq.com, includes stock prices and trading volumes for ten companies: Apple, Starbucks, Microsoft, Cisco Systems, Qualcomm, Meta, Amazon.com, Tesla, Advanced Micro Devices, and Netflix.
Data Analysis Tasks:
1) Exploratory Data Analysis (EDA): Analyze the distribution of stock prices and volumes for each company over time. Visualize trends, seasonality, and patterns in the stock market data using line charts, bar plots, and heatmaps.
2) Correlation Analysis: Investigate the correlations between the closing prices of different companies to identify potential relationships. Calculate correlation coefficients and visualize correlation matrices.
3) Top Performers Identification: Identify the top-performing companies based on their stock price growth and trading volumes over a specific time period.
4) Market Sentiment Analysis: Perform sentiment analysis using Natural Language Processing (NLP) techniques on news headlines related to each company. Determine whether positive or negative news impacts the stock prices and volumes.
5) Volatility Analysis: Calculate the volatility of each company's stock prices using metrics like Standard Deviation or Bollinger Bands. Analyze how volatile stocks are in comparison to others.
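The correlation and volatility tasks above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the dataset author's method: the price lists are made-up values for two hypothetical tickers, the function names are invented for this example, and the bands use the sample standard deviation (some implementations use the population version instead).

```python
import math


def daily_returns(closes):
    """Simple day-over-day returns from a list of closing prices."""
    return [(cur - prev) / prev for prev, cur in zip(closes, closes[1:])]


def mean(xs):
    return sum(xs) / len(xs)


def sample_std(xs):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def bollinger_bands(closes, window=20, k=2.0):
    """Return (lower, middle, upper) for every full window of closes."""
    bands = []
    for end in range(window, len(closes) + 1):
        chunk = closes[end - window:end]
        mid = mean(chunk)          # moving average of the window
        sd = sample_std(chunk)     # dispersion inside the window
        bands.append((mid - k * sd, mid, mid + k * sd))
    return bands


# Made-up closing prices for two hypothetical tickers (illustration only).
closes_a = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0]
closes_b = [50.0, 51.5, 50.5, 53.0, 54.5, 53.5]

price_correlation = pearson(closes_a, closes_b)
returns_a = daily_returns(closes_a)
volatility_a = sample_std(returns_a)
bands_a = bollinger_bands(closes_a, window=3)
```

The same `pearson` function can be applied to daily returns instead of closing prices, which is a common alternative when the price series trend strongly.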
Machine Learning Tasks:
1) Stock Price Prediction: Use time-series forecasting models like ARIMA, SARIMA, or Prophet to predict future stock prices for a particular company. Evaluate the models' performance using metrics like Mean Squared Error (MSE) or Root Mean Squared Error (RMSE).
2) Classification of Stock Movements: Create a binary classification model to predict whether a stock will rise or fall on the next trading day. Utilize features like historical price changes, volumes, and technical indicators for the predictions. Implement classifiers such as Logistic Regression, Random Forest, or Support Vector Machines (SVM).
3) Clustering Analysis: Cluster companies based on their historical stock performance using unsupervised learning algorithms like K-means clustering. Explore if companies with similar stock price patterns belong to specific industry sectors.
4) Anomaly Detection: Detect anomalies in stock prices or trading volumes that deviate significantly from the historical trends. Use techniques like Isolation Forest or One-Class SVM for anomaly detection.
5) Reinforcement Learning for Portfolio Optimization: Formulate the stock market data as a reinforcement learning problem to optimize a portfolio's performance. Apply algorithms like Q-Learning or Deep Q-Networks (DQN) to learn the optimal trading strategy.
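Two building blocks behind the prediction and classification tasks above can be sketched as follows: rise/fall labels for the next trading day, and RMSE measured against a naive baseline. This is only a sketch under assumptions. The prices are made up, the function names are invented for this example, and the "tomorrow equals today" forecast is a stand-in baseline, not ARIMA, SARIMA, or Prophet.

```python
import math


def next_day_labels(closes):
    """Label each day 1 if the next close is higher than that day's close, else 0."""
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up closing prices for one hypothetical ticker (illustration only).
closes = [10.0, 11.0, 10.5, 10.5, 12.0]

# One label per day except the last, which has no "next day" yet.
labels = next_day_labels(closes)

# Naive persistence baseline: predict that tomorrow's close equals today's.
naive_forecast = closes[:-1]
baseline_rmse = rmse(closes[1:], naive_forecast)
```

A forecasting model is usually only worth keeping if its RMSE on held-out dates beats a baseline of this kind.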
The dataset provided on Kaggle, titled "Stock Market Stars: Historical Data of Top 10 Companies," is intended for learning purposes only. The data has been gathered from public sources, specifically from web scraping www.nasdaq.com, and is presented in good faith to facilitate educational and research endeavors related to stock market analysis and data science.
It is essential to acknowledge that while we have taken reasonable measures to ensure the accuracy and reliability of the data, we do not guarantee its completeness or correctness. The information provided in this dataset may contain errors, inaccuracies, or omissions. Users are advised to use this dataset at their own risk and are responsible for verifying the data's integrity for their specific applications.
This dataset is not intended for any commercial or legal use, and any reliance on the data for financial or investment decisions is not recommended. We disclaim any responsibility or liability for any damages, losses, or consequences arising from the use of this dataset.
By accessing and utilizing this dataset on Kaggle, you agree to abide by these terms and conditions and understand that it is solely intended for educational and research purposes.
Please note that the dataset's contents, including the stock market data and company names, are subject to copyright and other proprietary rights of the respective sources. Users are advised to adhere to all applicable laws and regulations related to data usage, intellectual property, and any other relevant legal obligations.
In summary, this dataset is provided "as is" for learning purposes, without any warranties or guarantees, and users should exercise due diligence and judgment when using the data for any purpose.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The main stock market index of the United States, the US500, rose to 6008 points on June 9, 2025, gaining 0.13% from the previous session. Over the past month, the index has climbed 2.80% and is up 12.07% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index. United States Stock Market Index - values, historical data, forecasts and news - updated in June 2025.
https://www.gnu.org/licenses/gpl-3.0.html
This dataset consists of five CSV files that provide detailed data on a stock portfolio and related market performance over the last 5 years. It includes portfolio positions, stock prices, and major U.S. market indices (NASDAQ, S&P 500, and Dow Jones). The data is essential for conducting portfolio analysis, financial modeling, and performance tracking.
This file contains the portfolio composition with details about individual stock positions, including the quantity of shares, sector, and their respective weights in the portfolio. The data also includes the stock's closing price.
Ticker: The stock symbol (e.g., AAPL, TSLA)
Quantity: The number of shares in the portfolio
Sector: The sector the stock belongs to (e.g., Technology, Healthcare)
Close: The closing price of the stock
Weight: The weight of the stock in the portfolio (as a percentage of total portfolio)
This file contains historical pricing data for the stocks in the portfolio. It includes daily open, high, low, close prices, adjusted close prices, returns, and volume of traded stocks.
Date: The date of the data point
Ticker: The stock symbol
Open: The opening price of the stock on that day
High: The highest price reached on that day
Low: The lowest price reached on that day
Close: The closing price of the stock
Adjusted: The adjusted closing price after stock splits and dividends
Returns: Daily percentage return based on close prices
Volume: The volume of shares traded that day
This file contains historical pricing data for the NASDAQ Composite index, providing similar data as in the Portfolio Prices file, but for the NASDAQ market index.
Date: The date of the data point
Ticker: The stock symbol (for NASDAQ index, this will be "IXIC")
Open: The opening price of the index
High: The highest value reached on that day
Low: The lowest value reached on that day
Close: The closing value of the index
Adjusted: The adjusted closing value after any corporate actions
Returns: Daily percentage return based on close values
Volume: The volume of shares traded
This file contains similar historical pricing data, but for the S&P 500 index, providing insights into the performance of the top 500 U.S. companies.
Date: The date of the data point
Ticker: The stock symbol (for S&P 500 index, this will be "SPX")
Open: The opening price of the index
High: The highest value reached on that day
Low: The lowest value reached on that day
Close: The closing value of the index
Adjusted: The adjusted closing value after any corporate actions
Returns: Daily percentage return based on close values
Volume: The volume of shares traded
This file contains similar historical pricing data for the Dow Jones Industrial Average, providing insights into one of the most widely followed stock market indices in the world.
Date: The date of the data point
Ticker: The stock symbol (for Dow Jones index, this will be "DJI")
Open: The opening price of the index
High: The highest value reached on that day
Low: The lowest value reached on that day
Close: The closing value of the index
Adjusted: The adjusted closing value after any corporate actions
Returns: Daily percentage return based on close values
Volume: The volume of shares traded
This data is received using a custom framework that fetches real-time and historical stock data from Yahoo Finance. It provides the portfolio’s data based on user-specific stock holdings and performance, allowing for personalized analysis. The personal framework ensures the portfolio data is automatically retrieved and updated with the latest stock prices, returns, and performance metrics.
This part of the dataset would typically involve data specific to a particular user’s stock positions, weights, and performance, which can be integrated with the other files for portfolio performance analysis.
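To make the Weight and Returns fields concrete, here is a small sketch of how such values could be derived from Quantity and Close. It assumes that a weight is the position's market value as a share of total portfolio value and that returns are simple close-to-close percentage changes; the dataset description does not spell out either formula, so both are assumptions, and the positions and prices below are made up.

```python
def portfolio_weights(positions):
    """Map each ticker to its share of total market value, in percent.

    `positions` is a list of dicts with the keys Ticker, Quantity and Close,
    mirroring the column names described for the portfolio file.
    """
    values = {p["Ticker"]: p["Quantity"] * p["Close"] for p in positions}
    total = sum(values.values())
    return {ticker: 100.0 * value / total for ticker, value in values.items()}


def daily_pct_returns(closes):
    """Close-to-close percentage returns; the first day has no return."""
    return [100.0 * (cur - prev) / prev for prev, cur in zip(closes, closes[1:])]


# Hypothetical holdings and prices (illustration only).
positions = [
    {"Ticker": "AAPL", "Quantity": 10, "Close": 200.0},
    {"Ticker": "TSLA", "Quantity": 5, "Close": 400.0},
]
weights = portfolio_weights(positions)
returns = daily_pct_returns([100.0, 110.0, 99.0])
```

Comparing recomputed values like these with the Weight and Returns columns in the files is a quick way to check which convention the data actually follows.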
https://brightdata.com/license
Use our Stock Market dataset to access comprehensive financial and corporate data, including company profiles, stock prices, market capitalization, revenue, and key performance metrics. This dataset is tailored for financial analysts, investors, and researchers to analyze market trends and evaluate company performance.
Popular use cases include investment research, competitor benchmarking, and trend forecasting. Leverage this dataset to make informed financial decisions, identify growth opportunities, and gain a deeper understanding of the business landscape. The dataset includes all major data points: company name, company ID, summary, stock ticker, earnings date, closing price, previous close, opening price, and much more.
NASDAQ (National Association of Securities Dealers Automated Quotations) is an automated, electronic stock exchange and securities market in the United States and the second largest in the world after the New York Stock Exchange, with more than 8,000 companies and corporations. It has more trading volume per hour than any other stock exchange in the world. More than 7,000 small and mid-cap stocks are traded on the NASDAQ. It is characterized by high-tech companies in electronics, computers, telecommunications, biotechnology, and many other fields. This dataset was created by automatically extracting open, public data available on nasdaq.com using web scraping techniques. It was created solely for academic purposes: https://github.com/jadvani/NasdaqScraper
End-of-day prices refer to the closing prices of various financial instruments, such as equities (stocks), bonds, and indices, at the end of a trading session on a particular trading day. These prices are crucial pieces of market data used by investors, traders, and financial institutions to track the performance and value of these assets over time. The Techsalerator closing prices dataset is considered the most up-to-date, standardized valuation of a security until trading commences again on the next trading day. This data is used for portfolio valuation, index calculation, technical analysis, and benchmarking throughout the financial industry. The End-of-Day Pricing service covers equities, equity derivative bonds, and indices listed on 170 markets worldwide.
This dataset contains detailed information on companies listed on the NASDAQ exchanges. It also includes the market category and the financial status of the listed companies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
DAX
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Historical daily stock market data for 4500+ Nasdaq listed companies. Dataset to be updated quarterly.
Date: dates vary depending on stock, all in DD/MM/YYYY format
Open: open price (in USD)
High: high price (in USD)
Low: low price (in USD)
Close: close price (in USD)
Adj close: adjusted close price (in USD)
Volume: traded volume (in USD)
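Because the dates are stored day-first, a parser that guesses the format can silently swap day and month. A minimal sketch of explicit parsing with the standard library follows; the function name and the sample value are made up for illustration.

```python
from datetime import date, datetime


def parse_ddmmyyyy(text):
    """Parse a DD/MM/YYYY string into a date; raises ValueError if it does not match."""
    return datetime.strptime(text, "%d/%m/%Y").date()


# Day-first: this is 3 April 2021, not 4 March 2021.
parsed = parse_ddmmyyyy("03/04/2021")
```

Passing the format string explicitly also makes malformed rows fail loudly instead of being misread.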
Banner photo: https://www.pexels.com/photo/numbers-on-monitor-534216/
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The main stock market index of the United States, the US500, rose to 6000 points on June 6, 2025, gaining 1.03% from the previous session. Over the past month, the index has climbed 6.55% and is up 12.22% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index. United States Stock Market Index - values, historical data, forecasts and news - updated in June 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States US: Number of Listed Domestic Companies: Total data was reported at 4,336.000 Unit in 2017. This records an increase from the previous number of 4,331.000 Unit for 2016. The data is updated yearly, with a median of 5,930.000 Unit from Dec 1980 to 2017, across 38 observations. The data reached an all-time high of 8,090.000 Unit in 1996 and a record low of 4,102.000 Unit in 2012. The data remains in active status in CEIC and is reported by the World Bank. The data is categorized under Global Database’s United States – Table US.World Bank.WDI: Financial Sector. Listed domestic companies, including foreign companies which are exclusively listed, are those which have shares listed on an exchange at the end of the year. Investment funds, unit trusts, and companies whose only business goal is to hold shares of other listed companies, such as holding companies and investment companies, regardless of their legal status, are excluded. A company with several classes of shares is counted once. Only companies admitted to listing on the exchange are included. Source: World Federation of Exchanges database. Aggregation method: Sum. Stock market data were previously sourced from Standard & Poor's until they discontinued their 'Global Stock Markets Factbook' and database in April 2013. Time series were replaced in December 2015 with data from the World Federation of Exchanges and may differ from the previous S&P definitions and methodology.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The main stock market index in the United States (US500) has increased 31 points, or 0.53%, since the beginning of 2025, according to trading on a contract for difference (CFD) that tracks this benchmark index. United States Stock Market Index - values, historical data, forecasts and news - updated in May 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan's main stock market index, the JP225, rose to 38094 points on June 9, 2025, gaining 0.93% from the previous session. Over the past month, the index has climbed 1.19%, though it remains 2.42% lower than a year ago, according to trading on a contract for difference (CFD) that tracks this benchmark index. Japan Stock Market Index (JP225) - values, historical data, forecasts and news - updated in June 2025.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains almost all the stocks listed on these exchanges as of the date shown in the file name. Some of the symbols cannot be found on Yahoo Finance; I plan to scrape those from CNN Money. Other symbols have different share classes and need some modification before I can make them queryable... I have yet to decide on the best course of action. If you want to know what these excluded symbols are, see excluded_symbols.txt.
Note: some tickers used to be missing because of a poor connection; that has been solved now.
I've also been asked why I don't put everything into one table, and here's my rationale (copy/pasted from my email):
It is possible and I've debated this before, but I've decided to go with individual files for quite a number of reasons, and I highly recommend you consider these before combining them: 1) I don't need to load everything into memory or search for the right rows if I only want to work with particular sets, 2) it is easier and faster to manipulate (append, remove, or whatever) when all the data of a ticker is in the same place, 3) I don't need to repeat ticker names for each row just to know which row belongs to which ticker, 4) it reduces risk, latency, and waits during parallel processing of different ticker data, 5) in case of any unforeseen bad writes or termination, it reduces the chances of affecting the entire dataset and allows for a restart anytime without the need to back things up every 5 minutes. I get all these benefits only at the cost of a slightly larger compressed file and a few more lines of code. To me it's worth it. I can understand if you are frustrated, but it is possible to concatenate everything.
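For readers who do want a single table, the concatenation mentioned above could look roughly like this. It is a sketch under assumptions that the description does not confirm: each ticker lives in its own CSV file named after the ticker (for example `AAPL.csv`), every file has a header row, and all files share the same columns. The function name and the added `ticker` column are invented for this example.

```python
import csv
from pathlib import Path


def concatenate_ticker_files(folder, out_path):
    """Merge per-ticker CSV files into one CSV with a leading 'ticker' column.

    Returns the number of data rows written. Assumes (hypothetically) that
    each file is named <TICKER>.csv and that all files share one header.
    """
    out_resolved = Path(out_path).resolve()
    rows_written = 0
    header_written = False
    with open(out_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for path in sorted(Path(folder).glob("*.csv")):
            if path.resolve() == out_resolved:
                continue  # never read the output file as an input
            with open(path, newline="") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if not header_written:
                    writer.writerow(["ticker"] + header)
                    header_written = True
                for row in reader:
                    # The file name (without extension) identifies the ticker.
                    writer.writerow([path.stem] + row)
                    rows_written += 1
    return rows_written
```

Rows are streamed file by file, so memory use stays small even with thousands of tickers; the price is exactly the repeated ticker column described in reason 3.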
https://github.com/qks1lver/redtide
Listing files (e.g., NYSE.txt) are from http://eoddata.com/symbols.aspx
Daily historical data compiled from Yahoo Finance
If you have questions, e-mail me: jiunyyen@gmail.com
Happy mining!
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States US: Stocks Traded: Total Value data was reported at 39,785.881 USD bn in 2017. This records a decrease from the previous number of 42,071.330 USD bn for 2016. The data is updated yearly, with a median of 17,934.293 USD bn from Dec 1984 to 2017, across 34 observations. The data reached an all-time high of 47,245.496 USD bn in 2008 and a record low of 1,108.421 USD bn in 1984. The data remains in active status in CEIC and is reported by the World Bank. The data is categorized under Global Database’s United States – Table US.World Bank.WDI: Financial Sector. The value of shares traded is the total number of shares traded, both domestic and foreign, multiplied by their respective matching prices. Figures are single counted (only one side of the transaction is considered). Companies admitted to listing and admitted to trading are included in the data. Data are end of year values converted to U.S. dollars using corresponding year-end foreign exchange rates. Source: World Federation of Exchanges database. Aggregation method: Sum. Stock market data were previously sourced from Standard & Poor's until they discontinued their 'Global Stock Markets Factbook' and database in April 2013. Time series were replaced in December 2015 with data from the World Federation of Exchanges and may differ from the previous S&P definitions and methodology.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The United Kingdom's main stock market index, the GB100, rose to 8838 points on June 6, 2025, gaining 0.30% from the previous session. Over the past month, the index has climbed 3.25% and is up 7.19% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index. United Kingdom Stock Market Index (GB100) - values, historical data, forecasts and news - updated in June 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
China's main stock market index, the SHANGHAI, rose to 3385 points on June 6, 2025, gaining 0.04% from the previous session. Over the past month, the index has climbed 1.28% and is up 10.95% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index. China Shanghai Composite Stock Market Index - values, historical data, forecasts and news - updated in June 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Stock Market Tweets Data
Overview
This dataset is the same as the Stock Market Tweets Data on IEEE by Bruno Taborda.
Data Description
This dataset contains 943,672 tweets collected between April 9 and July 16, 2020, using the S&P 500 tag (#SPX500), the references to the top 25 companies in the S&P 500 index, and the Bloomberg tag (#stocks).
Dataset Structure
created_at: The exact time this tweet was posted. text: The text of the tweet, providing… See the full description on the dataset page: https://huggingface.co/datasets/StephanAkkerman/stock-market-tweets-data.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 14 series, with data starting from 1953 (not all combinations necessarily have data for all years). This table contains data described by the following dimensions (Not all combinations are available): Geography (1 items: Canada ...), Stock market statistics (14 items: Toronto Stock Exchange; value of shares traded; United States common stocks; Dow-Jones industrials; high; United States common stocks; Dow-Jones industrials; low; Toronto Stock Exchange; volume of shares traded ...).
Get Nasdaq real-time and historical data with support for fast market replay at over 19 million book updates per second. Test our data for free with only 4 lines of code.
Nasdaq TotalView-ITCH is a proprietary data feed that disseminates full order book depth and last sale data from the Nasdaq stock market (XNAS). It delivers every quote and order at each price level, along with any event that updates the order book after an order is placed, such as trade executions, modifications, or cancellations. Nasdaq is the most active US equity exchange by volume and represented 13.03% of the average daily volume (ADV) as of January 2025.
With its L3 granularity, Nasdaq TotalView-ITCH captures information beyond the L1, top-of-book data available through SIP feeds and enables more accurate modeling of book imbalances, trade directionality, quote lifetimes, and more. This includes explicit trade aggressor side, odd lots, auction imbalance data, and the Net Order Imbalance Indicator (NOII) for the Nasdaq Opening and Closing Crosses and Nasdaq IPO/Halt Cross—the best predictor of Nasdaq opening and closing prices available. Other key advantages of Nasdaq TotalView-ITCH over SIP data include faster real-time dissemination and precise exchange-side timestamping directly from Nasdaq.
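As a rough illustration of what a "book imbalance" can mean, the sketch below computes one common top-of-book measure: the difference between resting bid and ask size divided by their sum. This is a generic textbook-style definition chosen for illustration, not a formula taken from Nasdaq or from the vendor's documentation, and the function name and input numbers are made up.

```python
def book_imbalance(bid_size, ask_size):
    """Top-of-book imbalance in [-1, 1].

    Positive values mean more size resting on the bid than on the ask.
    Returns 0.0 when both sides are empty.
    """
    total = bid_size + ask_size
    if total == 0:
        return 0.0
    return (bid_size - ask_size) / total


# Hypothetical sizes at the best bid and best ask.
example = book_imbalance(300, 100)
```

With full-depth data the same ratio can be computed over several price levels instead of only the best bid and ask.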
Real-time Nasdaq TotalView-ITCH data is included with a Plus or Unlimited subscription through our Databento US Equities service. Historical data is available for usage-based rates or with any subscription. Visit our pricing page for more details or to upgrade your plan.
Breadth of coverage: 20,329 products
Asset class(es): Equities
Origin: Directly captured at Equinix NY4 (Secaucus, NJ) with an FPGA-based network card and hardware timestamping. Synchronized to UTC with PTP.
Supported data encodings: DBN, CSV, JSON
Supported market data schemas: MBO, MBP-1, MBP-10, BBO-1s, BBO-1m, TBBO, Trades, OHLCV-1s, OHLCV-1m, OHLCV-1h, OHLCV-1d, Definition, Statistics, Status, Imbalance
Resolution: Immediate publication, nanosecond-resolution timestamps