I wanted to scrape data on stocks from Yahoo for ML purposes (predicting prices from other features). The code used for scraping is here: https://github.com/nateGeorge/scrape_stocks
I will update this later if I find time.
https://crawlfeeds.com/privacy_policy
The Yahoo Stocks Dataset is an invaluable resource for analysts, traders, and developers looking to enhance their financial data models or trading strategies. Sourced from Yahoo Finance, this dataset includes historical stock prices, market trends, and financial indicators. With its accurate and comprehensive data, it empowers users to analyze patterns, forecast trends, and build robust machine learning models.
Whether you're a seasoned stock market analyst or a beginner in financial data science, this dataset is tailored to meet diverse needs. It features details like stock prices, trading volume, and market capitalization, enabling a deep dive into investment opportunities and market dynamics.
For machine learning and AI enthusiasts, the Yahoo Stocks Dataset is a goldmine. It’s perfect for developing predictive models, such as stock price forecasting and sentiment analysis. The dataset's structured format ensures seamless integration into Python, R, and other analytics platforms, making data visualization and reporting effortless.
Additionally, this dataset supports long-term trend analysis, helping investors make informed decisions. It’s also an essential resource for those conducting research in algorithmic trading and portfolio management.
Key benefits include:
Download the Yahoo Stocks Dataset today and harness the power of financial data for your projects. Whether for AI, financial reporting, or trend analysis, this dataset equips you with the tools to succeed in the dynamic world of stock markets.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Cut Stock is a dataset for instance segmentation tasks - it contains Item annotations for 413 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
Contains stock prices and other details for stocks listed in NEPSE, categorized by date and stock.
All data herein were extracted by web-scraping the official website of the Nepal Stock Exchange (old website). NEPSE official website: http://www.nepalstock.com/
Company details were obtained by web-scraping the webpage at the following link. The data obtained can be found in the "companies_with_details.csv" file. http://www.nepalstock.com/company
Stock prices and other details for each day from 2012-06-03 to 2022-07-08 were obtained by web-scraping the webpage at the following link. The data obtained can be found in the "By_Date" folder. http://www.nepalstock.com/todaysprice
Python and BeautifulSoup were used to do the scraping. 2012-06-03 was used as the start date of data collection because this seems to be the oldest date for which data exist at the above link. Non-traded days have been excluded.
The data thus obtained were then reorganized by individual stock. The result can be found in the "By_Stock" folder. Note that a few filenames may not match their listed company names exactly. For example, "&" in the listed company name has been replaced with "and" in the stock's filename. Similarly, a '/' in the company name has been replaced with '_' (underscore) in the stock's filename. This was done because Kaggle does not allow '&' in filenames and macOS does not allow '/' in filenames.
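The renaming rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, the ".csv" extension, and the example company names are assumptions, not taken from the dataset or its scraping code.

```python
def stock_filename(company_name: str, extension: str = ".csv") -> str:
    """Build a file name from a listed company name.

    Applies the two substitutions described in the text:
    '&' becomes 'and' and '/' becomes '_'.
    The default '.csv' extension is an assumption.
    """
    safe = company_name.replace("&", "and").replace("/", "_")
    return safe + extension


# Hypothetical company names, for illustration only.
print(stock_filename("Example Hydro & Power Ltd."))
print(stock_filename("Example Bank Ltd/Promoter"))
```

Going the other way (from filename back to company name) is not reliably possible with this rule, since a literal "and" or "_" in a name is indistinguishable from a substituted one.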
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about stocks per day. It has 4,398 rows and is filtered where the stock is CUT. It features 3 columns: stock, and opening price.
The dataset contains a total of 25,161 rows, each row representing the stock market data for a specific company on a given date. The information collected through web scraping from www.nasdaq.com includes the stock prices and trading volumes for the companies listed, such as Apple, Starbucks, Microsoft, Cisco Systems, Qualcomm, Meta, Amazon.com, Tesla, Advanced Micro Devices, and Netflix.
Data Analysis Tasks:
1) Exploratory Data Analysis (EDA): Analyze the distribution of stock prices and volumes for each company over time. Visualize trends, seasonality, and patterns in the stock market data using line charts, bar plots, and heatmaps.
2) Correlation Analysis: Investigate the correlations between the closing prices of different companies to identify potential relationships. Calculate correlation coefficients and visualize correlation matrices.
3) Top Performers Identification: Identify the top-performing companies based on their stock price growth and trading volumes over a specific time period.
4) Market Sentiment Analysis: Perform sentiment analysis using Natural Language Processing (NLP) techniques on news headlines related to each company. Determine whether positive or negative news impacts the stock prices and volumes.
5) Volatility Analysis: Calculate the volatility of each company's stock prices using metrics like Standard Deviation or Bollinger Bands. Analyze how volatile stocks are in comparison to others.
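As a rough illustration of the correlation and volatility tasks above, the sketch below computes daily returns, their standard deviation, and a Pearson correlation using only the Python standard library. The price lists are made up for the example, and the function names are assumptions; a real analysis would load the closing prices from the dataset.

```python
import math
from statistics import mean, stdev


def daily_returns(closes):
    """Simple day-over-day returns from a list of closing prices."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]


def volatility(closes):
    """Sample standard deviation of the daily returns."""
    return stdev(daily_returns(closes))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up closing prices, not taken from the dataset.
stock_a = [100.0, 102.0, 101.0, 105.0, 107.0]
stock_b = [50.0, 51.5, 50.5, 53.0, 54.0]

print("volatility of A:", volatility(stock_a))
print("correlation of returns:",
      pearson(daily_returns(stock_a), daily_returns(stock_b)))
```

Correlating returns rather than raw prices is a design choice here: two trending price series can look highly correlated even when their day-to-day movements are unrelated.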
Machine Learning Tasks:
1) Stock Price Prediction: Use time-series forecasting models like ARIMA, SARIMA, or Prophet to predict future stock prices for a particular company. Evaluate the models' performance using metrics like Mean Squared Error (MSE) or Root Mean Squared Error (RMSE).
2) Classification of Stock Movements: Create a binary classification model to predict whether a stock will rise or fall on the next trading day. Utilize features like historical price changes, volumes, and technical indicators for the predictions. Implement classifiers such as Logistic Regression, Random Forest, or Support Vector Machines (SVM).
3) Clustering Analysis: Cluster companies based on their historical stock performance using unsupervised learning algorithms like K-means clustering. Explore if companies with similar stock price patterns belong to specific industry sectors.
4) Anomaly Detection: Detect anomalies in stock prices or trading volumes that deviate significantly from the historical trends. Use techniques like Isolation Forest or One-Class SVM for anomaly detection.
5) Reinforcement Learning for Portfolio Optimization: Formulate the stock market data as a reinforcement learning problem to optimize a portfolio's performance. Apply algorithms like Q-Learning or Deep Q-Networks (DQN) to learn the optimal trading strategy.
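Before fitting any of the models listed above, it can help to have the evaluation metric and a trivial baseline in place. The sketch below is not ARIMA, Prophet, or a classifier; it only shows, under assumed function names and made-up prices, how rise/fall labels for the next trading day might be derived and how RMSE could be computed for a naive "tomorrow equals today" forecast.

```python
import math


def direction_labels(closes):
    """1 if the next close is higher than the current one, else 0."""
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def naive_forecast(closes):
    """Persistence baseline: predict that the next close equals today's."""
    return closes[:-1]


# Made-up closing prices, not taken from the dataset.
closes = [10.0, 11.0, 10.5, 12.0]

labels = direction_labels(closes)
baseline_error = rmse(closes[1:], naive_forecast(closes))

print("labels:", labels)
print("naive baseline RMSE:", baseline_error)
```

A forecasting model is arguably only interesting if its RMSE on held-out data is lower than this kind of baseline.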
The dataset provided on Kaggle, titled "Stock Market Stars: Historical Data of Top 10 Companies," is intended for learning purposes only. The data has been gathered from public sources, specifically from web scraping www.nasdaq.com, and is presented in good faith to facilitate educational and research endeavors related to stock market analysis and data science.
It is essential to acknowledge that while we have taken reasonable measures to ensure the accuracy and reliability of the data, we do not guarantee its completeness or correctness. The information provided in this dataset may contain errors, inaccuracies, or omissions. Users are advised to use this dataset at their own risk and are responsible for verifying the data's integrity for their specific applications.
This dataset is not intended for any commercial or legal use, and any reliance on the data for financial or investment decisions is not recommended. We disclaim any responsibility or liability for any damages, losses, or consequences arising from the use of this dataset.
By accessing and utilizing this dataset on Kaggle, you agree to abide by these terms and conditions and understand that it is solely intended for educational and research purposes.
Please note that the dataset's contents, including the stock market data and company names, are subject to copyright and other proprietary rights of the respective sources. Users are advised to adhere to all applicable laws and regulations related to data usage, intellectual property, and any other relevant legal obligations.
In summary, this dataset is provided "as is" for learning purposes, without any warranties or guarantees, and users should exercise due diligence and judgment when using the data for any purpose.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Stock Charts
This dataset is a collection of a sample of images from tweets that I scraped using my Discord bot that keeps track of financial influencers on Twitter. The data consists of images that were part of tweets that mentioned a stock. This dataset can be used for a wide variety of tasks, such as image classification or feature extraction.
FinTwit Charts Collection
This dataset is part of a larger collection of datasets, scraped from Twitter and labeled by a… See the full description on the dataset page: https://huggingface.co/datasets/StephanAkkerman/stock-charts.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Producer Price Index by Industry: Cut Stock, Resawing Lumber, and Planing: Softwood Lumber, Made from Purchased Lumber, Cut Stock, and Dimension (PCU3219123219124) from May 2025 to Jul 2025 about stocks, wood, purchase, PPI, industry, price index, indexes, price, and USA.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This line chart displays stocks over time by date using the aggregation count. The data is filtered where the stock is CUT. The data is about stocks per day.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Forecast: Stocks of Purchased Aluminum New Scrap Can Stock Clippings at Other Consumers in the US 2022 - 2026
This dataset includes the daily historical stock prices for Google (GOOGL) spanning from 2020 to 2025. It features essential financial metrics such as opening and closing prices, daily highs and lows, adjusted close prices, and trading volumes. The information offers valuable insights into the stock's performance over a five-year timeframe.
Note: 1. I scraped this data from Yahoo Finance using Python code. 2. Some of the "About Data" text was generated by AI, but I have verified it.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This line chart displays closing price by date using the aggregation sum. The data is filtered where the stock is CUT. The data is about stocks per day.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Forecast: Cut Stock, Resawing Lumber and Planing Sales in the US 2024 - 2028
By 2050, estimates suggest that as much as ************ metric tons of aluminum scrap will become available in the European Union. This would amount to double the stock volume seen in 2015. The increase would result from wider adoption of circular-economy practices, including higher recycling rates.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Forecast: Boot and Shoe Cut Stock and Findings Sales in the US 2024 - 2028
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States - Producer Price Index by Commodity: Lumber and Wood Products: Hardwood Cut Stock and Dimension was 382.87700 (Index 1982=100) in May of 2025, according to the United States Federal Reserve. Historically, the index reached a record high of 382.87700 in May of 2025 and a record low of 94.80000 in December of 1980. Trading Economics provides the current actual value, a historical data chart and related indicators for this index, last updated from the United States Federal Reserve in September of 2025.
https://creativecommons.org/publicdomain/zero/1.0/
Stock market data can be interesting to analyze and, as a further incentive, strong predictive models can have a large financial payoff. The amount of financial data on the web is seemingly endless, yet a large and well-structured dataset covering a wide array of companies can be hard to come by. Here I provide a dataset with historical stock prices (last 5 years) for all companies currently found on the S&P 500 index.
The script I used to acquire all of these .csv files can be found in this GitHub repository. If you want a more up-to-date dataset in the future, the script can be used to acquire new versions of the .csv files.
The data is presented in a couple of formats to suit different individuals' needs or computational limitations. I have included files containing 5 years of stock data (in the all_stocks_5yr.csv and corresponding folder) and a smaller version of the dataset (all_stocks_1yr.csv) with only the past year's stock data for those wishing to use something more manageable in size.
The folder individual_stocks_5yr contains files of data for individual stocks, labelled by their stock ticker name. The all_stocks_5yr.csv and all_stocks_1yr.csv contain this same data, presented in merged .csv files. Depending on the intended use (graphing, modelling etc.) the user may prefer one of these given formats.
All the files have the following columns:
Date - in format: yy-mm-dd
Open - price of the stock at market open (this is NYSE data so all in USD)
High - highest price reached in the day
Low - lowest price reached in the day
Close - price of the stock at market close
Volume - number of shares traded
Name - the stock's ticker name
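A minimal sketch of reading a merged file with this column layout, using only the Python standard library, might look like the following. The header spellings (Date, Open, High, Low, Close, Volume, Name) are assumed to match the column names listed above, the two sample rows and tickers are made up, and the helper names are hypothetical; the real files may use different header capitalization or contain missing values that this sketch does not handle.

```python
import csv
import io
from collections import defaultdict

# Two made-up rows in the assumed column layout. For a real run, the
# StringIO object would be replaced by an open file handle, e.g.
# open("all_stocks_5yr.csv", newline="").
SAMPLE = """Date,Open,High,Low,Close,Volume,Name
17-01-03,10.0,10.5,9.8,10.2,1000,AAA
17-01-03,20.0,20.4,19.9,20.1,2000,BBB
"""

PRICE_COLUMNS = ("Open", "High", "Low", "Close")


def load_rows(handle):
    """Parse rows, converting prices to float and volume to int."""
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        for col in PRICE_COLUMNS:
            row[col] = float(raw[col])
        row["Volume"] = int(raw["Volume"])
        rows.append(row)
    return rows


def group_by_ticker(rows):
    """Group parsed rows by the ticker in the Name column."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["Name"]].append(row)
    return dict(grouped)


rows = load_rows(io.StringIO(SAMPLE))
by_ticker = group_by_ticker(rows)
print(sorted(by_ticker))
```

Dates are left as strings here on purpose, so the sketch does not depend on the exact date format used in the files.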
I scraped this data from Google Finance using the Python library 'pandas_datareader'. Special thanks to Kaggle, GitHub and The Market.
This dataset lends itself to some very interesting visualizations. One can look at simple things like how prices change over time, graph and compare multiple stocks at once, or generate and graph new metrics from the data provided. From these data, informative stock statistics such as volatility and moving averages can be easily calculated. The million-dollar question is: can you develop a model that can beat the market and allow you to make statistically informed trades?
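As one example of a derived metric, a simple moving average can be computed with a few lines of plain Python. This is a sketch with a made-up price list and an assumed function name; in practice the closing prices would come from one of the per-stock files.

```python
def moving_average(values, window):
    """Simple moving average; returns one value per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Made-up closing prices, not taken from the dataset.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
print(moving_average(closes, 3))  # [11.0, 12.0, 13.0]
```

The output is shorter than the input by `window - 1` entries, because the first few days do not yet have a full window of history.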
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Producer Price Index by Commodity: Lumber and Wood Products: Softwood Lumber, Made from Purchased Lumber, Cut Stock, and Dimension (WPU08110801) from May 2025 to Jul 2025 about stocks, wood, purchase, commodities, PPI, price index, indexes, price, and USA.
https://meyka.com/license
AI-powered price forecasts for CUT stock across different timeframes including weekly, monthly, yearly, and multi-year predictions.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Forecast: Hardwood Cut Stock and Dimension Sales in the US 2022 - 2026