78 datasets found

Alpha Insights: US Funds
kaggle.com
Updated Feb 12, 2024
Cite
willian oliveira gibin (2024). Alpha Insights: US Funds [Dataset]. http://doi.org/10.34740/kaggle/dsv/7614015
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.34740/kaggle/dsv/7614015
Dataset updated
Feb 12, 2024
Dataset provided by
Kaggle
Authors
willian oliveira gibin
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
US Funds Dataset: Unlocking Insights for Informed Investment Decisions

Exchange-Traded Funds (ETFs) have gained significant popularity in recent years as a low-cost alternative to Mutual Funds. This dataset, compiled from Yahoo Finance, offers a comprehensive overview of the US funds market, encompassing 23,783 Mutual Funds and 2,310 ETFs.

Data

The dataset provides a wealth of information on each fund, including:

General fund aspects: total net assets, fund family, inception date, expense ratios, and more.
Portfolio indicators: cash allocation, sector weightings, holdings diversification, and other key metrics.
Historical returns: year-to-date, 1-year, 3-year, and other performance data for different time periods.
Financial ratios: price/earnings ratio, Treynor and Sharpe ratios, alpha, beta, and ESG scores.

Applications

This dataset can be leveraged by investors, researchers, and financial professionals for a variety of purposes, including:

Investment analysis: comparing the performance and characteristics of Mutual Funds and ETFs to make informed investment decisions.
Portfolio construction: using the data to build diversified portfolios that align with investment goals and risk tolerance.
Research and analysis: studying market trends, fund behavior, and other factors to gain insights into the US funds market.

Inspiration and Updates

The dataset was inspired by the surge of interest in ETFs in 2017 and the subsequent shift away from Mutual Funds. The data is sourced from Yahoo Finance, a publicly available website, ensuring transparency and accessibility. Updates are planned every 1-2 semesters to keep the data current and relevant.

Conclusion

This comprehensive dataset offers a valuable resource for anyone seeking to gain a deeper understanding of the US funds market. By providing detailed information on a wide range of funds, the dataset empowers investors to make informed decisions and build successful investment portfolios.

Access the dataset and unlock the insights it offers to make informed investment decisions.
Stock market prediction
kaggle.com
Updated Aug 17, 2023
Cite
Luis Andrés García (2023). Stock market prediction [Dataset]. https://www.kaggle.com/datasets/luisandresgarcia/stock-market-prediction
Dataset updated
Aug 17, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Luis Andrés García
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
PURPOSE (possible uses)

Non-professional investors often try to find an interesting stock among those in an index (such as the Standard and Poor's 500, Nasdaq, etc.). They need only one company, the best, and they don't want to fail (perform poorly). So, the metric to optimize, called accuracy here (the ratio is more commonly known as precision), is:

Accuracy = True Positives / (True Positives + False Positives)

And the predictive model can be a binary classifier.
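
As a minimal sketch of the ratio above, the following computes true positives divided by the sum of true positives and false positives for a binary classifier. The function name and the sample labels are illustrative assumptions, not part of the dataset; returning 0.0 when there are no positive predictions is a convention chosen only for this sketch.

```python
def precision(y_true, y_pred):
    """True positives / (true positives + false positives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fp == 0:
        # No positive predictions at all; convention assumed for this sketch.
        return 0.0
    return tp / (tp + fp)


# Illustrative labels only (1 = "price rose enough", 0 = otherwise).
y_true = [1, 0, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1]
score = precision(y_true, y_pred)
print(round(score, 3))
```

Missed opportunities (false negatives) do not affect this ratio, which matches the stated goal of picking a single stock and avoiding a poor pick.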

The data covers the price and volume of shares of 31 NASDAQ companies in the year 2022.

Context

Every data set I found to predict a stock price (investing) aims to find the price for the next day, and only for that stock. But in practical terms, people like to find the best stocks to buy from an index and wait a few days hoping to get an increase in the price of this investment.

Content

Rows are grouped by companies and their age (newest to oldest) on a common date. The first column is the company. The following are the age, market, date (separated by year, month, day, hour, minute), share volume, various traditional prices of that share (close, open, high...), some price and volume statistics and target. The target is mainly defined as 1 when the closing price increases by at least 5% in 5 days (open market days). The target is 0 in any other case.
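
The target rule described above could be sketched roughly as follows. This is an assumption-laden illustration: it compares each close with the close exactly 5 trading rows later, while the description says the target is only "mainly" defined this way, so the author's notebook may differ. The function name and sample prices are hypothetical.

```python
def label_targets(closes, horizon=5, threshold=0.05):
    """Return 1 where the close `horizon` rows ahead is at least
    `threshold` higher than the current close, 0 otherwise, and None
    where there is not enough future data to decide."""
    labels = []
    for i, price in enumerate(closes):
        j = i + horizon
        if j >= len(closes):
            labels.append(None)  # the last `horizon` rows cannot be labeled
        else:
            labels.append(1 if closes[j] >= price * (1 + threshold) else 0)
    return labels


# Illustrative closing prices for one company, oldest first.
closes = [100.0, 101.0, 99.0, 102.0, 104.0, 106.0, 105.0, 103.0]
labels = label_targets(closes)
print(labels)
```

Note that the dataset itself orders rows from newest to oldest, so a real series would need to be sorted oldest-first before applying a rule like this.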

Complex features and target were made by executing: https://www.kaggle.com/code/luisandresgarcia/202307

Thanks

Many thanks to everyone who participates in scientific papers and Kaggle notebooks related to financial investment.
US Stock Valuation Analysis
kaggle.com
Updated Dec 1, 2024
Cite
Keith Scully (2024). US Stock Valuation Analysis [Dataset]. https://www.kaggle.com/datasets/keithscully/us-stock-valuation-analysis
Dataset updated
Dec 1, 2024
Dataset provided by
Kaggle
Authors
Keith Scully
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset provides financial accounting data for US company stocks along with per-share earnings and price metrics, liquidity ratios, management efficiency measures, margins and stock price data.

Companies are predominantly from the S&P 500 index, with a small number of additions. The accounting data is on a fiscal-year basis, but most companies have had their stock price sampled up to 3 times in any given year. The time period covers the 10 most recent fiscal years, either 2013-2023 or 2014-2024 depending on when a company's fiscal year ends.

Data was collected from multiple sources, with some fields calculated from various other data points collected. There is no pre-defined target variable, and no directed goal to achieve using this dataset. Please explore and take your own unique approach in terms of how this data can be used, supplementing it with additional data if necessary.

This dataset was created as part of a college research project focused on stock valuation using machine learning, and I am sharing it here so that others may also benefit. I do not intend to maintain this dataset over time. Regardless, I believe it will be a valuable dataset for anyone looking to carry out research or to learn more about stock investing using machine learning or other forms of analytics.
The motives and methods of middle-class international property investors -...
b2find.eudat.eu
Updated Mar 30, 2014
Cite
(2014). The motives and methods of middle-class international property investors - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/d995594f-daf7-59cf-939e-0cf148f2127d
Dataset updated
Mar 30, 2014
Description
This data collection consists of 18 interview transcripts meant to explore the rationales and methods by which investors in Hong Kong buy properties in the UK. The life and impact of the residential choices of the 'super rich' have been a major strand in research by the research team. This work advanced the proposition that the upper tier of income groups living in cities tend to exploit particular forms of service provision (such as education, cultural life and personal services), are largely distanced from the mundane flow of social life in urban areas and tend to be withdrawn from the civic life of cities more generally. Some of this work is underpinned by the literature on, for example, gated communities, but it has surprisingly been under-used as the guiding framework for close empirical work in affluent neighbourhoods, perhaps largely as a result of the perceived difficulty of working with such individuals. This project will allow us to generate insights into how super-rich neighbourhoods operate, how people come to live there and the social and economic tensions and trade-offs that exist as such processes are allowed to run. As many people question the role and value of wealth and identify inequality as a growing social problem, this research will feed into public conversations and policymaker concerns about how socially vital cities can be maintained when capital investment may undermine such objectives on one level (the creation of neighbourhoods that are both exclusive and often 'abandoned' for large parts of the year), while potentially fulfilling broader ambitions at others (over tax receipts for example).
Social research has tended not to focus on the super-rich, largely because they are hard to locate, and even harder to collaborate with in research. In this project we seek to address these concerns by focusing extensive research effort on the question of where and how the super-rich live and invest in the property markets of the cities of Hong Kong and London. We see these cities as exemplary in assisting in the construction of further insights and knowledge in how the super-rich seek residential investment opportunities, how they live there when they are 'at home' in such residences and how these patterns of investment shape the social, political and economic life of these cities more broadly. Given that the super-rich make such decisions on the basis of tax incentives and the attraction of major cultural infrastructure (such as galleries and theatre), we have proposed a program of research capable of offering an inside account of the practices that go to make up these investment patterns, including processes of searching for suitable property, its financing, the kinds of property deemed to be suitable and an analysis of how estate agents and city authorities seek to capitalise on and retain the potentially highly mobile investment by the super-rich. In economic terms the life and functioning of rich neighbourhood spaces appears intuitively important. For example, attractive and safe spaces for captains of industry, senior figures in political and non-government organizations are often regarded as major markers of urban vitality and the foundation of social networks that may make up the broader glue of civic and political society. Yet we know very little about how such neighbourhoods operate, who they attract and how they are linked to other cities and their neighbourhoods globally. Our aim in this research is to grapple with what might be described as the 'problem' of these super-rich neighbourhoods - sometimes called the 'alpha territory' - and undertake research that will help us to understand more about the advantages and disadvantages of these kinds of property investment.
The research was carried out using semi-structured interviews and participant observation at property fairs and development sites in Hong Kong and different cities in the UK. Moreover, semi-structured interviews were conducted to explore the rationales and methods by which investors in Hong Kong buy properties in the UK. Participants were recruited using searches for relevant key actors as well as accessing personal and professional networks that enabled snowballing techniques to elicit further contacts. Interviews were conducted with individual investors, local government officials, planning officers, inward investment agencies, city government officials and estate agents. Interviews were conducted in both English and Cantonese.
400k NYSE random investments + financial ratios
kaggle.com
Updated Jan 18, 2021
Cite
Imanol Recio Erquicia (2021). 400k NYSE random investments + financial ratios [Dataset]. https://www.kaggle.com/imanolrecioerquicia/400k-nyse-random-investments-financial-ratios/code
Dataset updated
Jan 18, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Imanol Recio Erquicia
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

This dataset was created for the project "AI Learn to invest" for SaturdaysAI - Euskadi 1st edition. The project can be found at https://github.com/ImanolR87/AI-Learn-to-invest

Content

More than 400,000 random investments were created with data from the last 10 years of the NYSE market. Financial ratios and volatilities were calculated and added to the random investments dataset.

Financial ratios included: ESG Ranking, ROA, ROE, Net Yearly Income, PB, PE, PS, EPS, Sharpe

Acknowledgements

I thank SaturdaysAI for pushing me to fall in love with data science.

Inspiration

Our inspiration was to find an answer to why young people don't invest more in stock exchange markets.
AI corporate investment worldwide 2015-2022
statista.com
Updated Jun 30, 2025
Cite
Statista (2025). AI corporate investment worldwide 2015-2022 [Dataset]. https://www.statista.com/statistics/941137/ai-investment-and-funding-worldwide/
Dataset updated
Jun 30, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
In 2022, the global total corporate investment in artificial intelligence (AI) reached almost ** billion U.S. dollars, a slight decrease from the previous year. In 2018, the yearly investment in AI saw a slight downturn, but that was only temporary. Private investments account for a bulk of total AI corporate investment. AI investment has increased more than ******* since 2016, a staggering growth in any market. It is a testament to the importance of the development of AI around the world.

What is Artificial Intelligence (AI)?

Artificial intelligence, once the subject of people’s imaginations and the main plot of science fiction movies for decades, is no longer a piece of fiction, but rather commonplace in people’s daily lives whether they realize it or not. AI refers to the ability of a computer or machine to imitate the capacities of the human brain, which often learns from previous experiences to understand and respond to language, decisions, and problems. These AI capabilities, such as computer vision and conversational interfaces, have become embedded throughout various industries’ standard business processes.

AI investment and startups

The global AI market, valued at ***** billion U.S. dollars as of 2023, continues to grow driven by the influx of investments it receives. This is a rapidly growing market, looking to expand from billions to trillions of U.S. dollars in market size in the coming years. From 2020 to 2022, investment in startups globally, and in particular AI startups, increased by **** billion U.S. dollars, nearly double its previous investments, with much of it coming from private capital from U.S. companies. The most recent top-funded AI businesses are all machine learning and chatbot companies, focusing on human interface with machines.
Debt And Money Market Dataset
kaggle.com
Updated Jan 11, 2023
Cite
Srushti Dongre (2023). Debt And Money Market Dataset [Dataset]. https://www.kaggle.com/datasets/srustidongre/debtandmoneymarket
Dataset updated
Jan 11, 2023
Dataset provided by
Kaggle
Authors
Srushti Dongre
Description
The data set focuses on the Debt and Money markets. The Fixed Income Market includes the debt and money markets. Investments in fixed-income securities are traded on the fixed income market. The fixed-income market comprises trading in securities such as Treasury Bills with various maturities, floating rate bonds, perpetual bonds, commercial paper, certificates of deposit, STRIPS, and debentures.

The information was gathered from the amfiindia (Association of Mutual Funds in India) website between June 2021 and June 2022.

Cleaning was necessary because the data came from the aforementioned website's data dump. The data has been partially cleaned, but much more cleaning is required.

Mutual fund Dataset

kaggle.com

Updated Sep 18, 2024

Cite

Alok Pandey (2024). Mutual fund Dataset [Dataset]. https://www.kaggle.com/datasets/aloktantrik/mutual-fund-nav-data

Dataset updated

Sep 18, 2024

Dataset provided by

Kaggle

Authors

Alok Pandey

License

https://cdla.io/permissive-1-0/

Description

Mutual Fund Dataset

Introduction

The Mutual Fund Dataset provides key information about various mutual fund schemes managed by multiple Asset Management Companies (AMCs). The dataset includes critical details such as fund ratings, returns across different time periods, NAV (Net Asset Value), and minimum investment requirements. It also contains information about the fund managers responsible for managing the portfolios.

This data can be particularly valuable for financial analysts, individual investors, and researchers seeking to evaluate the performance and characteristics of different mutual funds.

Dataset Overview

The dataset comprises the following major categories:

AMC (Asset Management Company): The institution managing the mutual fund.
Fund Ratings: Ratings from reputable organizations like Morningstar and Value Research.
Returns: Historical performance of mutual funds over 1 month, 1 year, and 3 years.
NAV (Net Asset Value): The per-unit price of the mutual fund.
Investment Requirements: Information on the minimum investment needed to participate in a fund.
Fund Manager: Details of the person responsible for managing the fund’s investment strategy.

Data Fields Description

Column Name	Description
AMC	Name of the Asset Management Company (e.g., 'Aditya Birla Sun Life Mutual Fund', 'ICICI Prudential Mutual Fund').
Fund Name	The specific name of the mutual fund scheme. This field may have some missing data.
Morning Star Rating	The star rating provided by Morningstar, evaluating the fund's historical performance.
Value Research Rating	The rating assigned by Value Research, another trusted source for evaluating mutual funds.
1 Month Return	The return on investment (%) for the mutual fund over the last month.
NAV (Net Asset Value)	The value per unit of the mutual fund, calculated as the market value of all assets minus liabilities, divided by the number of outstanding units.
1 Year Return	The return on investment (%) for the mutual fund over the last year.
3 Year Return	The return on investment (%) for the mutual fund over the last three years.
Minimum Investment	The minimum amount required to invest in the mutual fund (e.g., Rs.100, Rs.500).
Fund Manager	The name of the fund manager in charge of the mutual fund's strategy (e.g., 'Abhishek Bisen').
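
The NAV definition in the table above (assets minus liabilities, divided by outstanding units) can be written out as a short calculation. This is only a sketch of that formula; the function name and the figures are hypothetical and do not come from the dataset.

```python
def nav_per_unit(total_assets, total_liabilities, units_outstanding):
    """Net asset value per unit: (assets - liabilities) / units."""
    if units_outstanding <= 0:
        raise ValueError("units_outstanding must be positive")
    return (total_assets - total_liabilities) / units_outstanding


# Hypothetical fund: Rs. 1,250,000 in assets, Rs. 50,000 in liabilities,
# and 100,000 units outstanding.
nav = nav_per_unit(1_250_000.0, 50_000.0, 100_000)
print(nav)
```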

Usage Guidelines

This dataset can be used for:

Comparative Analysis: Investors and analysts can compare mutual funds based on their returns, minimum investment requirements, and ratings from reputed agencies like Morningstar and Value Research.
Investment Strategy: The dataset can help in identifying high-performing funds based on past returns and other key factors, assisting in portfolio diversification or fund selection.
Financial Research: Researchers can analyze trends across the mutual fund industry, assess risk versus reward, or develop prediction models based on fund performance data.

Handling Missing or Anomalous Data

Missing Data: Some columns, such as Fund Name, may have missing or incomplete data. Consider filtering out rows with insufficient data based on your use case.
NA Values: Fields such as Morning Star Rating, Value Research Rating, and Fund Manager may contain 'NA' values, indicating unavailability or lack of a rating for certain funds.
Other Category: Some columns include data points marked as "Other" to represent a collective...

Financial wealth: wealth in Great Britain
ons.gov.uk
cy.ons.gov.uk
xlsx
Updated Jan 24, 2025
Cite
Office for National Statistics (2025). Financial wealth: wealth in Great Britain [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/personalandhouseholdfinances/incomeandwealth/datasets/financialwealthwealthingreatbritain
Available download formats: xlsx
Dataset updated
Jan 24, 2025
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
The values of any financial assets held including both formal investments, such as bank or building society current or saving accounts, investment vehicles such as Individual Savings Accounts, endowments, stocks and shares, and informal savings.
Dataset: WisdomTree Cloud Computing Fund (WCLD)...
kaggle.com
Updated Jun 21, 2024
Cite
Nitiraj Kulkarni (2024). Dataset: WisdomTree Cloud Computing Fund (WCLD)... [Dataset]. https://www.kaggle.com/datasets/nitirajkulkarni/wcld-stock-performance
Dataset updated
Jun 21, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Nitiraj Kulkarni
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset provides historical stock market performance data for specific companies. It enables users to analyze and understand the past trends and fluctuations in stock prices over time. This information can be utilized for various purposes such as investment analysis, financial research, and market trend forecasting.

Stock Market Dataset for August 2025

kaggle.com

Updated Aug 7, 2025

Cite

Kshitij Saini (2025). Stock Market Dataset for August 2025 [Dataset]. https://www.kaggle.com/datasets/kshitijsaini121/stock-market-prediction-for-july-2025-dataset

Dataset updated

Aug 7, 2025

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Kshitij Saini

License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Dataset Overview

This dataset contains comprehensive stock market data for June 2025, capturing daily trading information across multiple companies and sectors. The dataset represents a substantial collection of market data with detailed financial metrics and trading statistics.

Basic Dataset Information

Time Period: June 1-21, 2025 (21 trading days)
Total Records: Approximately 11,600+ entries
Companies Covered: 500+ unique stocks
Data Type: Daily stock market trading data with fundamental metrics

Dataset Information Table

Dataset Overview

Attribute	Details
Dataset Name	Stock Market Data - June 2025
File Format	CSV
File Size	~2.5 MB
Number of Records	11,600+
Number of Features	13
Time Period	June 1-21, 2025

Data Schema

Column Name	Data Type	Description	Example Values
Date	Date	Trading date in DD-MM-YYYY format	01-06-2025, 02-06-2025
Ticker	String	Stock ticker symbol (3-4 characters)	AAPL, GOOGL, TSLA, SLH
Open Price	Float	Opening price of the stock	34.92, 206.5, 125.1
Close Price	Float	Closing price of the stock	34.53, 208.45, 124.03
High Price	Float	Highest price during the trading day	35.22, 210.51, 127.4
Low Price	Float	Lowest price during the trading day	34.38, 205.12, 121.77
Volume Traded	Integer	Number of shares traded	2,966,611, 1,658,738
Market Cap	Float	Market capitalization in dollars	57,381,363,838.88
PE Ratio	Float	Price-to-Earnings ratio	29.63, 13.03, 29.19
Dividend Yield	Float	Dividend yield percentage	2.85, 2.73, 2.64
EPS	Float	Earnings per Share	1.17, 16.0, 4.25
52 Week High	Float	Highest price in the last 52 weeks	39.39, 227.38, 138.35
52 Week Low	Float	Lowest price in the last 52 weeks	28.44, 136.79, 100.69
Sector	String	Industry sector classification	Industrials, Energy, Healthcare
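
A minimal sketch of reading rows that follow the schema above, with the DD-MM-YYYY date parsed into a proper date. The inline sample is illustrative only (values borrowed from the example column, not real rows), and the assumption that the CSV header uses exactly these column names is unverified. Thousands separators in the volume field are stripped defensively, since the example values show them.

```python
import csv
import io
from datetime import datetime

# Illustrative sample; the real file's header and values are assumed.
SAMPLE = """Date,Ticker,Open Price,Close Price,Volume Traded
01-06-2025,AAPL,34.92,34.53,2966611
02-06-2025,AAPL,206.5,208.45,"1,658,738"
"""


def parse_row(row):
    """Convert one CSV row (a dict of strings) into typed values."""
    return {
        "date": datetime.strptime(row["Date"], "%d-%m-%Y").date(),
        "ticker": row["Ticker"],
        "open": float(row["Open Price"]),
        "close": float(row["Close Price"]),
        "volume": int(row["Volume Traded"].replace(",", "")),
    }


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for r in rows:
    print(r["date"].isoformat(), r["ticker"], r["close"], r["volume"])
```

Parsing with an explicit format string avoids the day/month ambiguity that automatic date inference can introduce for values such as 01-06-2025.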

Market Capitalization Tiers

Mega Cap (>$1T): 6 companies (AAPL, MSFT, NVDA, AMZN, GOOGL, META)
Large Cap ($200B-$1T): 28 companies
Mid Cap ($50B-$200B): 47 companies

Key Market Characteristics

Price Volatility by Sector

Technology: Higher volatility (±3.5% daily range)
Energy: High volatility (±4.0% daily range)
Utilities: Lower volatility (±1.5% daily range)
Healthcare/Financials: Moderate volatility (±2.5% daily range)

Trading Volume Patterns

Mega Cap: 25M - 90M shares daily
Large Cap: 8M - 35M shares daily
Mid Cap: 2M - 15M shares daily
Small Cap: 500K - 5M shares daily

Financial Metrics Distribution

Average P/E Ratio: 25.9 (market-wide)
Average Dividend Yield: 1.25%
Price Range: $19 (T) to $3,850 (BKNG)
EPS Range: $1.50 to $70.00

Notable Market Features

High-Value Stocks

BKNG (Booking Holdings): $3,650-$3,850 range
AVGO (Broadcom): $1,650-$1,750 range
REGN (Regeneron): $1,050-$1,150 range
LLY (Eli Lilly): $920-$980 range

High-Dividend Yielders

T (AT&T): 7.1% dividend yield
VZ (Verizon): 6.2% dividend yield
PFE (Pfizer): 5.8% dividend yield

Growth & Technology Leaders

NOW (ServiceNow): P/E ratio of 85
NVDA (NVIDIA): P/E ratio of 45
TSLA (Tesla): P/E ratio of 55

Data Quality & Realism Features

✅ Authentic Price Ranges: Based on realistic 2025 market projections
✅ Sector-Appropriate Volatility: Different volatility patterns by industry
✅ Correlated Metrics: P/E ratios, dividend yields, and EPS align with market caps
✅ Realistic Trading Volumes: Volume scaled appropriately to market cap
✅ Temporal Consistency: Logical price progression over 53-day period
✅ Market Cap Accuracy: Daily fluctuations reflect actual price movements

Intended Use Cases

Financial Analysis & Modeling: Portfolio optimization, risk assessment
Machine Learning Applications: Predictive modeling, algorithmic trading
Educational Purposes: Finance courses, data science training
Algorithm Development: Backtesting trading strategies
Market Research: Sector analysis, correlation studies
Visualization Projects: Interactive dashboards, market trend analysis

This dataset provides a comprehensive foundation for quantitative finance research, offering both breadth across market sectors and depth in daily trading dynamics while maintaining statistical realism throughout the observation period...

Dataset: Momentus Inc. (MNTS) Stock Performance
kaggle.com
Updated Jun 21, 2024
Cite
Nitiraj Kulkarni (2024). Dataset: Momentus Inc. (MNTS) Stock Performance [Dataset]. https://www.kaggle.com/datasets/nitirajkulkarni/mnts-stock-performance
Dataset updated
Jun 21, 2024
Dataset provided by
Kaggle
Authors
Nitiraj Kulkarni
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset provides historical stock market performance data for specific companies. It enables users to analyze and understand the past trends and fluctuations in stock prices over time. This information can be utilized for various purposes such as investment analysis, financial research, and market trend forecasting.
Stock Market: Historical Data of Top 10 Companies
kaggle.com
Updated Jul 18, 2023
Cite
Khushi Pitroda (2023). Stock Market: Historical Data of Top 10 Companies [Dataset]. https://www.kaggle.com/datasets/khushipitroda/stock-market-historical-data-of-top-10-companies/data
Dataset updated
Jul 18, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Khushi Pitroda
Description
The dataset contains a total of 25,161 rows, each row representing the stock market data for a specific company on a given date. The information collected through web scraping from www.nasdaq.com includes the stock prices and trading volumes for the companies listed, such as Apple, Starbucks, Microsoft, Cisco Systems, Qualcomm, Meta, Amazon.com, Tesla, Advanced Micro Devices, and Netflix.

Data Analysis Tasks:

1) Exploratory Data Analysis (EDA): Analyze the distribution of stock prices and volumes for each company over time. Visualize trends, seasonality, and patterns in the stock market data using line charts, bar plots, and heatmaps.

2) Correlation Analysis: Investigate the correlations between the closing prices of different companies to identify potential relationships. Calculate correlation coefficients and visualize correlation matrices.

3) Top Performers Identification: Identify the top-performing companies based on their stock price growth and trading volumes over a specific time period.

4) Market Sentiment Analysis: Perform sentiment analysis using Natural Language Processing (NLP) techniques on news headlines related to each company. Determine whether positive or negative news impacts the stock prices and volumes.

5) Volatility Analysis: Calculate the volatility of each company's stock prices using metrics like Standard Deviation or Bollinger Bands. Analyze how volatile stocks are in comparison to others.
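
As one possible starting point for the volatility task listed above, the sketch below computes the standard deviation of daily returns and simple Bollinger Bands (a moving average plus and minus a multiple of the rolling standard deviation). The function names, the window of 3, and the sample closes are illustrative assumptions; a 20-day window with 2 standard deviations is the usual default, and this sketch uses the population standard deviation where some tools use the sample version.

```python
from statistics import mean, pstdev, stdev


def daily_returns(closes):
    """Simple day-over-day returns, oldest close first."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (lower, middle, upper) for each full rolling window."""
    bands = []
    for i in range(window - 1, len(closes)):
        chunk = closes[i - window + 1 : i + 1]
        mid = mean(chunk)
        spread = num_std * pstdev(chunk)
        bands.append((mid - spread, mid, mid + spread))
    return bands


# Illustrative closing prices for one company.
closes = [10.0, 10.5, 10.2, 10.8, 11.0, 10.7]
volatility = stdev(daily_returns(closes))
bands = bollinger_bands(closes, window=3)
print(round(volatility, 4), len(bands))
```

Comparing the `volatility` figure across companies gives a rough ranking of how volatile each stock is relative to the others.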

Machine Learning Tasks:

1) Stock Price Prediction: Use time-series forecasting models like ARIMA, SARIMA, or Prophet to predict future stock prices for a particular company. Evaluate the models' performance using metrics like Mean Squared Error (MSE) or Root Mean Squared Error (RMSE).

2) Classification of Stock Movements: Create a binary classification model to predict whether a stock will rise or fall on the next trading day. Utilize features like historical price changes, volumes, and technical indicators for the predictions. Implement classifiers such as Logistic Regression, Random Forest, or Support Vector Machines (SVM).

3) Clustering Analysis: Cluster companies based on their historical stock performance using unsupervised learning algorithms like K-means clustering. Explore if companies with similar stock price patterns belong to specific industry sectors.

4) Anomaly Detection: Detect anomalies in stock prices or trading volumes that deviate significantly from the historical trends. Use techniques like Isolation Forest or One-Class SVM for anomaly detection.

5) Reinforcement Learning for Portfolio Optimization: Formulate the stock market data as a reinforcement learning problem to optimize a portfolio's performance. Apply algorithms like Q-Learning or Deep Q-Networks (DQN) to learn the optimal trading strategy.
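
The MSE and RMSE metrics mentioned in the price prediction task can be sketched as below. To keep the example self-contained it scores a naive baseline (predict that tomorrow's close equals today's) rather than ARIMA, SARIMA, or Prophet; that baseline, the function names, and the sample prices are assumptions made for illustration.

```python
import math


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the prices."""
    return math.sqrt(mse(actual, predicted))


# Illustrative closing prices, oldest first.
closes = [100.0, 102.0, 101.0, 105.0]
predicted = closes[:-1]  # naive forecast: next close equals current close
actual = closes[1:]
print(mse(actual, predicted), round(rmse(actual, predicted), 4))
```

A forecasting model is only worth keeping if its RMSE on held-out data beats a simple baseline like this one.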

The dataset provided on Kaggle, titled "Stock Market Stars: Historical Data of Top 10 Companies," is intended for learning purposes only. The data was gathered from public sources, specifically by scraping www.nasdaq.com, and is presented in good faith to facilitate educational and research endeavors related to stock market analysis and data science.

It is essential to acknowledge that while we have taken reasonable measures to ensure the accuracy and reliability of the data, we do not guarantee its completeness or correctness. The information provided in this dataset may contain errors, inaccuracies, or omissions. Users are advised to use this dataset at their own risk and are responsible for verifying the data's integrity for their specific applications.

This dataset is not intended for any commercial or legal use, and any reliance on the data for financial or investment decisions is not recommended. We disclaim any responsibility or liability for any damages, losses, or consequences arising from the use of this dataset.

By accessing and utilizing this dataset on Kaggle, you agree to abide by these terms and conditions and understand that it is solely intended for educational and research purposes.

Please note that the dataset's contents, including the stock market data and company names, are subject to copyright and other proprietary rights of the respective sources. Users are advised to adhere to all applicable laws and regulations related to data usage, intellectual property, and any other relevant legal obligations.

In summary, this dataset is provided "as is" for learning purposes, without any warranties or guarantees, and users should exercise due diligence and judgment when using the data for any purpose.
Full Nasdaq Stocks Data
kaggle.com
Updated May 31, 2020
Cite
fjgonzalez (2020). Full Nasdaq Stocks Data [Dataset]. https://www.kaggle.com/gonzalezfrancisco/full-nasdaq-stocks-data
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
May 31, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
fjgonzalez
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Predicting the stock market is one of the most commonly performed projects when someone is learning about ML and Data Science. After all, who wouldn't want to delegate the task of picking stocks to a model and reap the rewards for themselves? However, one of the most difficult and tedious steps to predict what stocks to invest in is actually gathering the data to use. There are so many options and it is important to get sufficient information for each. But, what if you can skip this step and just download a dataset that has all that information easily available for you? Look no further as this is the answer to this problem.

Content

This dataset contains information on 4447 stocks traded under Nasdaq across various exchanges. One file contains information for all 4447 stocks but also has several null fields, which is why I labeled it full_financial_stocks_raw.csv; the values inside its rows have only minimal modifications. The second file, dividend_stocks_only.csv, is still a raw-ish dataset, but it contains only stocks that pay dividends to their shareholders. Interestingly, dividend-paying stocks seem to have more information available about them, which explains why this file has significantly fewer rows with null values.
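To compare how many rows contain null fields in each file, a small standard-library check along these lines could be used. This is only a sketch: it assumes nulls appear as empty or missing CSV fields, and the column names in the in-memory sample are hypothetical rather than the dataset's actual schema.

```python
import csv
import io


def _is_null(value):
    """Treat missing or blank string fields as null (an assumption)."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def null_row_summary(csv_file):
    """Return (total_rows, rows_with_at_least_one_null) for an open CSV file."""
    total = 0
    with_nulls = 0
    for row in csv.DictReader(csv_file):
        total += 1
        if any(_is_null(value) for value in row.values()):
            with_nulls += 1
    return total, with_nulls


# Hypothetical sample standing in for one of the CSV files.
sample = io.StringIO(
    "symbol,name,dividend_yield\n"
    "AAA,Alpha Corp,0.02\n"
    "BBB,Beta Inc,\n"
    "CCC,,0.01\n"
)
total, with_nulls = null_row_summary(sample)
```

For the real files, the same function can be applied to a handle opened with `open("full_financial_stocks_raw.csv", newline="")` and likewise for `dividend_stocks_only.csv`.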

Update: In the next 24 hours, I will be uploading an optimized, feature-engineered dataset that has fewer columns overall and fewer rows with null values. This dataset is intended to be a fully cleaned option to directly feed into ML/DL models.

Acknowledgements

I would like to thank the sources where I obtained my data, which are the FTP Nasdaq Trader website and the Yahoo Finance API.

Inspiration

Analyzing the stock market is one of the most intriguing endeavors I could think of, as the ways it can be influenced are so broad and distinct from one another. A news article can influence how investors view a particular company, social media can directly move a company's share price, and there are numerous calculations and formulas that can show which stocks are worth investing in.
Stock Market Dataset
kaggle.com
Updated Jul 9, 2025
Cite
jasineri (2025). Stock Market Dataset [Dataset]. https://www.kaggle.com/datasets/jasineri/stock-market-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jul 9, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
jasineri
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Disclaimer: Educational Purposes Only

The financial and International Securities Identification Number (ISIN) data listed on this platform is provided solely for educational purposes. The information is intended to serve as general guidance and does not constitute financial advice, an endorsement, or a recommendation for the purchase or sale of any securities.

While we strive to ensure the accuracy and timeliness of the information presented, we make no representations or warranties, express or implied, regarding the completeness, accuracy, reliability, suitability, or availability of the provided data. Users are encouraged to independently verify any information obtained from this platform before making any investment decisions.

This platform and its operators are not responsible for any errors, omissions, or inaccuracies in the provided data, nor for any actions taken in reliance on such information. Users are strongly advised to conduct thorough research and seek the advice of qualified financial professionals before making any investment decisions.

The use of International Securities Identification Numbers (ISINs) and other financial data is subject to various regulations and licensing agreements. Users are responsible for complying with all applicable laws and respecting any terms and conditions associated with the use of such data.

By accessing and using this platform, users acknowledge and agree that they are doing so at their own risk and discretion. This educational content is not a substitute for professional financial advice, and users should consult with qualified professionals for specific guidance tailored to their individual circumstances.
Financial Statement Data Sets
kaggle.com
Updated Jul 4, 2025
Cite
Vadim Vanak (2025). Financial Statement Data Sets [Dataset]. https://www.kaggle.com/datasets/vadimvanak/company-facts-2
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jul 4, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Vadim Vanak
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset offers a detailed collection of US-GAAP financial data extracted from the financial statements of exchange-listed U.S. companies, as submitted to the U.S. Securities and Exchange Commission (SEC) via the EDGAR database. Covering filings from January 2009 onwards, this dataset provides key financial figures reported by companies in accordance with U.S. Generally Accepted Accounting Principles (GAAP).

Dataset Features:

Data Scope: The dataset is restricted to figures reported under US-GAAP standards, with the exception of EntityCommonStockSharesOutstanding and EntityPublicFloat.

Currency and Units: The dataset exclusively includes figures reported in USD or shares, ensuring uniformity and comparability. It excludes ratios and non-financial metrics to maintain focus on financial data.

Company Selection: The dataset is limited to companies with U.S. exchange tickers, providing a concentrated analysis of publicly traded firms within the United States.

Submission Types: The dataset only incorporates data from 10-Q, 10-K, 10-Q/A, and 10-K/A filings, ensuring consistency in the type of financial reports analyzed.

Data Sources and Extraction:

This dataset primarily relies on the SEC's Financial Statement Data Sets and EDGAR APIs:

- SEC Financial Statement Data Sets
- EDGAR Application Programming Interfaces

In instances where specific figures were missing from these sources, data was directly extracted from the companies' financial statements to ensure completeness.

Please note that the dataset presents financial figures exactly as reported by the companies, which may occasionally include errors. A common issue involves incorrect reporting of scaling factors in the XBRL format. XBRL supports two tag attributes related to scaling: 'decimals' and 'scale.' The 'decimals' attribute indicates the number of significant decimal places but does not affect the actual value of the figure, while the 'scale' attribute adjusts the value by a specific factor.

However, in thousands of instances companies have incorrectly used the 'decimals' attribute (e.g., 'decimals="-6"') under the mistaken assumption that it controls scaling. It does not, and as a result some figures may be inaccurately scaled. This dataset does not attempt to detect or correct such errors; it aims to reflect the data precisely as reported by the companies. A future version of the dataset may be introduced to address and correct these issues.
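The difference between the two attributes can be illustrated with a small sketch. It assumes, as in Inline XBRL, that 'scale' is a power-of-ten exponent applied to the displayed number, while 'decimals' only describes precision and is therefore ignored when computing the value. The function name and the figures are hypothetical.

```python
from decimal import Decimal


def reported_value(displayed: str, scale: int = 0) -> Decimal:
    """Value implied by a displayed figure and its 'scale' exponent.

    'decimals' is deliberately not a parameter: it states how precise the
    figure is, not how large it is.
    """
    return Decimal(displayed.replace(",", "")) * (Decimal(10) ** scale)


# Correct tagging: "1,250" shown in millions, tagged with scale="6".
correct = reported_value("1,250", scale=6)

# Mistaken tagging: scale omitted, decimals="-6" used in the hope of scaling.
# Because decimals does not change the value, the figure stays at 1,250.
misreported = reported_value("1,250", scale=0)

# Size of the resulting error for this hypothetical figure.
error_factor = correct / misreported
```

In this hypothetical case the mis-tagged figure is understated by a factor of one million, which is the kind of discrepancy the description warns may be present in the data as reported.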

The source code for data extraction is available here
Data from: Stock List Dataset
kaggle.com
Updated May 6, 2024
Cite
Aditya Kumar (2024). Stock List Dataset [Dataset]. https://www.kaggle.com/datasets/adityakumar5095/stock-list-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
May 6, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Aditya Kumar
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Symbol: A unique identifier for a particular stock on a specific exchange; for example, AAPL represents Apple Inc. on the NASDAQ exchange.
Name: The full name of the company that issued the stock.
Currency: The currency in which the stock is traded, such as USD (US Dollar), EUR (Euro), or JPY (Japanese Yen).
Exchange: The stock exchange where the stock is traded, such as NASDAQ or NYSE.
MIC Code: The Market Identifier Code, which uniquely identifies a specific exchange or trading venue.
Country: The country of incorporation of the company that issued the stock.
Type: The type of the stock.
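One way to model a row with these fields is sketched below. The class name, attribute names, and the "Common Stock" and "XYZ" values are hypothetical and may not match the actual column headers or contents of the file. Because a symbol is only unique on a specific exchange, the sketch indexes listings by the pair of symbol and MIC code rather than by symbol alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StockListing:
    symbol: str
    name: str
    currency: str
    exchange: str
    mic_code: str
    country: str
    security_type: str  # the "Type" field; renamed to avoid shadowing type()

    def key(self):
        """Identifier that stays unique across exchanges."""
        return (self.symbol, self.mic_code)


listings = [
    # AAPL on NASDAQ is the example given in the field description.
    StockListing("AAPL", "Apple Inc.", "USD", "NASDAQ", "XNAS",
                 "United States", "Common Stock"),
    # Hypothetical rows: the same symbol appearing on two venues.
    StockListing("XYZ", "Example Co.", "USD", "NASDAQ", "XNAS",
                 "United States", "Common Stock"),
    StockListing("XYZ", "Example Holdings", "USD", "NYSE", "XNYS",
                 "United States", "Common Stock"),
]

# Index by (symbol, MIC code) so the two XYZ rows do not overwrite each other.
by_key = {listing.key(): listing for listing in listings}
```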
Dataset: iShares Core MSCI Total International ...
kaggle.com
Updated Jun 21, 2024
Cite
Nitiraj Kulkarni (2024). Dataset: iShares Core MSCI Total International ... [Dataset]. https://www.kaggle.com/datasets/nitirajkulkarni/ixus-stock-performance/discussion
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jun 21, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Nitiraj Kulkarni
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset provides historical stock market performance data for specific companies. It enables users to analyze and understand the past trends and fluctuations in stock prices over time. This information can be utilized for various purposes such as investment analysis, financial research, and market trend forecasting.
Dataset: First Trust Dow Jones International In...
kaggle.com
Updated Jun 21, 2024
Cite
Nitiraj Kulkarni (2024). Dataset: First Trust Dow Jones International In... [Dataset]. https://www.kaggle.com/datasets/nitirajkulkarni/fdni-stock-performance/discussion
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jun 21, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Nitiraj Kulkarni
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset provides historical stock market performance data for specific companies. It enables users to analyze and understand the past trends and fluctuations in stock prices over time. This information can be utilized for various purposes such as investment analysis, financial research, and market trend forecasting.
Dataset: Ishares Environmentally Aware Real Est...
kaggle.com
Updated Jun 21, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Nitiraj Kulkarni (2024). Dataset: Ishares Environmentally Aware Real Est... [Dataset]. https://www.kaggle.com/datasets/nitirajkulkarni/eret-stock-performance/versions/1
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jun 21, 2024
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Nitiraj Kulkarni
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset provides historical stock market performance data for specific companies. It enables users to analyze and understand the past trends and fluctuations in stock prices over time. This information can be utilized for various purposes such as investment analysis, financial research, and market trend forecasting.

