Facebook
TwitterOpen Database License (ODbL) v1.0https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
The "yahoo_finance_dataset(2018-2023)" dataset is a financial dataset containing daily stock market data for multiple assets such as equities, ETFs, and indexes. It spans from April 1, 2018 to March 31, 2023, and contains 1257 rows and 7 columns. The data was sourced from Yahoo Finance, and the purpose of the dataset is to provide researchers, analysts, and investors with a comprehensive dataset that they can use to analyze stock market trends, identify patterns, and develop investment strategies. The dataset can be used for various tasks, including stock price prediction, trend analysis, portfolio optimization, and risk management. The dataset is provided in XLSX format, which makes it easy to import into various data analysis tools, including Python, R, and Excel.
The dataset includes the following columns:
Date: The date on which the stock market data was recorded. Open: The opening price of the asset on the given date. High: The highest price of the asset on the given date. Low: The lowest price of the asset on the given date. Close*: The closing price of the asset on the given date. Note that this price does not take into account any after-hours trading that may have occurred after the market officially closed. Adj Close**: The adjusted closing price of the asset on the given date. This price takes into account any dividends, stock splits, or other corporate actions that may have occurred, which can affect the stock price. Volume: The total number of shares of the asset that were traded on the given date.
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This repository contains a meticulously scraped dataset from various financial websites. The data extraction process ensures high-quality and accurate text, including content from both the websites and their embedded PDFs.
We applied the advanced Mixtral 7X8 model to generate the following additional fields:
The prompt used to generate the additional fields was highly effective, thanks to extensive discussions and collaboration with the Mistral AI team. This ensures that the dataset provides valuable insights and is ready for further analysis and model training.
This dataset can be used for various applications, including but not limited to:
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Description 📊🔍
The Sujet-Finance-QA-Vision-100k is a comprehensive dataset containing over 100,000 question-answer pairs derived from more than 9,800 financial document images. This dataset is designed to support research and development in the field of financial document analysis and visual question answering.
Key Features:
🖼️ 9,801 unique financial document images ❓ 107,050 question-answer pairs 🇬🇧 English language 📄 Diverse financial document types… See the full description on the dataset page: https://huggingface.co/datasets/sujet-ai/Sujet-Finance-QA-Vision-100k.
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
A comprehensive list of the columns in your dataset, along with descriptions for each:
| Column Name | Description |
|---|---|
| Company | The name of the company (e.g., Apple, Facebook). |
| Ticker | The stock ticker symbol for the company (e.g., AAPL for Apple, META for Facebook). |
| Date | The trading date for the stock data. |
| Open | The opening price of the stock for the trading day. |
| High | The highest price of the stock during the trading day. |
| Low | The lowest price of the stock during the trading day. |
| Close | The closing price of the stock for the trading day. |
| Adj Close | The adjusted closing price, which accounts for dividends and stock splits. |
| Volume | The number of shares traded during the day. |
| Market Cap | The total market value of a company's outstanding shares. |
| PE Ratio | Price-to-earnings ratio; a measure of a company's current share price relative to its per-share earnings. |
| Beta | A measure of a stock's volatility in relation to the market. |
| EPS (Earnings Per Share) | The portion of a company's profit allocated to each outstanding share of common stock. |
| Forward PE | The price-to-earnings ratio using forecasted earnings. |
| Revenue | Total revenue reported by the company. |
| Gross Profit | The profit a company makes after deducting the costs associated with making and selling its products. |
| Operating Income | The profit realized from a business's normal operations, excluding any income derived from non-operational activities. |
| Net Income | The total profit of a company after all expenses, taxes, and costs have been deducted from total revenue. |
| Debt to Equity | A financial ratio indicating the relative proportion of shareholders' equity and debt used to finance a company's assets. |
| Return on Equity (ROE) | A measure of financial performance calculated by dividing net income by shareholders' equity. |
| Current Ratio | A liquidity ratio that measures a company's ability to pay short-term obligations or those due within one year. |
| Dividends Paid | The total dividend payments made by the company. |
| Dividend Yield | A financial ratio that shows how much a company pays out in dividends each year relative to its stock price. |
| Quarterly Revenue Growth | The year-over-year percentage growth in revenue for the most recent quarter compared to the same quarter last year. |
| Analyst Recommendation | Analysts' consensus rating for the stock (e.g., buy, sell, hold). |
| Target Price | The forecasted price for the stock as estimated by analysts. |
| Free Cash Flow | The cash generated by the company after accounting for capital expenditures. |
| Operating Margin | A measure of how much... |
Facebook
Twitterhttps://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSO20g5cBn_b3UvD4HrPSKMrujGXq8LfT2NQP3LC3F3k8ufSV6TP97l7Har-625Bju08bc&usqp=CAU" alt="File:Yahoo Finance Logo 2013.svg - Wikipedia">
Yahoo! Finance is a media property that is part of the Yahoo! network. It provides financial news, data and commentary including stock quotes, press releases, financial reports, and original content. It also offers some online tools for personal finance management. In addition to posting partner content from other web sites, it posts original stories by its team of staff journalists. It is ranked 20th by Similar Web on the list of largest news and media websites.
###
python
1.Content:
2.Symbol:
3.Name:
4.Price:
5.Volume:
6.Market cap:
7.P/E ratio:
The data is sourced from Yahoo Finance and is updated daily, providing users with the most up-to-date financial information for each company listed.
The dataset is suitable for anyone interested in analyzing or predicting stock market trends and is particularly useful for financial analysts, investors, and traders.
Facebook
TwitterDataset Card for "lmsys-finance"
This dataset is a curated version of the lmsys-chat-1m dataset, focusing solely on finance-related conversations. The refinement process encompassed:
Removing non-English conversations. Selecting conversations from models: "vicuna-33b", "wizardlm-13b", "gpt-4", "gpt-3.5-turbo", "claude-2", "palm-2", and "claude-instant-1". Excluding conversations with responses under 30 characters. Using 100 financial keywords, choosing conversations with at least… See the full description on the dataset page: https://huggingface.co/datasets/amphora/lmsys-finance.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These three datasets provide closing price information for the following assets: Google, Apple, Microsoft, Netflix, Amazon, Pfizer, Astra Zeneca, Johnson & Johnson, ETH, BTC and LTC.The time period spans from 2012 to the end of 2020.
Facebook
TwitterAll financial transactions made by the Intellectual Property Office as part of the Government’s commitment to transparency in expenditure
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
The Financial Risk Assessment Dataset provides detailed information on individual financial profiles. It includes demographic, financial, and behavioral data to assess financial risk. The dataset features various columns such as income, credit score, and risk rating, with intentional imbalances and missing values to simulate real-world scenarios.
Facebook
TwitterThis (financial and personal) data is required to be kept as part of the auditing process of the co-ordinating country. It is required to be retained for several years after the ESSnet is completed.
Facebook
TwitterAll financial transactions made by the Intellectual Property Office as part of the Government’s commitment to transparency in expenditure
Facebook
TwitterFinance Datasets
Historical stock and cryptocurrency price data.
Contents
Stocks (5 years of daily OHLCV data)
AAPL - Apple Inc. GOOGL - Alphabet Inc. MSFT - Microsoft Corp. AMZN - Amazon.com Inc. TSLA - Tesla Inc. META - Meta Platforms NVDA - NVIDIA Corp. AMD - Advanced Micro Devices INTC - Intel Corp. NFLX - Netflix Inc.
Cryptocurrencies (full history)
BTC_USD - Bitcoin ETH_USD - Ethereum SOL_USD - Solana ADA_USD - Cardano DOT_USD - Polkadot… See the full description on the dataset page: https://huggingface.co/datasets/misterdonn/finance-datasets.
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Image generated by DALL-E. See prompt for more details
💼 📊 Synthetic Financial Domain Documents with PII Labels
gretelai/synthetic_pii_finance_multilingual is a dataset of full length synthetic financial documents containing Personally Identifiable Information (PII), generated using Gretel Navigator and released under Apache 2.0. This dataset is designed to assist with the following use cases:
🏷️ Training NER (Named Entity Recognition) models to detect and label PII in… See the full description on the dataset page: https://huggingface.co/datasets/gretelai/synthetic_pii_finance_multilingual.
Facebook
Twitterhttps://www.lseg.com/en/policies/website-disclaimerhttps://www.lseg.com/en/policies/website-disclaimer
Explore LSEG's Project Finance Deals Data, providing loan information and league tables to the global deal-making community.
Facebook
TwitterAttribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Paper |Homepage |Github
🛠️ Usage
Regarding the data, first of all, you should download the MMfin.tsv and MMfin_CN.tsv files, as well as the relevant financial images. The folder structure is shown as follows: ├─ datasets ├─ images ├─ MMfin ... ├─ MMfin_CN ... │ MMfin.tsv │ MMfin_CN.tsv
The following is the process of inference and evaluation (Qwen2-VL-2B-Instruct as an example): export LMUData="The path of the datasets" python… See the full description on the dataset page: https://huggingface.co/datasets/hithink-ai/MME-Finance.
Facebook
TwitterAll financial transactions made by Companies House as part of the Government’s commitment to transparency in expenditure
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Quantitative Finance Fine-Tuning Dataset
A dataset of 24 Q&A examples designed to fine-tune large language models (LLMs) for quantitative finance.
📂 Categories
Category Topics Examples
Volatility Models SABR (corrected), Bergomi, rBergomi, Heston 5
Derivatives Pricing Dupire, VIX, Black-Scholes Greeks, CVaR 5
Interest Rates & Credit HJM, Hull-White, Merton, CDS 4
Numerical Methods Crank-Nicolson, Monte Carlo, FFT, LSM 5
Quant Strategies Momentum, Pairs… See the full description on the dataset page: https://huggingface.co/datasets/mo35/quant-finance-dataset.
Facebook
Twitterhttps://fred.stlouisfed.org/legal/#copyright-public-domainhttps://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Finance Companies; Total Miscellaneous Liabilities, Level (BOGZ1FL613190005A) from 1945 to 2025 about miscellaneous, finance companies, companies, finance, liabilities, financial, and USA.
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Finance-Instruct-500k Dataset
Overview
Finance-Instruct-500k is a comprehensive and meticulously curated dataset designed to train advanced language models for financial tasks, reasoning, and multi-turn conversations. Combining data from numerous high-quality financial datasets, this corpus provides over 500,000 entries, offering unparalleled depth and versatility for finance-related instruction tuning and fine-tuning. The dataset includes content tailored for financial… See the full description on the dataset page: https://huggingface.co/datasets/oieieio/Finance-Instruct-500k.
Facebook
Twitterhttps://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
This dataset offers a detailed and organized set of financial data, enabling users to analyze company performance, conduct stock market research, and develop predictive models. It spans multiple financial aspects, such as annual and quarterly profit and loss statements, balance sheets, cash flow data, financial ratios, and market prices.
The data is structured to support time-series analysis, with datasets covering financial metrics at T0 (financial statements) and T1 (market prices).
This makes it particularly useful for applications requiring cross-temporal insights or forecasting.
Facebook
TwitterOpen Database License (ODbL) v1.0https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
The "yahoo_finance_dataset(2018-2023)" dataset is a financial dataset containing daily stock market data for multiple assets such as equities, ETFs, and indexes. It spans from April 1, 2018 to March 31, 2023, and contains 1257 rows and 7 columns. The data was sourced from Yahoo Finance, and the purpose of the dataset is to provide researchers, analysts, and investors with a comprehensive dataset that they can use to analyze stock market trends, identify patterns, and develop investment strategies. The dataset can be used for various tasks, including stock price prediction, trend analysis, portfolio optimization, and risk management. The dataset is provided in XLSX format, which makes it easy to import into various data analysis tools, including Python, R, and Excel.
The dataset includes the following columns:
Date: The date on which the stock market data was recorded. Open: The opening price of the asset on the given date. High: The highest price of the asset on the given date. Low: The lowest price of the asset on the given date. Close*: The closing price of the asset on the given date. Note that this price does not take into account any after-hours trading that may have occurred after the market officially closed. Adj Close**: The adjusted closing price of the asset on the given date. This price takes into account any dividends, stock splits, or other corporate actions that may have occurred, which can affect the stock price. Volume: The total number of shares of the asset that were traded on the given date.