47 datasets found

Revenue
huggingface.co
Updated May 5, 2024
Cite
InDyne (2024). Revenue [Dataset]. https://huggingface.co/datasets/InDyne/Revenue
Dataset updated
May 5, 2024
Dataset authored and provided by
InDyne
Description
InDyne/Revenue dataset hosted on Hugging Face and contributed by the HF Datasets community
earnings-calls-qa
huggingface.co
Updated Dec 1, 2022
Cite
Lamini (2022). earnings-calls-qa [Dataset]. https://huggingface.co/datasets/lamini/earnings-calls-qa
Dataset updated
Dec 1, 2022
Dataset authored and provided by
Lamini
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Lamini Earning Calls QA Dataset

Description

This dataset contains transcripts of earnings calls for various companies, along with questions and answers about the companies' financial performance and other relevant topics.

Format

The transcripts, questions, and answers are stored as JSON Lines (jsonlines) files; each JSON object in a file contains the transcript of an earnings call for a single company.
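As a rough illustration of the format described above, a JSON Lines file can be read one object per line. This is only a sketch: the field names (`ticker`, `transcript`, `question`, `answer`) and the sample records are hypothetical, since the listing does not name the actual fields.

```python
import io
import json


def read_jsonl(stream):
    """Yield one parsed object per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


# Hypothetical sample data; the real field names are not given in this listing.
sample = io.StringIO(
    '{"ticker": "ABC", "transcript": "Good morning.", "question": "What was revenue?", "answer": "Not stated."}\n'
    "\n"
    '{"ticker": "XYZ", "transcript": "Welcome.", "question": "Any guidance?", "answer": "None given."}\n'
)

records = list(read_jsonl(sample))
print(len(records), records[0]["ticker"])
```

Reading lazily with a generator keeps memory use flat, which helps if individual transcripts are long.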

Data Pipeline Code

The entire data pipeline… See the full description on the dataset page: https://huggingface.co/datasets/lamini/earnings-calls-qa.
finRAG
huggingface.co
Updated May 7, 2024
Cite
Parsee.ai (2024). finRAG [Dataset]. https://huggingface.co/datasets/parsee-ai/finRAG
Dataset updated
May 7, 2024
Dataset provided by
Parsee.ai
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
finRAG Datasets

This is the official Hugging Face repo of the finRAG datasets published by parsee.ai. More detailed information about the 3 datasets and the methodology can be found in the sub-directories for the individual datasets. We wanted to investigate how good current state-of-the-art (M)LLMs are at solving the relatively simple problem of extracting revenue figures from publicly available financial reports. To test this, we created 3 different datasets, all based on the same… See the full description on the dataset page: https://huggingface.co/datasets/parsee-ai/finRAG.
cqadupstack-gaming-top-20-gen-queries
huggingface.co
Updated Apr 1, 2023
Cite
INCOME (2023). cqadupstack-gaming-top-20-gen-queries [Dataset]. https://huggingface.co/datasets/income/cqadupstack-gaming-top-20-gen-queries
Dataset updated
Apr 1, 2023
Dataset authored and provided by
INCOME
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
NFCorpus: 20 generated queries (BEIR Benchmark)

This HF dataset contains the top-20 synthetic queries generated for each passage in the above BEIR benchmark dataset.

DocT5query model used: BeIR/query-gen-msmarco-t5-base-v1
id (str): unique document id in NFCorpus in the BEIR benchmark (corpus.jsonl).
Questions generated: 20
Code used for generation: evaluate_anserini_docT5query_parallel.py

The old dataset card for the BEIR benchmark follows.

Dataset Card for BEIR… See the full description on the dataset page: https://huggingface.co/datasets/income/cqadupstack-gaming-top-20-gen-queries.
adult-census-income
huggingface.co
opendatalab.com
Updated Feb 1, 2001
Cite
scikit-learn (2001). adult-census-income [Dataset]. https://huggingface.co/datasets/scikit-learn/adult-census-income
Dataset updated
Feb 1, 2001
Dataset authored and provided by
scikit-learn
License
https://choosealicense.com/licenses/cc0-1.0/
Description
Adult Census Income Dataset

The following was retrieved from the UCI Machine Learning Repository. This data was extracted from the 1994 Census Bureau database by Ronny Kohavi and Barry Becker (Data Mining and Visualization, Silicon Graphics). A set of reasonably clean records was extracted using the following conditions: ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0)). The prediction task is to determine whether a person makes over $50K a year. Description of fnlwgt (final weight)… See the full description on the dataset page: https://huggingface.co/datasets/scikit-learn/adult-census-income.
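The extraction condition quoted above can be read as a simple row filter. The sketch below is illustrative only: the variable names `AAGE`, `AGI`, `AFNLWGT`, and `HRSWK` are taken from the condition itself, while the dictionary representation and the sample rows are made up and do not reflect the column names of the hosted dataset.

```python
def is_clean_record(rec):
    """Apply ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0)) to one record."""
    return (
        rec["AAGE"] > 16
        and rec["AGI"] > 100
        and rec["AFNLWGT"] > 1
        and rec["HRSWK"] > 0
    )


# Made-up rows for illustration; these are not real census records.
rows = [
    {"AAGE": 39, "AGI": 40000, "AFNLWGT": 77516, "HRSWK": 40},  # passes all four tests
    {"AAGE": 15, "AGI": 40000, "AFNLWGT": 77516, "HRSWK": 40},  # fails AAGE > 16
    {"AAGE": 52, "AGI": 50, "AFNLWGT": 77516, "HRSWK": 40},     # fails AGI > 100
    {"AAGE": 28, "AGI": 30000, "AFNLWGT": 77516, "HRSWK": 0},   # fails HRSWK > 0
]

clean = [r for r in rows if is_clean_record(r)]
print(len(clean))
```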
revenue-estimate-stocks
huggingface.co
Cite
chuyin0321, revenue-estimate-stocks [Dataset]. https://huggingface.co/datasets/chuyin0321/revenue-estimate-stocks
Dataset authored and provided by
chuyin0321
Description
Dataset Card for "revenue-estimate-stocks"

More Information needed
earnings22_baseline_5_gram
huggingface.co
Updated Jul 17, 2023
Cite
Anton Lozhkov (2023). earnings22_baseline_5_gram [Dataset]. https://huggingface.co/datasets/anton-l/earnings22_baseline_5_gram
Dataset updated
Jul 17, 2023
Authors
Anton Lozhkov
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
The Earnings 22 dataset (also referred to as earnings22) is a 119-hour corpus of English-language earnings calls collected from global companies. The primary purpose is to serve as a benchmark for industrial and academic automatic speech recognition (ASR) models on real-world accented speech.
earnings22
huggingface.co
Updated Mar 21, 2024
Cite
Whisper Distillation (2024). earnings22 [Dataset]. https://huggingface.co/datasets/distil-whisper/earnings22
Dataset updated
Mar 21, 2024
Dataset authored and provided by
Whisper Distillation
Description
Dataset Card for Earnings 22

Dataset Summary

Earnings-22 provides a free-to-use benchmark of real-world, accented audio to bridge academic and industrial research. This dataset contains 125 files totalling roughly 119 hours of English-language earnings calls from countries around the world. It provides the full audio, transcripts, and accompanying metadata such as ticker symbol, headquarters country, and our defined "Language Region".

Supported Tasks and… See the full description on the dataset page: https://huggingface.co/datasets/distil-whisper/earnings22.
earnings_call
huggingface.co
dataverse.nl
Cite
John Henning, earnings_call [Dataset]. http://doi.org/10.34894/TJE0D0
Unique identifier
https://doi.org/10.34894/TJE0D0
Authors
John Henning
License
https://choosealicense.com/licenses/cc0-1.0/
Description
The dataset comprises a collection of earnings call transcripts, the related stock prices, and the sector index. In terms of volume, there are 188 transcripts, 11970 stock prices, and 1196 sector index values. All of these data originate from the period 2016-2020 and relate to the NASDAQ stock market. The data collection was made possible by Yahoo Finance and Thomson Reuters Eikon: Yahoo Finance enabled the search for stock values, and Thomson Reuters Eikon provided the earnings call transcripts. The dataset can be used as a benchmark for evaluating NLP techniques to understand their potential for financial applications. It is also possible to expand the dataset by extending the period covered, following a similar procedure.
trec-news-top-20-gen-queries
huggingface.co
Updated Mar 13, 2023
Cite
INCOME (2023). trec-news-top-20-gen-queries [Dataset]. https://huggingface.co/datasets/income/trec-news-top-20-gen-queries
Dataset updated
Mar 13, 2023
Dataset authored and provided by
INCOME
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
NFCorpus: 20 generated queries (BEIR Benchmark)

This HF dataset contains the top-20 synthetic queries generated for each passage in the above BEIR benchmark dataset.

DocT5query model used: BeIR/query-gen-msmarco-t5-base-v1
id (str): unique document id in NFCorpus in the BEIR benchmark (corpus.jsonl).
Questions generated: 20
Code used for generation: evaluate_anserini_docT5query_parallel.py

The old dataset card for the BEIR benchmark follows.

Dataset Card for BEIR… See the full description on the dataset page: https://huggingface.co/datasets/income/trec-news-top-20-gen-queries.
CUADRevenueProfitSharingLegalBenchClassification
huggingface.co
Updated May 11, 2025
Cite
Massive Text Embedding Benchmark (2025). CUADRevenueProfitSharingLegalBenchClassification [Dataset]. https://huggingface.co/datasets/mteb/CUADRevenueProfitSharingLegalBenchClassification
Dataset updated
May 11, 2025
Dataset authored and provided by
Massive Text Embedding Benchmark
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
CUADRevenueProfitSharingLegalBenchClassification
An MTEB (Massive Text Embedding Benchmark) dataset

This task was constructed from the CUAD dataset. It consists of determining whether the clause requires a party to share revenue or profit with the counterparty for any technology, goods, or services.

Task category: t2c

Domains: Legal, Written

Reference: https://huggingface.co/datasets/nguha/legalbench

How to evaluate on this task

You can evaluate an embedding… See the full description on the dataset page: https://huggingface.co/datasets/mteb/CUADRevenueProfitSharingLegalBenchClassification.
income
huggingface.co
Updated Oct 10, 2024
Cite
Rahul Sivaram (2024). income [Dataset]. https://huggingface.co/datasets/rahulisivaram5/income
Dataset updated
Oct 10, 2024
Authors
Rahul Sivaram
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
rahulisivaram5/income dataset hosted on Hugging Face and contributed by the HF Datasets community
BusinessData
huggingface.co
Updated Jun 1, 2025
Cite
Mita D (2025). BusinessData [Dataset]. https://huggingface.co/datasets/mitadhamdhere13/BusinessData
Dataset updated
Jun 1, 2025
Authors
Mita D
Description
language: en

Generate a clean Excel dataset with the following columns: Date (from 01-01-2023 to 31-12-2025), Region (North, South, East, West), Branch (Branch A to Branch E), Business Type (B2B & B2C), Partner ID (should be unique), Client ID (should be unique), Total Investment, Total Revenue, Revenue generated by B2B, Revenue generated by B2C, Revenue generated by Partner, Partner share of 40% from total revenue, Admin Expenses, Employee & HR Expenses, Marketing Expense, Technology… See the full description on the dataset page: https://huggingface.co/datasets/mitadhamdhere13/BusinessData.
scidocs-top-20-gen-queries
huggingface.co
Updated Apr 1, 2023
Cite
INCOME (2023). scidocs-top-20-gen-queries [Dataset]. https://huggingface.co/datasets/income/scidocs-top-20-gen-queries
Dataset updated
Apr 1, 2023
Dataset authored and provided by
INCOME
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
NFCorpus: 20 generated queries (BEIR Benchmark)

This HF dataset contains the top-20 synthetic queries generated for each passage in the above BEIR benchmark dataset.

DocT5query model used: BeIR/query-gen-msmarco-t5-base-v1
id (str): unique document id in NFCorpus in the BEIR benchmark (corpus.jsonl).
Questions generated: 20
Code used for generation: evaluate_anserini_docT5query_parallel.py

The old dataset card for the BEIR benchmark follows.

Dataset Card for BEIR… See the full description on the dataset page: https://huggingface.co/datasets/income/scidocs-top-20-gen-queries.
Stocks-Quarterly-Earnings
huggingface.co
Updated Aug 22, 2024
Cite
Papers With Backtest (2024). Stocks-Quarterly-Earnings [Dataset]. https://huggingface.co/datasets/paperswithbacktest/Stocks-Quarterly-Earnings
Dataset updated
Aug 22, 2024
Dataset authored and provided by
Papers With Backtest
License
https://choosealicense.com/licenses/other/
Description
Dataset Information

This dataset includes quarterly earnings reports for various US stocks.

Instruments Included

7000+ US Stocks

Dataset Columns

symbol: The stock ticker or financial instrument identifier associated with the data.
date: The end date of the fiscal period for which the financial data is reported.
reported_date: The actual date on which the company reported its earnings or financial results.
reported_eps: The earnings per share (EPS) that the… See the full description on the dataset page: https://huggingface.co/datasets/paperswithbacktest/Stocks-Quarterly-Earnings.
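To make the column descriptions above concrete, here is a minimal sketch of one row and of the gap between the fiscal period end (`date`) and the announcement (`reported_date`). The Python types, the `EarningsRow` class, the `reporting_lag_days` helper, and the sample values are all assumptions for illustration; only the four column names come from the listing.

```python
import datetime as dt
from dataclasses import dataclass


@dataclass
class EarningsRow:
    """One row using the four columns named in the listing (types assumed)."""
    symbol: str
    date: dt.date           # end of the fiscal period
    reported_date: dt.date  # day the results were announced
    reported_eps: float


def reporting_lag_days(row):
    """Days between the fiscal period end and the earnings announcement."""
    return (row.reported_date - row.date).days


# Hypothetical values; not taken from the dataset.
row = EarningsRow("XYZ", dt.date(2024, 3, 31), dt.date(2024, 5, 2), 1.23)
print(reporting_lag_days(row))
```

Keeping `date` and `reported_date` distinct matters for backtests, since results only become known on the announcement day.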
climate-fever-top-20-gen-queries
huggingface.co
Updated Mar 6, 2023
Cite
INCOME (2023). climate-fever-top-20-gen-queries [Dataset]. https://huggingface.co/datasets/income/climate-fever-top-20-gen-queries
Dataset updated
Mar 6, 2023
Dataset authored and provided by
INCOME
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
NFCorpus: 20 generated queries (BEIR Benchmark)

This HF dataset contains the top-20 synthetic queries generated for each passage in the above BEIR benchmark dataset.

DocT5query model used: BeIR/query-gen-msmarco-t5-base-v1
id (str): unique document id in NFCorpus in the BEIR benchmark (corpus.jsonl).
Questions generated: 20
Code used for generation: evaluate_anserini_docT5query_parallel.py

The old dataset card for the BEIR benchmark follows.

Dataset Card for BEIR… See the full description on the dataset page: https://huggingface.co/datasets/income/climate-fever-top-20-gen-queries.
earnings-raw
huggingface.co
Updated May 1, 2024
Cite
Lamini (2024). earnings-raw [Dataset]. https://huggingface.co/datasets/lamini/earnings-raw
Dataset updated
May 1, 2024
Dataset provided by
PowerML, Inc.
Authors
Lamini
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
lamini/earnings-raw dataset hosted on Hugging Face and contributed by the HF Datasets community
adult
huggingface.co
Updated Nov 2, 2023
Cite
Mattia (2023). adult [Dataset]. https://huggingface.co/datasets/mstz/adult
Dataset updated
Nov 2, 2023
Authors
Mattia
License
https://choosealicense.com/licenses/cc/
Description
Adult

The Adult dataset from the UCI ML repository. A census dataset that includes a person's characteristics and their income threshold.

Configurations and tasks

Configuration | Task | Description
encoding | | Encoding dictionary showing original values of encoded features.
income | Binary classification | Classify the person's income as over or under the threshold.
income-no race | Binary classification | As income, but the race feature is removed.

race Multiclass… See the full description on the dataset page: https://huggingface.co/datasets/mstz/adult.
cr-y2024-summer-556-profit-points-17k
huggingface.co
Updated Aug 3, 2024
Cite
Semyon Volkov (2024). cr-y2024-summer-556-profit-points-17k [Dataset]. https://huggingface.co/datasets/7wolf/cr-y2024-summer-556-profit-points-17k
Dataset updated
Aug 3, 2024
Authors
Semyon Volkov
Description
7wolf/cr-y2024-summer-556-profit-points-17k dataset hosted on Hugging Face and contributed by the HF Datasets community
arguana-top-20-gen-queries
huggingface.co
Updated Mar 8, 2023
Cite
INCOME (2023). arguana-top-20-gen-queries [Dataset]. https://huggingface.co/datasets/income/arguana-top-20-gen-queries
Dataset updated
Mar 8, 2023
Dataset authored and provided by
INCOME
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
NFCorpus: 20 generated queries (BEIR Benchmark)

This HF dataset contains the top-20 synthetic queries generated for each passage in the above BEIR benchmark dataset.

DocT5query model used: BeIR/query-gen-msmarco-t5-base-v1
id (str): unique document id in NFCorpus in the BEIR benchmark (corpus.jsonl).
Questions generated: 20
Code used for generation: evaluate_anserini_docT5query_parallel.py

The old dataset card for the BEIR benchmark follows.

Dataset Card for BEIR… See the full description on the dataset page: https://huggingface.co/datasets/income/arguana-top-20-gen-queries.
