12 datasets found
  1.

    financial_phrasebank

    • huggingface.co
    Updated May 23, 2024
    Cite
    Pyry Takala (2024). financial_phrasebank [Dataset]. https://huggingface.co/datasets/takala/financial_phrasebank
    Dataset updated
    May 23, 2024
    Authors
    Pyry Takala
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0)https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    The key arguments for the low utilization of statistical techniques in financial sentiment analysis have been the difficulty of implementation for practical applications and the lack of high quality training data for building such models. Especially in the case of finance and economic texts, annotated collections are a scarce resource and many are reserved for proprietary use only. To resolve the missing training data problem, we present a collection of ∼ 5000 sentences to establish human-annotated standards for benchmarking alternative modeling techniques.

    The objective of the phrase level annotation task was to classify each example sentence into a positive, negative or neutral category by considering only the information explicitly available in the given sentence. Since the study is focused only on financial and economic domains, the annotators were asked to consider the sentences from the view point of an investor only; i.e. whether the news may have positive, negative or neutral influence on the stock price. As a result, sentences which have a sentiment that is not relevant from an economic or financial perspective are considered neutral.

    This release of the financial phrase bank covers a collection of 4840 sentences. The selected collection of phrases was annotated by 16 people with adequate background knowledge on financial markets. Three of the annotators were researchers and the remaining 13 annotators were master’s students at Aalto University School of Business with majors primarily in finance, accounting, and economics.

    Given the large number of overlapping annotations (5 to 8 annotations per sentence), there are several ways to define a majority vote based gold standard. To provide an objective comparison, we have formed 4 alternative reference datasets based on the strength of majority agreement: all annotators agree, >=75% of annotators agree, >=66% of annotators agree and >=50% of annotators agree.
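The four agreement-based reference sets described above can be illustrated with a minimal sketch. It assumes each sentence comes with its raw list of annotator labels; the function names and the `sentences_75agree` / `sentences_66agree` keys are illustrative choices (only `sentences_allagree` and `sentences_50agree` appear elsewhere on this page), and this is not the authors' actual procedure.

```python
from collections import Counter

# Agreement thresholds as stated in the description (share of annotators).
# The key names are assumptions modelled on the subset names seen on this page.
THRESHOLDS = {
    "sentences_allagree": 1.00,
    "sentences_75agree": 0.75,
    "sentences_66agree": 0.66,
    "sentences_50agree": 0.50,
}


def majority_label(annotations):
    """Return the most frequent label and the share of annotators who chose it."""
    counts = Counter(annotations)
    # On a tie, most_common keeps the label that was seen first.
    label, votes = counts.most_common(1)[0]
    return label, votes / len(annotations)


def build_subsets(annotated):
    """Group (sentence, majority label) pairs by the thresholds they satisfy."""
    subsets = {name: [] for name in THRESHOLDS}
    for sentence, annotations in annotated:
        label, share = majority_label(annotations)
        for name, threshold in THRESHOLDS.items():
            if share >= threshold:
                subsets[name].append((sentence, label))
    return subsets
```

Because every threshold is a lower bound, the subsets are nested in this sketch: any sentence in the all-agree set also appears in the 75%, 66% and 50% sets.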

  2.

    financial_phrasebank

    • huggingface.co
    Updated Oct 9, 2024
    Cite
    FinMTEB (2024). financial_phrasebank [Dataset]. https://huggingface.co/datasets/FinanceMTEB/financial_phrasebank
    Dataset updated
    Oct 9, 2024
    Dataset authored and provided by
    FinMTEB
    Description

    FinanceMTEB/financial_phrasebank dataset hosted on Hugging Face and contributed by the HF Datasets community

  3.

    financial_phrasebank

    • huggingface.co
    Updated Jul 19, 2025
    Cite
    Avi Trost (2025). financial_phrasebank [Dataset]. https://huggingface.co/datasets/atrost/financial_phrasebank
    Dataset updated
    Jul 19, 2025
    Authors
    Avi Trost
    Description

    Dataset Card for "financial_phrasebank"

    64/16/20 Split of the sentences_50agree subset of financial_phrasebank, according to the FinBERT paper.
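One way to arrive at 64/16/20 proportions is to hold out 20% of the data as a test set and then 20% of the remainder as a validation set (0.8 × 0.8 = 0.64 and 0.8 × 0.2 = 0.16). The sketch below assumes that two-stage procedure; the function name, the seed, and the absence of stratification are illustrative assumptions, not a reproduction of this dataset's actual split.

```python
import random


def split_64_16_20(items, seed=0):
    """Shuffle items, then split them into train, validation and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    # Stage 1: hold out one fifth (20%) of everything as the test set.
    n_test = len(shuffled) // 5
    test, rest = shuffled[:n_test], shuffled[n_test:]

    # Stage 2: hold out one fifth of the remainder (16% overall) for validation.
    n_val = len(rest) // 5
    validation, train = rest[:n_val], rest[n_val:]
    return train, validation, test
```

This sketch does not stratify by sentiment label, so class proportions can differ slightly between the three parts.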

  4.

    enhanced-financial-phrasebank

    • huggingface.co
    Cite
    Lei Z, enhanced-financial-phrasebank [Dataset]. https://huggingface.co/datasets/descartes100/enhanced-financial-phrasebank
    Authors
    Lei Z
    Description

    descartes100/enhanced-financial-phrasebank dataset hosted on Hugging Face and contributed by the HF Datasets community

  5.

    ‘Financial Sentiment Analysis’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Aug 4, 2020
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2020). ‘Financial Sentiment Analysis’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-financial-sentiment-analysis-5b39/latest
    Dataset updated
    Aug 4, 2020
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Financial Sentiment Analysis’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/sbhatti/financial-sentiment-analysis on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    Data

    The following data is intended for advancing financial sentiment analysis research. It's two datasets (FiQA, Financial PhraseBank) combined into one easy-to-use CSV file. It provides financial sentences with sentiment labels.

    Citations

    Malo, Pekka, et al. "Good debt or bad debt: Detecting semantic orientations in economic texts." Journal of the Association for Information Science and Technology 65.4 (2014): 782-796.

    --- Original source retains full ownership of the source dataset ---

  6.

    financial_phrasebank

    • huggingface.co
    Cite
    Vatolin Alexey, financial_phrasebank [Dataset]. https://huggingface.co/datasets/vatolinalex/financial_phrasebank
    Authors
    Vatolin Alexey
    Description

    vatolinalex/financial_phrasebank dataset hosted on Hugging Face and contributed by the HF Datasets community

  7.

    financial_phrasebank_sentences_allagree

    • huggingface.co
    Updated May 24, 2025
    Cite
    Financial Services Innovation Lab, Georgia Tech (2025). financial_phrasebank_sentences_allagree [Dataset]. https://huggingface.co/datasets/gtfintechlab/financial_phrasebank_sentences_allagree
    Dataset updated
    May 24, 2025
    Dataset authored and provided by
    Financial Services Innovation Lab, Georgia Tech
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    Dataset Card for financial_phrasebank

      Dataset Summary
    

    Polar sentiment dataset of sentences from financial news. The dataset consists of 4840 sentences from English language financial news categorised by sentiment. The dataset is divided by agreement rate of 5-8 annotators.

      Supported Tasks and Leaderboards
    

    Sentiment Classification

      Languages
    

    English

      Dataset Structure
    
    
    
    
    
      Data Instances
    

    { "sentence": "Pharmaceuticals group Orion Corp… See the full description on the dataset page: https://huggingface.co/datasets/gtfintechlab/financial_phrasebank_sentences_allagree.

  8.

    autoeval-eval-financial_phrasebank-sentences_50agree-d5dbba-47711145221

    • huggingface.co
    Updated Oct 18, 2023
    Cite
    Evaluation on the Hub (2023). autoeval-eval-financial_phrasebank-sentences_50agree-d5dbba-47711145221 [Dataset]. https://huggingface.co/datasets/autoevaluate/autoeval-eval-financial_phrasebank-sentences_50agree-d5dbba-47711145221
    Dataset updated
    Oct 18, 2023
    Dataset authored and provided by
    Evaluation on the Hub
    Description

    autoevaluate/autoeval-eval-financial_phrasebank-sentences_50agree-d5dbba-47711145221 dataset hosted on Hugging Face and contributed by the HF Datasets community

  9.

    autotrain-data-flan-t5-large-financial-phrasebank-lora

    • huggingface.co
    Updated Oct 24, 2023
    Cite
    kimsooyeon (2023). autotrain-data-flan-t5-large-financial-phrasebank-lora [Dataset]. https://huggingface.co/datasets/sooyeon/autotrain-data-flan-t5-large-financial-phrasebank-lora
    Dataset updated
    Oct 24, 2023
    Authors
    kimsooyeon
    Description

    sooyeon/autotrain-data-flan-t5-large-financial-phrasebank-lora dataset hosted on Hugging Face and contributed by the HF Datasets community

  10.

    auditor_review

    • huggingface.co
    Cite
    Rajiv Shah, auditor_review [Dataset]. https://huggingface.co/datasets/rajistics/auditor_review
    Authors
    Rajiv Shah
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0)https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    Dataset Card for financial_phrasebank

      Dataset Description
    

    Auditor review data collected by News Department

    Point of Contact: Talked to COE for Auditing

      Dataset Summary
    

    Auditor sentiment dataset of sentences from financial news. The dataset consists of *** sentences from English language financial news categorized by sentiment. The dataset is divided by agreement rate of 5-8 annotators.

      Supported Tasks and Leaderboards
    

    Sentiment Classification… See the full description on the dataset page: https://huggingface.co/datasets/rajistics/auditor_review.

  11.

    financial_phrasebank_75agree_german

    • huggingface.co
    Updated Mar 26, 2025
    Cite
    Moritz Scherrmann (2025). financial_phrasebank_75agree_german [Dataset]. https://huggingface.co/datasets/scherrmann/financial_phrasebank_75agree_german
    Dataset updated
    Mar 26, 2025
    Authors
    Moritz Scherrmann
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0)https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    Dataset Card for German financial_phrasebank

      Dataset Description
    
    
    
    
    
      Dataset Summary
    

    This dataset is a German translation of the financial phrasebank of Malo et al. (2013), restricted to sentences with a minimum annotator agreement rate of 75% (3453 observations in total). The translation was produced by machine with DeepL.

      Supported Tasks and Leaderboards
    

    Sentiment Classification

      Languages
    

    German

      Dataset Structure
    
    
    
    
    
      Data Instances
    

    {… See the full description on the dataset page: https://huggingface.co/datasets/scherrmann/financial_phrasebank_75agree_german.

  12.

    financial_phrasebank_multilingual

    • huggingface.co
    Updated Jun 20, 2025
    Cite
    Néstor Ojeda González (2025). financial_phrasebank_multilingual [Dataset]. https://huggingface.co/datasets/nojedag/financial_phrasebank_multilingual
    Dataset updated
    Jun 20, 2025
    Authors
    Néstor Ojeda González
    Description

    Dataset Card for Multilingual Financial Sentiment Analysis

    This dataset is based on the combination of two datasets, FiQA and Financial PhraseBank, automatically translated into Spanish, French and German.

      Dataset Details
    
    
    
    
    
      Dataset Sources
    

    KaggleHub: Financial Sentiment Analysis
    Paper: Good Debt or Bad Debt: Detecting Semantic Orientations in Economic Texts
    Financial Opinion Mining and Question Answering

      Uses
    

    Multilingual financial sentiment… See the full description on the dataset page: https://huggingface.co/datasets/nojedag/financial_phrasebank_multilingual.


Cite
Pyry Takala (2024). financial_phrasebank [Dataset]. https://huggingface.co/datasets/takala/financial_phrasebank

financial_phrasebank

FinancialPhrasebank

takala/financial_phrasebank

68 scholarly articles cite this dataset (View in Google Scholar)
