16 datasets found
  1. FinancialPhraseBank

    • huggingface.co
    Cite
    Luca Massaron, FinancialPhraseBank [Dataset]. https://huggingface.co/datasets/lmassaron/FinancialPhraseBank
    Authors
    Luca Massaron
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for Financial PhraseBank

      Dataset Description
    

    Repository: [Link to the source, e.g., on Kaggle or original paper's site]
    Paper: Good debt or bad debt: Detecting semantic orientations in economic texts

    This dataset (FinancialPhraseBank) contains sentiment labels for 4846 financial news headlines, assigned from the perspective of a retail investor. Each headline is labeled "negative", "neutral", or "positive".

      Content
    

    The dataset contains two… See the full description on the dataset page: https://huggingface.co/datasets/lmassaron/FinancialPhraseBank.

  2. distilbert-reddit-financial-phrasebank-allagree

    • kaggle.com
    zip
    Updated Nov 9, 2021
    Cite
    Daniel M. (2021). distilbert-reddit-financial-phrasebank-allagree [Dataset]. https://www.kaggle.com/datasets/muniozdaniel0/distilbert-reddit-financial-phrasebank-allagree
    Available download formats: zip (2437573921 bytes)
    Authors
    Daniel M.
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Daniel M.

    Released under CC0: Public Domain


  3. financial-phrasebank-all-agree-classification

    • huggingface.co
    Cite
    ghbacct, financial-phrasebank-all-agree-classification [Dataset]. https://huggingface.co/datasets/ghbacct/financial-phrasebank-all-agree-classification
    Authors
    ghbacct
    Description

    Dataset Card for "financial-phrasebank-all-agree-classification"

    More Information needed

  4. Financial Sentiment Analysis

    • kaggle.com
    zip
    Updated Feb 19, 2022
    Cite
    sbhatti (2022). Financial Sentiment Analysis [Dataset]. https://www.kaggle.com/sbhatti/financial-sentiment-analysis
    Available download formats: zip (282375 bytes)
    Authors
    sbhatti
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Data

    The following data is intended for advancing financial sentiment analysis research. It's two datasets (FiQA, Financial PhraseBank) combined into one easy-to-use CSV file. It provides financial sentences with sentiment labels.

    Citations

    Malo, Pekka, et al. "Good debt or bad debt: Detecting semantic orientations in economic texts." Journal of the Association for Information Science and Technology 65.4 (2014): 782-796.

  5. English Russian Financial Phrasebank

    • kaggle.com
    zip
    Updated Jan 7, 2025
    Cite
    MukhammedAbuSuveilim (2025). English Russian Financial Phrasebank [Dataset]. https://www.kaggle.com/datasets/mukhammedabusuveilim/english-russian-financial-phrasebank/suggestions
    Available download formats: zip (297557 bytes)
    Authors
    MukhammedAbuSuveilim
    Description

    Dataset

    This dataset was created by MukhammedAbuSuveilim


  6. financial-classification

    • huggingface.co
    Cite
    Nicholas Muchinguri, financial-classification [Dataset]. https://huggingface.co/datasets/nickmuchi/financial-classification
    Authors
    Nicholas Muchinguri
    Description

    Dataset Creation

    This dataset combines the Financial PhraseBank dataset with a financial text dataset from Kaggle. Because Financial PhraseBank does not have a validation split, I thought this might help to validate finance models and also capture the impact of COVID on financial earnings through the more recent Kaggle dataset.

  7. financial-phrasebank-all-agree-clustering

    • huggingface.co
    Updated Jul 15, 2023
    Cite
    ghbacct (2023). financial-phrasebank-all-agree-clustering [Dataset]. https://huggingface.co/datasets/ghbacct/financial-phrasebank-all-agree-clustering
    Authors
    ghbacct
    Description

    Dataset Card for "financial-phrasebank-all-agree-clustering"

    More Information needed

  8. FinSen Financial Sentiment Dataset

    • kaggle.com
    zip
    Updated Oct 29, 2024
    Cite
    Eagle W H L (2024). FinSen Financial Sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/eaglewhl/finsen-financial-sentiment-dataset/code
    Available download formats: zip (6549212 bytes)
    Authors
    Eagle W H L
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Enhancing Financial Market Predictions: Causality-Driven Feature Selection

    Note:[Please help give a Vote 👍 if you think this FinSen dataset is good for you, Thanks:)]

    This paper introduces the FinSen dataset, which integrates economic and financial news articles from 197 countries with stock market data. The dataset spans 15 years from 2007 to 2023, includes temporal information, and offers a global perspective through 160,000 records of financial market news. Our study leverages causally validated sentiment scores and LSTM models to enhance market forecast accuracy and reliability.

    Technical Framework

    [Image: https://github.com/user-attachments/assets/5df3c4a7-2403-460a-ac7f-2d69572fec2f]

    Our FinSen Dataset

    arXiv Pytorch 1.5 License: MIT

    This repository contains the dataset for "Enhancing Financial Market Predictions: Causality-Driven Feature Selection" (https://arxiv.org/abs/2408.01005), which has been accepted at ADMA 2024.

    If the dataset or the paper has been useful in your research, please add a citation to our work:

    @article{liang2024enhancing,
     title={Enhancing Financial Market Predictions: Causality-Driven Feature Selection},
     author={Liang, Wenhao and Li, Zhengyang and Chen, Weitong},
     journal={arXiv e-prints},
     pages={arXiv--2408},
     year={2024}
    }
    

    Datasets

    FinSen can be downloaded manually from the repository as a CSV file. The sentiment label and its score are generated by the FinBERT model from the Hugging Face Transformers library under the identifier "ProsusAI/finbert". (Araci, Dogu. "Finbert: Financial sentiment analysis with pre-trained language models." arXiv preprint arXiv:1908.10063 (2019).)

    We only provide the US data, for research purposes; please contact w.liang@adelaide.edu.au for other countries (197 included in total) if necessary.

    [Image: https://github.com/user-attachments/assets/f28e670a-7329-409d-81cb-1fe47da22140]

    Finsen Data Sample:

    [Image: https://github.com/user-attachments/assets/6ab08486-85b7-4cf6-b4fe-7d4294624f91]

    We also provide other NLP datasets for text classification tasks here; please cite them accordingly if you use them in your research.

    1. 20Newsgroups. Joachims, T., et al.: A probabilistic analysis of the rocchio algorithm with tfidf for text categorization. In: ICML. vol. 97, pp. 143–151. Citeseer (1997)
    2. AG News. Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text classification. Advances in neural information processing systems 28 (2015)
    3. Financial PhraseBank. Malo, P., Sinha, A., Korhonen, P., Wallenius, J., Takala, P.: Good debt or bad debt: Detecting semantic orientations in economic texts. Journal of the Association for Information Science and Technology 65(4), 782–796 (2014)

    Dataloader for FinSen

    We provide the preprocessing file finsen.py for our FinSen dataset under the dataloaders directory for more convenient usage.

    Models - Text Classification

    1. DAN-3.

    2. Global Pooling CNN.

    Models - Regression Prediction

    1. LSTM

    Using Sentiment Score from FinSen Predict Result on S&P500

    [Image: https://github.com/user-attachments/assets/2d9b4dd7-7f59-425c-b812-2cca57719243]

    ☺ Happy Research!

  9. financial_phrasebank

    • huggingface.co
    Updated Jul 19, 2025
    Cite
    Avi Trost (2025). financial_phrasebank [Dataset]. https://huggingface.co/datasets/atrost/financial_phrasebank
    Authors
    Avi Trost
    Description

    Dataset Card for "financial_phrasebank"

    64/16/20 Split of the sentences_50agree subset of financial_phrasebank, according to the FinBERT paper.
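
A 64/16/20 split can be obtained by holding out 20% of the sentences for testing and then 20% of the remaining 80% for validation (0.8 × 0.2 = 0.16), which leaves 64% for training. The sketch below only illustrates that arithmetic on placeholder data. The function name `split_64_16_20`, the seed, and the placeholder sentences are assumptions made for illustration; the dataset card does not spell out the exact procedure or random seed that was used.

```python
import random


def split_64_16_20(items, seed=0):
    """Shuffle a copy of items, then cut 20% test and 20% of the rest as validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (seed is an assumption)

    n = len(shuffled)
    n_test = round(0.20 * n)             # 20% of all items
    n_val = round(0.20 * (n - n_test))   # 20% of the remaining 80% = 16% overall

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]    # what is left: 64% overall
    return train, val, test


# Placeholder data, not real Financial PhraseBank sentences.
sentences = [f"sentence {i}" for i in range(100)]
train, val, test = split_64_16_20(sentences)
print(len(train), len(val), len(test))  # 64 16 20
```

Splitting in two stages keeps the test set fixed regardless of how the validation fraction is later tuned.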

  10. indonesian-financial-phrasebank

    • huggingface.co
    Cite
    Intan Maharani, indonesian-financial-phrasebank [Dataset]. https://huggingface.co/datasets/intanm/indonesian-financial-phrasebank
    Authors
    Intan Maharani
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    intanm/indonesian-financial-phrasebank dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. Stock Market News Data in Portuguese

    • kaggle.com
    zip
    Updated Jul 7, 2021
    Cite
    Mateus Picanco (2021). Stock Market News Data in Portuguese [Dataset]. https://www.kaggle.com/mateuspicanco/financial-phrase-bank-portuguese-translation
    Available download formats: zip (481703 bytes)
    Authors
    Mateus Picanco
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    Stock Market News Data in Portuguese

    The Financial Phrase Bank is a dataset originally developed for the paper Good Debt or Bad Debt: Detecting Semantic Orientations in Economic Texts, made available by researchers from Aalto University and the Indian Institute of Management. The dataset allows for a useful benchmark for fine-tuning Language Models on Sentiment Analysis Tasks.

    As the amount of annotated text data in Portuguese (especially about the financial market) is limited, I went ahead and translated the entire dataset so that people can try out Sentiment Analysis tasks in Portuguese.

    Content

    The dataset originally contains about 4840 manually annotated financial news items in English and consists of three columns:
    1. y: the annotated label for the sentiment of the news text (neutral, positive, negative);
    2. text: the original text of each record;
    3. text_pt: the translated version of the original record, which I manually validated.
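
A three-column layout like this can be read with Python's standard `csv` module. The sketch below is only an illustration: it assumes a comma-delimited file with a header row using the column names `y`, `text`, and `text_pt` from the description, and it runs on made-up placeholder rows held in memory. The real file name, delimiter, and encoding are not stated in the description.

```python
import csv
import io
from collections import Counter

# Made-up placeholder rows; only the column names come from the description.
SAMPLE_CSV = """y,text,text_pt
positive,placeholder text 1,texto de exemplo 1
neutral,placeholder text 2,texto de exemplo 2
negative,placeholder text 3,texto de exemplo 3
neutral,placeholder text 4,texto de exemplo 4
"""


def load_rows(handle):
    """Read every row into a dict keyed by the header names."""
    return list(csv.DictReader(handle))


rows = load_rows(io.StringIO(SAMPLE_CSV))

# Count how often each sentiment label occurs.
label_counts = Counter(row["y"] for row in rows)

# Pair each Portuguese text with its label, e.g. as input for a classifier.
pairs = [(row["text_pt"], row["y"]) for row in rows]

print(label_counts)
print(pairs[0])
```

To read an actual file instead of the in-memory sample, pass an open file handle (for example `open(path, newline="", encoding="utf-8")`, where the encoding is itself an assumption) to `load_rows`.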

    Acknowledgments

    [1] Malo, P., Sinha, A., Korhonen, P., Wallenius, J., & Takala, P. (2014). Good debt or bad debt: Detecting semantic orientations in economic texts. Journal of the Association for Information Science and Technology, 65(4), 782-796.

    Photo by Markus Winkler on Unsplash

  12. financial-phrase-bank-portuguese-translation

    • kaggle.com
    zip
    Updated Jan 24, 2024
    Cite
    Pixel_Dust_64 (2024). financial-phrase-bank-portuguese-translation [Dataset]. https://www.kaggle.com/datasets/pixeldust64/financial-phrase-bank-portuguese-translation/code
    Available download formats: zip (480183 bytes)
    Authors
    Pixel_Dust_64
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Pixel_Dust_64

    Released under Apache 2.0


  13. financial_reasoning_aggregated

    • huggingface.co
    Updated Nov 27, 2025
    Cite
    Yi Peng Neo (2025). financial_reasoning_aggregated [Dataset]. https://huggingface.co/datasets/neoyipeng/financial_reasoning_aggregated
    Authors
    Yi Peng Neo
    Description

    Aggregated Financial Reasoning Dataset for Reinforcement Fine Tuning (RFT) in Finance

    A multi-source NLP dataset combining Financial PhraseBank, FinQA, news Headlines, and Twitter data, labeled for sentiment and QA tasks.

      I don't own any of the datasets; I am just curating them for my own reasoning experiments and teaching materials.

      Dataset Overview

    Purpose: This dataset is an aggregation of text sources that have a discrete output, which allows downstream RFT while… See the full description on the dataset page: https://huggingface.co/datasets/neoyipeng/financial_reasoning_aggregated.

  14. financial_phrase_bank_pt_br

    • kaggle.com
    zip
    Updated Jan 22, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Daniele Simas (2024). financial_phrase_bank_pt_br [Dataset]. https://www.kaggle.com/datasets/danielesimas/financial-phrase-bank-pt-br
    Explore at:
    zip(481703 bytes)Available download formats
    Dataset updated
    Jan 22, 2024
    Authors
    Daniele Simas
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Daniele Simas

    Released under MIT


  15. financial_phrasebank_multilingual

    • huggingface.co
    Updated Jun 20, 2025
    Cite
    Néstor Ojeda González (2025). financial_phrasebank_multilingual [Dataset]. https://huggingface.co/datasets/nojedag/financial_phrasebank_multilingual
    Authors
    Néstor Ojeda González
    Description

    Dataset Card for Multilingual Financial Sentiment Analysis

    This dataset is based on the combination of two datasets, FiQA and Financial PhraseBank, automatically translated into Spanish, French, and German.

      Dataset Details

      Dataset Sources

    KaggleHub: Financial Sentiment Analysis
    Paper: Good Debt or Bad Debt: Detecting Semantic Orientations in Economic Texts
    Financial Opinion Mining and Question Answering

      Uses
    

    Multilingual financial sentiment… See the full description on the dataset page: https://huggingface.co/datasets/nojedag/financial_phrasebank_multilingual.

  16. financial_phrasebank_75agree_german

    • huggingface.co
    Updated Mar 26, 2025
    Cite
    Moritz Scherrmann (2025). financial_phrasebank_75agree_german [Dataset]. https://huggingface.co/datasets/scherrmann/financial_phrasebank_75agree_german
    Authors
    Moritz Scherrmann
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    Dataset Card for German financial_phrasebank

      Dataset Description

      Dataset Summary

    This dataset is a German translation of the Financial PhraseBank of Malo et al. (2013) with a minimum agreement rate between annotators of 75% (3453 observations in total). The translation was produced automatically with DeepL.
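
A minimum agreement rate can be read as the share of annotators who chose the majority label for a sentence. The sketch below illustrates that idea under stated assumptions: the per-annotator label lists, the function names, and the example records are all hypothetical, since this dataset is distributed already filtered and the description does not show per-annotator votes.

```python
from collections import Counter


def majority_agreement(labels):
    """Return the most common label and the fraction of annotators who chose it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def filter_by_agreement(records, threshold=0.75):
    """Keep (text, majority_label, rate) for records whose agreement rate meets the threshold."""
    kept = []
    for text, labels in records:
        label, rate = majority_agreement(labels)
        if rate >= threshold:
            kept.append((text, label, rate))
    return kept


# Hypothetical records: (sentence, labels from individual annotators).
records = [
    ("Beispielsatz A", ["positive", "positive", "positive", "neutral"]),  # rate 0.75, kept
    ("Beispielsatz B", ["neutral", "positive", "negative", "neutral"]),   # rate 0.50, dropped
    ("Beispielsatz C", ["negative"] * 5),                                 # rate 1.00, kept
]

kept = filter_by_agreement(records)
for text, label, rate in kept:
    print(text, label, rate)
```

Raising the threshold to 1.0 in this sketch would keep only sentences on which every annotator agreed.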

      Supported Tasks and Leaderboards
    

    Sentiment Classification

      Languages
    

    German

      Dataset Structure

      Data Instances


    {… See the full description on the dataset page: https://huggingface.co/datasets/scherrmann/financial_phrasebank_75agree_german.


lmassaron/FinancialPhraseBank

89 scholarly articles cite this dataset (View in Google Scholar)