Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Dataset Card for Financial PhraseBank
Dataset Description
Repository: [Link to the source, e.g., on Kaggle or original paper's site]
Paper: Good debt or bad debt: Detecting semantic orientations in economic texts
This dataset (FinancialPhraseBank) contains the sentiments for 4846 financial news headlines from the perspective of a retail investor. The dataset is labeled with "negative", "neutral", or "positive" sentiments.
Content
The dataset contains two… See the full description on the dataset page: https://huggingface.co/datasets/lmassaron/FinancialPhraseBank.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Daniel M.
Released under CC0: Public Domain
Dataset Card for "financial-phrasebank-all-agree-classification"
More Information needed
https://creativecommons.org/publicdomain/zero/1.0/
The following data is intended for advancing financial sentiment analysis research. It combines two datasets (FiQA and Financial PhraseBank) into one easy-to-use CSV file of financial sentences with sentiment labels.
Malo, Pekka, et al. "Good debt or bad debt: Detecting semantic orientations in economic texts." Journal of the Association for Information Science and Technology 65.4 (2014): 782-796.
This dataset was created by MukhammedAbuSuveilim
Dataset Creation
This dataset combines the Financial PhraseBank dataset with a financial text dataset from Kaggle. Because the Financial PhraseBank dataset does not have a validation split, I thought this might help to validate finance models and also capture the impact of COVID on financial earnings through the more recent Kaggle dataset.
Dataset Card for "financial-phrasebank-all-agree-clustering"
More Information needed
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Note: Please give a vote 👍 if you find this FinSen dataset useful. Thanks!
This paper introduces the FinSen dataset, which integrates economic and financial news articles from 197 countries with stock market data. The dataset's coverage spans 15 years from 2007 to 2023 with temporal information, offering a global perspective on financial market news across 160,000 records. Our study leverages causally validated sentiment scores and LSTM models to enhance market forecast accuracy and reliability.
This repository contains the dataset for "Enhancing Financial Market Predictions: Causality-Driven Feature Selection" (https://arxiv.org/abs/2408.01005), which has been accepted at ADMA 2024.
If the dataset or the paper has been useful in your research, please add a citation to our work:
@article{liang2024enhancing,
title={Enhancing Financial Market Predictions: Causality-Driven Feature Selection},
author={Liang, Wenhao and Li, Zhengyang and Chen, Weitong},
journal={arXiv e-prints},
pages={arXiv--2408},
year={2024}
}
[FinSen] can be downloaded manually from the repository as a CSV file. Each sentiment label and its score are generated by the FinBERT model from the Hugging Face Transformers library under the identifier "ProsusAI/finbert". (Araci, Dogu. "Finbert: Financial sentiment analysis with pre-trained language models." arXiv preprint arXiv:1908.10063 (2019).)
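A minimal sketch of how such labels and scores could be produced with the "ProsusAI/finbert" checkpoint is given here. The authors' exact scoring script is not described, so the function names, the output keys (`text`, `sentiment`, `score`), and the rounding are assumptions made for illustration only. `score_with_finbert` needs the `transformers` package and downloads the model on first use.

```python
def attach_sentiment(texts, predictions):
    """Pair each text with the label and score predicted for it."""
    return [
        {"text": t, "sentiment": p["label"], "score": round(float(p["score"]), 4)}
        for t, p in zip(texts, predictions)
    ]


def score_with_finbert(texts):
    """Classify texts with ProsusAI/finbert (model is downloaded on first use)."""
    # Imported lazily so attach_sentiment works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model="ProsusAI/finbert")
    texts = list(texts)
    # The pipeline returns one {"label": ..., "score": ...} dict per input text.
    return attach_sentiment(texts, classifier(texts))
```

Calling `score_with_finbert(["Quarterly profit rose sharply."])` would return a list with one record per input text; the sentence is an invented example, not a row from FinSen.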
We provide only the US data for research purposes; please contact w.liang@adelaide.edu.au for other countries (197 in total) if needed.
[Figure: FinSen data sample]
We also provide other NLP datasets for text classification tasks here; please cite them accordingly if you use them in your research.
We provide the preprocessing file finsen.py for our FinSen dataset under the dataloaders directory for more convenient usage.
DAN-3.
Global Pooling CNN.
Happy Research!
Dataset Card for "financial_phrasebank"
64/16/20 Split of the sentences_50agree subset of financial_phrasebank, according to the FinBERT paper.
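The 64/16/20 proportions are consistent with holding out 20% of the sentences for testing and then 20% of the remainder for validation (0.8 × 0.2 = 0.16). The sketch below reproduces that arithmetic. The function name, the seed, and the assignment of the three parts to train, validation, and test are assumptions; this card does not state how the split was actually drawn.

```python
import random


def split_64_16_20(items, seed=0):
    """Shuffle items, hold out 20% for test, then 20% of the rest for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    n_test = round(0.20 * len(shuffled))
    test, rest = shuffled[:n_test], shuffled[n_test:]

    n_val = round(0.20 * len(rest))
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test


# 100 placeholder items stand in for the sentences of the subset.
train, val, test = split_64_16_20(range(100))
print(len(train), len(val), len(test))  # 64 16 20
```

Fixing the seed keeps the three parts identical across runs, which matters when results are compared against a published split.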
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
intanm/indonesian-financial-phrasebank dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
The Financial Phrase Bank is a dataset originally developed for the paper Good Debt or Bad Debt: Detecting Semantic Orientations in Economic Texts, made available by researchers from Aalto University and the Indian Institute of Management. The dataset provides a useful benchmark for fine-tuning language models on sentiment analysis tasks.
As the amount of annotated text data in Portuguese (especially about the financial market) is limited, I went ahead and translated the entire dataset so that people can try out sentiment analysis tasks in Portuguese.
The dataset originally contains about 4840 manually annotated financial news in English and consists of three columns:
1. y: the annotated label for the sentiment of the news text (neutral, positive, negative);
2. text: the original text for each record;
3. text_pt: the Portuguese translation of the original record, which I manually validated.
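The three columns above can be read with Python's standard `csv` module, as in the following sketch. It assumes a comma-separated file with a header row; the file layout is not stated in this description, and the two sample rows are invented for illustration rather than taken from the dataset.

```python
import csv
import io
from collections import Counter

# Stand-in for the real file; these rows are invented for illustration.
SAMPLE = io.StringIO(
    "y,text,text_pt\n"
    "positive,Sales rose 10%.,As vendas subiram 10%.\n"
    "neutral,The company is based in Helsinki.,A empresa tem sede em Helsinque.\n"
)


def load_rows(handle):
    """Read records with the columns y, text and text_pt."""
    return [
        {"label": row["y"], "text": row["text"], "text_pt": row["text_pt"]}
        for row in csv.DictReader(handle)
    ]


rows = load_rows(SAMPLE)
# Count how many records carry each sentiment label.
label_counts = Counter(row["label"] for row in rows)
```

With a real file, `load_rows(open(path, newline="", encoding="utf-8"))` would take the place of the in-memory sample, where `path` is wherever the downloaded CSV is stored.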
[1] Malo, P., Sinha, A., Korhonen, P., Wallenius, J., & Takala, P. (2014). Good debt or bad debt: Detecting semantic orientations in economic texts. Journal of the Association for Information Science and Technology, 65(4), 782-796.
Photo by Markus Winkler on Unsplash
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Pixel_Dust_64
Released under Apache 2.0
Aggregated Financial Reasoning Dataset for Reinforcement Fine Tuning (RFT) in Finance
A multi-source NLP dataset combining Financial PhraseBank, FinQA, news Headlines, and Twitter data, labeled for sentiment and QA tasks.
I don't own any of the datasets, just curating for my own reasoning experiments and teaching materials.
Dataset Overview
Purpose: This dataset is an aggregation of text sources that have a discrete output, which allows downstream RFT while… See the full description on the dataset page: https://huggingface.co/datasets/neoyipeng/financial_reasoning_aggregated.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Daniele Simas
Released under MIT
Dataset Card for Multilingual Financial Sentiment Analysis
This dataset is based on the combination of two datasets, FiQA and Financial PhraseBank, automatically translated into Spanish, French, and German.
Dataset Details
Dataset Sources
KaggleHub: Financial Sentiment Analysis
Paper: Good Debt or Bad Debt: Detecting Semantic Orientations in Economic Texts
Financial Opinion Mining and Question Answering
Uses
Multilingual financial sentiment… See the full description on the dataset page: https://huggingface.co/datasets/nojedag/financial_phrasebank_multilingual.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Dataset Card for German financial_phrasebank
Dataset Description
Dataset Summary
This dataset is a German translation of the financial phrasebank of Malo et al. (2013), restricted to sentences with a minimum agreement rate between annotators of 75% (3453 observations in total). The translation was produced automatically with DeepL.
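To make the 75% agreement threshold concrete, the sketch below computes the share of annotators who chose the majority label and keeps only sentences that reach the threshold. The record layout (a text paired with a list of per-annotator votes) is hypothetical; this card describes an already filtered subset and does not say that individual votes are available.

```python
from collections import Counter


def agreement_rate(votes):
    """Return the majority label and the share of annotators who chose it."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


def filter_by_agreement(records, threshold=0.75):
    """Keep (text, majority_label) pairs whose agreement rate meets the threshold."""
    kept = []
    for text, votes in records:
        label, rate = agreement_rate(votes)
        if rate >= threshold:
            kept.append((text, label))
    return kept
```

A sentence labeled "positive" by three of four annotators has an agreement rate of 0.75 and is kept; one with a two-to-two tie has a rate of 0.5 and is dropped.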
Supported Tasks and Leaderboards
Sentiment Classification
Languages
German
Dataset Structure
Data Instances
{… See the full description on the dataset page: https://huggingface.co/datasets/scherrmann/financial_phrasebank_75agree_german.