Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset (FinancialPhraseBank) contains the sentiments for financial news headlines from the perspective of a retail investor.
The dataset contains two columns, "Sentiment" and "News Headline". The sentiment can be negative, neutral or positive.
Malo, P., Sinha, A., Korhonen, P., Wallenius, J., & Takala, P. (2014). Good debt or bad debt: Detecting semantic orientations in economic texts. Journal of the Association for Information Science and Technology, 65(4), 782-796.
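The two-column layout described above can be read with Python's standard csv module. This is a minimal sketch that assumes the file has a header row using exactly the column names "Sentiment" and "News Headline"; the file name, delimiter, and encoding of the real download are not stated here, and the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter

ALLOWED_LABELS = {"negative", "neutral", "positive"}


def load_rows(handle):
    """Read (sentiment, headline) pairs, rejecting labels outside the three classes."""
    rows = []
    for row in csv.DictReader(handle):
        label = row["Sentiment"].strip().lower()
        if label not in ALLOWED_LABELS:
            raise ValueError(f"unexpected sentiment label: {label!r}")
        rows.append((label, row["News Headline"].strip()))
    return rows


# Invented sample rows for illustration only; not taken from the dataset.
SAMPLE = (
    "Sentiment,News Headline\n"
    "positive,Example Corp reports higher quarterly profit\n"
    "neutral,Example Corp will publish results on Friday\n"
    "negative,Example Corp warns of lower full-year sales\n"
)

rows = load_rows(io.StringIO(SAMPLE))
label_counts = Counter(label for label, _ in rows)
print(label_counts)
```

To use it on a real file, pass an open file handle instead of the in-memory sample.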
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Description
The Twitter Financial News dataset is an English-language dataset containing an annotated corpus of finance-related tweets. This dataset is used to classify finance-related tweets for their sentiment.
The dataset holds 11,932 documents annotated with 3 labels:
sentiments = { "LABEL_0": "Bearish", "LABEL_1": "Bullish", "LABEL_2": "Neutral" }
The data was collected using the Twitter API. The current dataset supports the multi-class classification… See the full description on the dataset page: https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment.
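The label mapping above can be turned into a small lookup helper. This is a sketch only: it assumes that integer class ids 0, 1, and 2 correspond to the suffixes of "LABEL_0", "LABEL_1", and "LABEL_2", which the description does not state, and the function name is hypothetical.

```python
# Mapping copied from the dataset description.
SENTIMENTS = {"LABEL_0": "Bearish", "LABEL_1": "Bullish", "LABEL_2": "Neutral"}


def label_name(label):
    """Return the sentiment name for an integer id (assumed 0-2) or a 'LABEL_n' key."""
    key = f"LABEL_{label}" if isinstance(label, int) else label
    return SENTIMENTS[key]


print(label_name(0), label_name("LABEL_2"))
```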
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains financial news articles published by HuffPost between 2012 and 2022, curated to support research in financial sentiment analysis, market forecasting, and portfolio optimization. Each entry is formatted in JSON and includes structured fields such as headline, article_link, short_description, author, category, and date_published.
Researchers can leverage this dataset for a wide range of natural language processing (NLP) tasks, including the development and testing of FinBERT and other finance-focused sentiment models. The year-wise separation of the data also facilitates time-series modeling and historical financial trend analyses.
Key Features:
Source: HuffPost financial news articles
Timeframe: 2012–2022
Format: JSON, structured by year
Fields: Headline, link, summary, author, category, publication date
Use Cases:
Sentiment-informed market prediction
Event-driven trading strategies
Portfolio rebalancing based on news sentiment
Backtesting NLP-driven financial models
Ideal For: Researchers and practitioners in financial engineering, quantitative finance, machine learning, and computational economics.
Licensing: Released under Creative Commons CC0 1.0, making it freely available for both academic and commercial use.
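As an illustration of the record structure listed above, the sketch below parses JSON records and groups them by publication year. It assumes one JSON object per line and ISO-formatted dates (YYYY-MM-DD) in date_published; neither detail is confirmed by the description, and the sample records are invented.

```python
import json
from collections import defaultdict
from datetime import date

# Field names as given in the dataset description.
FIELDS = (
    "headline",
    "article_link",
    "short_description",
    "author",
    "category",
    "date_published",
)


def parse_record(line):
    """Parse one JSON record and convert date_published to a date (ISO format assumed)."""
    raw = json.loads(line)
    missing = [name for name in FIELDS if name not in raw]
    if missing:
        raise KeyError(f"missing fields: {missing}")
    record = {name: raw[name] for name in FIELDS}
    record["date_published"] = date.fromisoformat(raw["date_published"])
    return record


def group_by_year(records):
    """Bucket records by publication year, e.g. for year-wise time-series work."""
    by_year = defaultdict(list)
    for record in records:
        by_year[record["date_published"].year].append(record)
    return dict(by_year)


# Invented sample records for illustration only.
SAMPLE_LINES = [
    '{"headline": "Example headline A", "article_link": "https://example.com/a", '
    '"short_description": "Summary A", "author": "A. Writer", '
    '"category": "BUSINESS", "date_published": "2015-06-01"}',
    '{"headline": "Example headline B", "article_link": "https://example.com/b", '
    '"short_description": "Summary B", "author": "B. Writer", '
    '"category": "MONEY", "date_published": "2016-01-15"}',
]

records = [parse_record(line) for line in SAMPLE_LINES]
by_year = group_by_year(records)
print(sorted(by_year))
```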
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Finance news labeled by sentiment. Can be used for NLP tasks.
Here are the data operations made on the texts:
This dataset still needs some data cleaning operations:
Also note that emojis are present in some texts. Whether to process them for sentiment analysis is left to you.
This dataset is the cleaned concatenation of multiple finance news sentiments datasets:
Thanks for their work!
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Daniel-ML/sentiment-analysis-for-financial-news dataset hosted on Hugging Face and contributed by the HF Datasets community
https://brightdata.com/license
Stay informed with our comprehensive Financial News Dataset, designed for investors, analysts, and businesses to track market trends, monitor financial events, and make data-driven decisions.
Dataset Features
Financial News Articles: Access structured financial news data, including headlines, summaries, full articles, publication dates, and source details.
Market & Economic Indicators: Track financial reports, stock market updates, economic forecasts, and corporate earnings announcements.
Sentiment & Trend Analysis: Analyze news sentiment, categorize articles by financial topics, and monitor emerging trends in global markets.
Historical & Real-Time Data: Retrieve historical financial news archives or access continuously updated feeds for real-time insights.
Customizable Subsets for Specific Needs
Our Financial News Dataset is fully customizable, allowing you to filter data based on publication date, region, financial topics, sentiment, or specific news sources. Whether you need broad coverage for market research or focused data for investment analysis, we tailor the dataset to your needs.
Popular Use Cases
Investment Strategy & Risk Management: Monitor financial news to assess market risks, identify investment opportunities, and optimize trading strategies.
Market & Competitive Intelligence: Track industry trends, competitor financial performance, and economic developments.
AI & Machine Learning Training: Use structured financial news data to train AI models for sentiment analysis, stock prediction, and automated trading.
Regulatory & Compliance Monitoring: Stay updated on financial regulations, policy changes, and corporate governance news.
Economic Research & Forecasting: Analyze financial news trends to predict economic shifts and market movements.
Whether you're tracking stock market trends, analyzing financial sentiment, or training AI models, our Financial News Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Welcome to our Bengali Financial News Sentiment Analysis dataset! This collection comprises 7,695 financial news articles covering the period from March 3, 2014, to December 29, 2021, extracted using the web scraping tool "Beautiful Soup 4.4.0" in Python.
This dataset was a crucial part of our research published in the journal paper titled "Stock Market Prediction of Bangladesh Using Multivariate Long Short-Term Memory with Sentiment Identification." The paper can be accessed and cited at http://doi.org/10.11591/ijece.v13i5.pp5696-5706.
We are excited to share this unique dataset, which we hope will empower researchers, analysts, and enthusiasts to explore and understand the dynamics of the Bengali financial market through sentiment analysis. Join us on this journey of uncovering the hidden emotions driving market trends and decisions in Bangladesh. Happy analyzing!
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset aggregates real-time sentiment scores and metadata for financial news headlines, enabling rapid detection of market-moving events and trends. It includes headline text, publication details, sentiment analysis, relevance to financial markets, and links to affected stocks and sectors. Ideal for quantitative trading, risk monitoring, and financial news analytics.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Description
The Twitter Financial News dataset is an English-language dataset containing an annotated corpus of finance-related tweets. This dataset is used to classify finance-related tweets for their topic.
The dataset holds 21,107 documents annotated with 20 labels:
topics = { "LABEL_0": "Analyst Update", "LABEL_1": "Fed | Central Banks", "LABEL_2": "Company | Product News", "LABEL_3": "Treasuries | Corporate Debt", "LABEL_4": "Dividend"… See the full description on the dataset page: https://huggingface.co/datasets/zeroshot/twitter-financial-news-topic.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains news headlines relevant to key forex pairs: AUDUSD, EURCHF, EURUSD, GBPUSD, and USDJPY. The data was extracted from reputable platforms Forex Live and FXstreet over a period of 86 days, from January to May 2023. The dataset comprises 2,291 unique news headlines. Each headline includes an associated forex pair, timestamp, source, author, URL, and the corresponding article text. Data was collected using web scraping techniques executed via a custom service on a virtual machine. This service periodically retrieves the latest news for a specified forex pair (ticker) from each platform, parsing all available information. The collected data is then processed to extract details such as the article's timestamp, author, and URL. The URL is further used to retrieve the full text of each article. This data acquisition process repeats approximately every 15 minutes.
To ensure the reliability of the dataset, we manually annotated each headline for sentiment. Instead of solely focusing on the textual content, we ascertained sentiment based on the potential short-term impact of the headline on its corresponding forex pair. This method recognizes the currency market's acute sensitivity to economic news, which significantly influences many trading strategies. As such, this dataset could serve as an invaluable resource for fine-tuning sentiment analysis models in the financial realm.
We used three categories for annotation: 'positive', 'negative', and 'neutral', which correspond to bullish, bearish, and hold sentiments, respectively, for the forex pair linked to each headline. The following table provides examples of annotated headlines along with brief explanations of the assigned sentiment.
Examples of Annotated Headlines
| Forex Pair | Headline | Sentiment | Explanation |
| --- | --- | --- | --- |
| GBPUSD | Diminishing bets for a move to 12400 | Neutral | Lack of strong sentiment in either direction |
| GBPUSD | No reasons to dislike Cable in the very near term as long as the Dollar momentum remains soft | Positive | Positive sentiment towards GBPUSD (Cable) in the near term |
| GBPUSD | When are the UK jobs and how could they affect GBPUSD | Neutral | Poses a question and does not express a clear sentiment |
| JPYUSD | Appropriate to continue monetary easing to achieve 2% inflation target with wage growth | Positive | Monetary easing from Bank of Japan (BoJ) could lead to a weaker JPY in the short term due to increased money supply |
| USDJPY | Dollar rebounds despite US data. Yen gains amid lower yields | Neutral | Since both the USD and JPY are gaining, the effects on the USDJPY forex pair might offset each other |
| USDJPY | USDJPY to reach 124 by Q4 as the likelihood of a BoJ policy shift should accelerate Yen gains | Negative | USDJPY is expected to reach a lower value, with the USD losing value against the JPY |
| AUDUSD | RBA Governor Lowe’s Testimony High inflation is damaging and corrosive | Positive | Reserve Bank of Australia (RBA) expresses concerns about inflation. Typically, central banks combat high inflation with higher interest rates, which could strengthen AUD. |
Moreover, the dataset includes two columns with the predicted sentiment class and score as predicted by the FinBERT model. Specifically, the FinBERT model outputs a set of probabilities for each sentiment class (positive, negative, and neutral), representing the model's confidence in associating the input headline with each sentiment category. These probabilities are used to determine the predicted class and a sentiment score for each headline. The sentiment score is computed by subtracting the negative class probability from the positive one.
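The score computation described above can be sketched in a few lines. The score formula (positive probability minus negative probability) comes from the description; taking the highest-probability class as the predicted class is an assumption, since the text does not say how the class is chosen. The function name and the example probabilities are hypothetical.

```python
def summarize_probabilities(probs):
    """Return (predicted_class, sentiment_score) from per-class probabilities.

    probs maps 'positive', 'negative', and 'neutral' to probabilities.
    The predicted class is taken as the argmax (an assumption); the score
    is P(positive) - P(negative), as described for the dataset.
    """
    predicted = max(probs, key=probs.get)
    score = probs["positive"] - probs["negative"]
    return predicted, score


# Invented probabilities for illustration only.
predicted, score = summarize_probabilities(
    {"positive": 0.7, "negative": 0.1, "neutral": 0.2}
)
print(predicted, round(score, 3))
```

Under this definition the score lies in [-1, 1], and a confidently neutral headline scores close to 0.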
https://creativecommons.org/publicdomain/zero/1.0/
This dataset includes information from three financial news organizations (CNBC, the Guardian, and Reuters), such as article dates, headlines, and BERT sentiment analyses. The BERT code used to generate the sentiment scores will be pinned under 'code'.
Explore the Russian Financial News Dataset with 91,955 articles and metadata. Perfect for sentiment analysis, text summarization, keyword extraction, and financial AI research.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Financial news significantly influences investment decisions, stock market trends, and corporate strategies. However, extracting meaningful insights from unstructured news articles, particularly event-cause relationships, remains a challenge. This dataset addresses this gap by providing manually annotated event-cause pairs from financial news, enabling improved predictive modeling, risk assessment, and automated trading strategies.
Dataset Composition:
The dataset comprises 456 financial news articles from the following four major Indian financial news sources:
Business Standard
Economic Times
Live Mint
Moneycontrol
It covers articles from 2021 to 2025. Each entry includes annotated event-cause relationships along with metadata such as stock symbols, stock change, company names, and financial indicators. The dataset categorizes events into five key types:
Financial Reports & Earnings Announcements
Mergers & Acquisitions
Regulatory Changes & Legal Actions
Executive Leadership Changes
Market & Economic Trends
Dataset Attributes
The dataset comprises the following attributes:
Source: The origin of the news article (e.g., financial news websites).
Title: The headline of the article.
Content: The full text of the article.
Date: The publication date of the article.
Stock: The name of the stock.
Labels: The annotation tags (e.g., ORG, EVENT, CAUSE).
Stock Gain/Loss Percent: The percentage change in stock price associated with the event described in the article. The gain/loss percent was manually extracted from the Tickertape website.
The dataset is provided in JSON and CSV formats, ensuring efficient storage and accessibility.
Applications:
This dataset supports event-cause extraction in financial NLP applications such as:
Stock market prediction using causal analysis
Algorithmic trading models incorporating financial event impact
Sentiment analysis & risk assessment for investment strategies
Corporate strategy evaluation based on financial event insights
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Skywalker
Released under MIT
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is the result of research in applying Large Language Models (LLMs) to financial news processing. It contains over 5,000 news articles from various financial publishers, with the following information available for each article:
Title
Summary
Relevant stock tickers
The sentiment of each stock ticker in the article along with the reasoning for the categorization
Metadata including publish date, author, and image URL
The data is a result of an LLM-powered pipeline proposed by the data provider Polygon.io in a recent white paper. A live and continuously updated version of this dataset can be obtained via API here.
Cite as:
@article{dolphin2024extracting,
title={Extracting Structured Insights from Financial News: An Augmented LLM Driven Approach},
author={Dolphin, Rian and Dursun, Joe and Chow, Jonathan and Blankenship, Jarrett and Adams, Katie and Pike, Quinton},
journal={arXiv preprint arXiv:2407.15788},
year={2024}
}
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The Loughran and McDonald Sentiment Word Lists were developed using corporate 10-K reports from 1994 to 2008 [14]. These reports concern companies in the United States of America and are required by the U.S. Securities and Exchange Commission (SEC) [14]. The motivation for building the LM-SA-2020 word list came from an experiment that used the original lists to detect sentiment-carrying words in South African financial article headlines. A corpus of 808 financial articles (relating to Sasol) was used, and only 37% of headlines contained words whose sentiment in the Loughran and McDonald Sentiment Word Lists matched the ground-truth labels. A gap was therefore identified in developing a method for predicting the sentiment of financial articles in a South African context. Given the size of the data set, it was possible to examine the headlines manually and identify sentiment-carrying words to add to the original word lists. Furthermore, synonyms of the existing words in the Loughran and McDonald Sentiment Word Lists were added using NLTK's WordNet [16] interface. The sentiment detection/prediction accuracy improved by 29% using the new word list. This sentiment word list can be further expanded or improved in future by increasing the size of the data set and/or including data from other companies. It highlights the need not only for domain-specific sentiment prediction tools but also for region-specific corpora.
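To illustrate the kind of word-list matching discussed above, here is a minimal sketch of lexicon-based headline scoring and its evaluation against ground-truth labels. The tiny word lists, the counting rule, and the sample headlines are all invented for illustration; they are not the Loughran and McDonald lists, not the LM-SA-2020 list, and not necessarily the scoring rule used in the experiment.

```python
import re

# Hypothetical miniature word lists; the real lists are far larger.
POSITIVE_WORDS = {"gain", "profit", "growth"}
NEGATIVE_WORDS = {"loss", "decline", "strike"}


def headline_sentiment(headline, positive=POSITIVE_WORDS, negative=NEGATIVE_WORDS):
    """Label a headline by counting matches against each word list."""
    tokens = re.findall(r"[a-z]+", headline.lower())
    pos_hits = sum(token in positive for token in tokens)
    neg_hits = sum(token in negative for token in tokens)
    if pos_hits > neg_hits:
        return "positive"
    if neg_hits > pos_hits:
        return "negative"
    return "neutral"


def accuracy(headlines, labels):
    """Fraction of headlines whose predicted label equals the ground truth."""
    correct = sum(headline_sentiment(h) == y for h, y in zip(headlines, labels))
    return correct / len(headlines)


# Invented headlines and labels for illustration only.
HEADLINES = [
    "Example Co posts record profit",
    "Shares decline after strike",
    "Board meets on Tuesday",
    "Example Co rallies on upbeat outlook",
]
LABELS = ["positive", "negative", "neutral", "positive"]

print(accuracy(HEADLINES, LABELS))
```

The last headline is labeled positive but contains no listed word, which mirrors the motivation for extending a word list: adding missing sentiment-carrying words (here, for instance, "rallies") raises the match rate.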
FNSPID: A Comprehensive Financial News Dataset in Time Series
Description
FNSPID is a meticulously curated dataset designed to support research and applications in the field of financial news analysis within the context of time-series forecasting. Our dataset encompasses a wide range of financial news articles, providing a rich resource for developing and testing models aimed at understanding market trends, investor sentiment, and other critical financial indicators. Link… See the full description on the dataset page: https://huggingface.co/datasets/Zihan1004/FNSPID.
Dataset Card for Auditor Sentiment
Dataset Description
Auditor review sentiment collected by News Department
Point of Contact: Talked to COE for Auditing, currently sue@demo.org
Dataset Summary
Auditor sentiment dataset of sentences from financial news. The dataset consists of several thousand sentences from English language financial news categorized by sentiment.
Supported Tasks and Leaderboards
Sentiment Classification
Languages… See the full description on the dataset page: https://huggingface.co/datasets/Tianzhou/auditor_sentiment.
According to our latest research, the global sentiment analysis for financial services market size reached USD 4.2 billion in 2024 and is projected to grow at a robust CAGR of 15.8% from 2025 to 2033, ultimately reaching USD 14.7 billion by 2033. This impressive growth is primarily driven by the increasing adoption of artificial intelligence and machine learning technologies in financial institutions seeking to enhance decision-making, manage risks, and deliver superior customer experiences. The rising volume of unstructured data from social media, news feeds, and customer interactions has made sentiment analysis a critical tool for financial services firms aiming to gain actionable insights and maintain a competitive edge in a dynamic market landscape.
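The compound annual growth rate (CAGR) arithmetic behind figures like these can be checked with two small helpers. This is a sketch under stated assumptions: the report does not say which base year or how many compounding periods it uses, so the choice of 2024 as the base and nine periods to 2033 is a guess, and the results need not reproduce the quoted USD 14.7 billion exactly.

```python
def project(base_value, rate, years):
    """Value after compounding base_value at the given annual rate for `years` periods."""
    return base_value * (1 + rate) ** years


def implied_cagr(start_value, end_value, years):
    """Annual growth rate that takes start_value to end_value in `years` periods."""
    return (end_value / start_value) ** (1 / years) - 1


# Assumptions: base year 2024 and nine compounding periods to 2033.
projected_2033 = project(4.2, 0.158, 9)
rate_between_endpoints = implied_cagr(4.2, 14.7, 9)
print(round(projected_2033, 2), round(rate_between_endpoints, 4))
```

Changing the number of periods (for example, eight instead of nine) shows how sensitive the projection is to the choice of base year.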
One of the most significant growth factors for the sentiment analysis for financial services market is the exponential increase in data generated across digital channels. Financial institutions are inundated with vast amounts of textual and voice data from sources such as social media platforms, online reviews, call center transcripts, and news articles. Sentiment analysis solutions enable these organizations to efficiently process and analyze this unstructured data, extracting valuable insights into market trends, customer sentiment, and emerging risks. By leveraging advanced natural language processing (NLP) and machine learning algorithms, financial firms can proactively respond to market fluctuations, identify reputational risks, and tailor their products and services to align with evolving customer preferences. This data-driven approach is fueling the rapid adoption of sentiment analysis tools, particularly among banks, asset management firms, and fintech companies.
Another driving force behind the expansion of the sentiment analysis for financial services market is the growing need for enhanced risk management and fraud detection capabilities. The financial sector is highly regulated and faces constant threats from cybercriminals and fraudulent activities. Sentiment analysis enables institutions to monitor customer communications, transaction patterns, and public sentiment in real-time, helping to detect anomalies, suspicious behaviors, and potential compliance breaches. Early detection of negative sentiment or unusual activity can trigger timely investigations, minimizing financial losses and reputational damage. As regulatory requirements become more stringent and the complexity of financial crimes increases, the demand for sophisticated sentiment analysis solutions is expected to surge, further propelling market growth.
Additionally, the relentless pursuit of improved customer experience is a major catalyst for the adoption of sentiment analysis in the financial services industry. Today's customers expect personalized, responsive, and transparent interactions with their financial service providers. Sentiment analysis tools empower organizations to gauge customer emotions, satisfaction levels, and pain points across various touchpoints, enabling them to deliver targeted interventions, resolve issues swiftly, and foster long-term loyalty. By integrating sentiment analysis into customer relationship management (CRM) systems, financial institutions can prioritize high-value clients, anticipate churn, and develop innovative products that resonate with their audience. This focus on customer-centricity is a key differentiator in an increasingly competitive market, driving sustained investment in sentiment analysis technologies.
Sentiment-Driven Routing AI is emerging as a transformative technology in the financial services sector. This AI-driven approach leverages sentiment analysis to dynamically route customer queries and interactions based on the emotional tone detected in communications. By understanding the sentiment behind customer messages, financial institutions can prioritize and direct inquiries to the most appropriate resources, enhancing response times and customer satisfaction. Sentiment-Driven Routing AI not only improves operational efficiency but also empowers financial firms to deliver more personalized and empathetic customer service. As the volume of customer interactions continues to grow, the integration of sentiment-driven routing capabilities is becoming increasingly vital for maintaining a competitive edge and fostering cu
https://crawlfeeds.com/privacy_policy
CNBC Economy Articles Dataset is an invaluable collection of data extracted from CNBC’s economy section, offering deep insights into global and U.S. economic trends, market dynamics, financial policies, and industry developments.
This dataset encompasses a diverse array of economic articles on critical topics like GDP growth, inflation rates, employment statistics, central bank policies, and major global events influencing the market. Designed for researchers, analysts, and businesses, it serves as an essential resource for understanding economic patterns, conducting sentiment analysis, and developing financial forecasting models.
Each record in the dataset is meticulously structured and includes:
This rich combination of fields ensures seamless integration into data science projects, research papers, and market analyses.