MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
The Twitter Financial News dataset is an English-language dataset containing an annotated corpus of finance-related tweets. This dataset is used to classify finance-related tweets for their sentiment.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please cite this dataset as:
Nicolas Turenne, Ziwei Chen, Guitao Fan, Jianlong Li, Yiwen Li, Siyuan Wang, Jiaqi Zhou (2021) Mining an English-Chinese parallel Corpus of Financial News, BNU HKBU UIC, technical report
The dataset comes from the Financial Times news website (https://www.ft.com/).
The news articles are written in both Chinese and English.
FTIE.zip contains each document as an individual file.
FT-en-zh.rar contains all documents in one file.
Below are the field layout and a sample document from the dataset:
id;time;english_title;chinese_title;integer;english_body;chinese_body
1021892;2008-09-10T00:00:00Z;FLAW IN TWIN TOWERS REVEALED;科学家发现纽约双子塔倒塌的根本原因;1;Scientists have discovered the fundamental reason the Twin Towers collapsed on September 11 2001. The steel used in the buildings softened fatally at 500°C – far below its melting point – as a result of a magnetic change in the metal. @ The finding, announced at the BA Festival of Science in Liverpool yesterday, should lead to a new generation of steels capable of retaining strength at much higher temperatures.;科学家发现了纽约世贸双子大厦(Twin Towers)在2001年9月11日倒塌的根本原因。由于磁性变化,大厦使用的钢在500摄氏度——远远低于其熔点——时变软,从而产生致命后果。 @ 这一发现在昨日利物浦举行的BA科学节(BA Festival of Science)上公布。这应会推动能够在更高温度下保持强度的新一代钢铁的问世。
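As a minimal sketch of reading this record layout, the following Python function splits one line into the seven fields listed above. It assumes that no field contains a literal semicolon and that ` @ ` marks a paragraph break inside the bodies, as the sample suggests. The names `FIELDS` and `parse_record` are illustrative and not part of the dataset, and the sample bodies are shortened placeholders.

```python
FIELDS = ["id", "time", "english_title", "chinese_title",
          "integer", "english_body", "chinese_body"]

def parse_record(line):
    """Split one semicolon-delimited line into a dict keyed by field name."""
    parts = line.rstrip("\n").split(";")
    if len(parts) != len(FIELDS):
        # Assumption: fields never contain ';', so any other count is malformed.
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    # Assumption: " @ " separates paragraphs inside each body.
    for key in ("english_body", "chinese_body"):
        record[key + "_paragraphs"] = [
            p.strip() for p in record[key].split(" @ ") if p.strip()
        ]
    return record

# Placeholder bodies, shortened for illustration.
sample = ("1021892;2008-09-10T00:00:00Z;FLAW IN TWIN TOWERS REVEALED;"
          "科学家发现纽约双子塔倒塌的根本原因;1;"
          "First paragraph. @ Second paragraph.;第一段。 @ 第二段。")
record = parse_record(sample)
print(record["english_title"])
print(record["english_body_paragraphs"])
```

If article bodies turn out to contain semicolons, the field-count check will raise, which signals that a more careful parser is needed for those lines.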
The dataset contains 60,473 bilingual documents.
The time range is from 2007 to 2020.
This dataset has been used for parallel bilingual news mining in the finance domain.
https://brightdata.com/license
Stay informed with our comprehensive Financial News Dataset, designed for investors, analysts, and businesses to track market trends, monitor financial events, and make data-driven decisions.
Dataset Features
- Financial News Articles: Access structured financial news data, including headlines, summaries, full articles, publication dates, and source details.
- Market & Economic Indicators: Track financial reports, stock market updates, economic forecasts, and corporate earnings announcements.
- Sentiment & Trend Analysis: Analyze news sentiment, categorize articles by financial topics, and monitor emerging trends in global markets.
- Historical & Real-Time Data: Retrieve historical financial news archives or access continuously updated feeds for real-time insights.
Customizable Subsets for Specific Needs
Our Financial News Dataset is fully customizable, allowing you to filter data based on publication date, region, financial topics, sentiment, or specific news sources. Whether you need broad coverage for market research or focused data for investment analysis, we tailor the dataset to your needs.
Popular Use Cases
- Investment Strategy & Risk Management: Monitor financial news to assess market risks, identify investment opportunities, and optimize trading strategies.
- Market & Competitive Intelligence: Track industry trends, competitor financial performance, and economic developments.
- AI & Machine Learning Training: Use structured financial news data to train AI models for sentiment analysis, stock prediction, and automated trading.
- Regulatory & Compliance Monitoring: Stay updated on financial regulations, policy changes, and corporate governance news.
- Economic Research & Forecasting: Analyze financial news trends to predict economic shifts and market movements.
Whether you're tracking stock market trends, analyzing financial sentiment, or training AI models, our Financial News Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
FNSPID: A Comprehensive Financial News Dataset in Time Series
Description
FNSPID is a meticulously curated dataset designed to support research and applications in the field of financial news analysis within the context of time-series forecasting. Our dataset encompasses a wide range of financial news articles, providing a rich resource for developing and testing models aimed at understanding market trends, investor sentiment, and other critical financial… See the full description on the dataset page: https://huggingface.co/datasets/Zihan1004/FNSPID.
The table Headlines is part of the dataset Daily Financial News, available at https://redivis.com/datasets/97xh-655sbm328. It contains 1845559 rows across 5 variables.
Enhancing Financial Market Predictions: Causality-Driven Feature Selection
This paper introduces the FinSen dataset, which integrates economic and financial news articles from 197 countries with stock market data. The dataset spans 15 years from 2007 to 2023 with temporal information and offers a global perspective through 160,000 records of financial market news. Our study leverages causally validated sentiment scores and LSTM models to enhance market forecast accuracy and reliability.
Our FinSen Dataset
This repository contains the dataset for Enhancing Financial Market Predictions: Causality-Driven Feature Selection, which has been accepted in ADMA 2024.
If the dataset or the paper has been useful in your research, please add a citation to our work:
@article{liang2024enhancing,
  title={Enhancing Financial Market Predictions: Causality-Driven Feature Selection},
  author={Liang, Wenhao and Li, Zhengyang and Chen, Weitong},
  journal={arXiv e-prints},
  pages={arXiv--2408},
  year={2024}
}
The FinSen dataset can be downloaded manually from the repository as a CSV file. The sentiment labels and scores are generated by the FinBERT model from the Hugging Face Transformers library under the identifier "ProsusAI/finbert". (Araci, Dogu. "Finbert: Financial sentiment analysis with pre-trained language models." arXiv preprint arXiv:1908.10063 (2019).)
We provide only the US data for research purposes; please contact w.liang@adelaide.edu.au for other countries (197 in total) if necessary.
We also provide other NLP datasets for text classification tasks here; please cite them accordingly if you use them in your research.
- 20Newsgroups. Joachims, T., et al.: A probabilistic analysis of the rocchio algorithm with tfidf for text categorization. In: ICML. vol. 97, pp. 143–151. Citeseer (1997)
- AG News. Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text classification. Advances in neural information processing systems 28 (2015)
- Financial PhraseBank. Malo, P., Sinha, A., Korhonen, P., Wallenius, J., Takala, P.: Good debt or bad debt: Detecting semantic orientations in economic texts. Journal of the Association for Information Science and Technology 65(4), 782–796 (2014)
Dataloader for FinSen
We provide the preprocessing file finsen.py for our FinSen dataset under the dataloaders directory for more convenient usage.
Models - Text Classification
DAN-3.
Global Pooling CNN.
Models - Regression Prediction
LSTM
Using Sentiment Score from FinSen Predict Result on S&P500
Dependencies
The code is based on PyTorch and builds on the code framework of https://github.com/torrvision/focal_calibration; please cite their work if you find it useful.
Happy Research!
https://www.lseg.com/en/policies/website-disclaimer
Get access to leading financial news coverage including exclusive access to Reuters news as well as 10,500 additional news sources and feeds.
CJCJ3030/twitter-financial-news-sentiment dataset hosted on Hugging Face and contributed by the HF Datasets community
Because few sentiment classification datasets are available online, I labeled more than 200 news titles (from well-known financial websites such as CNBC and the Financial Times) with 3 sentiment categories. This dataset contains relatively new information, which may help in predicting new trends (such as COVID-19). My labeling standard is based on two existing datasets, so your own judgment of some sentences may differ. If you do similar work, I hope you can share your data with us. I also look forward to a thumbs up from you!
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Daniel-ML/sentiment-analysis-for-financial-news-v2 dataset hosted on Hugging Face and contributed by the HF Datasets community
https://www.lseg.com/en/policies/website-disclaimer
Find unrivaled company, commodity and economic stories formatted for automated consumption, with LSEG Real-Time News, powered by Reuters.
Live Briefs Investor – US
Covering thousands of listed securities and events across 80 news categories, Live Briefs Investor US is specifically designed to keep individual investors and active traders on top of breaking news that is likely to affect their portfolios.
Most of the largest and most respected retail and self-directed brokerage firms in North America rely on MT Newswires to provide their clients with complete coverage of the financial markets. The Investor service includes timely and insightful commentary on equities, commodities, ETFs, economics, forex, options and fixed income assets throughout the day (6:30 am to 6:30 pm EST).
Every story is ticker-tagged and category-coded to allow for seamless platform integration.
US Equities – significant events affecting individual public companies in the US:
- After-hours and pre-market news, trading activity and technical price level indications
- Earnings estimate change alerts
- Analyst Rating Changes – the most comprehensive view and coverage of rating changes available anywhere
- ETF Power Play – daily trends in ETF trading activity
- Mini and detailed sector summaries – pre-market, mid-day, and closing
- Market Chatter – real-time coverage of trading desk rumors and breaking news
- Zero noise: Only premium, original news and event analysis. Never any fillers (press releases, non-market related news, etc.).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains news headlines relevant to key forex pairs: AUDUSD, EURCHF, EURUSD, GBPUSD, and USDJPY. The data was extracted from reputable platforms Forex Live and FXstreet over a period of 86 days, from January to May 2023. The dataset comprises 2,291 unique news headlines. Each headline includes an associated forex pair, timestamp, source, author, URL, and the corresponding article text. Data was collected using web scraping techniques executed via a custom service on a virtual machine. This service periodically retrieves the latest news for a specified forex pair (ticker) from each platform, parsing all available information. The collected data is then processed to extract details such as the article's timestamp, author, and URL. The URL is further used to retrieve the full text of each article. This data acquisition process repeats approximately every 15 minutes.
To ensure the reliability of the dataset, we manually annotated each headline for sentiment. Instead of solely focusing on the textual content, we ascertained sentiment based on the potential short-term impact of the headline on its corresponding forex pair. This method recognizes the currency market's acute sensitivity to economic news, which significantly influences many trading strategies. As such, this dataset could serve as an invaluable resource for fine-tuning sentiment analysis models in the financial realm.
We used three categories for annotation: 'positive', 'negative', and 'neutral', which correspond to bullish, bearish, and hold sentiments, respectively, for the forex pair linked to each headline. The following table provides examples of annotated headlines along with brief explanations of the assigned sentiment.
Examples of Annotated Headlines

| Forex Pair | Headline | Sentiment | Explanation |
| --- | --- | --- | --- |
| GBPUSD | Diminishing bets for a move to 12400 | Neutral | Lack of strong sentiment in either direction |
| GBPUSD | No reasons to dislike Cable in the very near term as long as the Dollar momentum remains soft | Positive | Positive sentiment towards GBPUSD (Cable) in the near term |
| GBPUSD | When are the UK jobs and how could they affect GBPUSD | Neutral | Poses a question and does not express a clear sentiment |
| JPYUSD | Appropriate to continue monetary easing to achieve 2% inflation target with wage growth | Positive | Monetary easing from Bank of Japan (BoJ) could lead to a weaker JPY in the short term due to increased money supply |
| USDJPY | Dollar rebounds despite US data. Yen gains amid lower yields | Neutral | Since both the USD and JPY are gaining, the effects on the USDJPY forex pair might offset each other |
| USDJPY | USDJPY to reach 124 by Q4 as the likelihood of a BoJ policy shift should accelerate Yen gains | Negative | USDJPY is expected to reach a lower value, with the USD losing value against the JPY |
| AUDUSD | RBA Governor Lowe’s Testimony High inflation is damaging and corrosive | Positive | Reserve Bank of Australia (RBA) expresses concerns about inflation. Typically, central banks combat high inflation with higher interest rates, which could strengthen AUD. |

Moreover, the dataset includes two columns with the predicted sentiment class and score as predicted by the FinBERT model. Specifically, the FinBERT model outputs a set of probabilities for each sentiment class (positive, negative, and neutral), representing the model's confidence in associating the input headline with each sentiment category. These probabilities are used to determine the predicted class and a sentiment score for each headline. The sentiment score is computed by subtracting the negative class probability from the positive one.
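The score computation described above can be sketched in a few lines of Python. This is an illustration rather than the authors' code: the function name `finbert_summary` and the probability values are hypothetical, and taking the highest-probability class as the predicted class is an assumption, since the description only says the probabilities "are used to determine" it.

```python
def finbert_summary(probs):
    """Return (predicted_class, sentiment_score) from class probabilities.

    `probs` maps 'positive', 'negative' and 'neutral' to probabilities.
    """
    # Assumption: the predicted class is the one with the highest probability.
    predicted = max(probs, key=probs.get)
    # As described in the text: positive probability minus negative probability.
    score = probs["positive"] - probs["negative"]
    return predicted, score

# Hypothetical probabilities, not taken from the dataset.
example = {"positive": 0.70, "negative": 0.10, "neutral": 0.20}
label, score = finbert_summary(example)
print(label, round(score, 2))
```

Under this definition the score lies between -1 and 1, and a headline that is confidently neutral scores close to 0.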
This dataset provides comprehensive access to financial market data from Google Finance in real-time. Get detailed information on stocks, market quotes, trends, ETFs, international exchanges, forex, crypto, and related news. Perfect for financial applications, trading platforms, and market analysis tools. The dataset is delivered in a JSON format via REST API.
https://www.lseg.com/en/policies/website-disclaimer
Read the biggest business and political stories from around the world with Reuters Top News, providing a customized experience in an easy-to-use format.
Dataset Card for Dataset Name
The FinancialNewsSentiment_26000 dataset comprises 26,000 rows of financial news articles related to the Indian market. It features four columns: URL, Content (scraped content), Summary (generated using the T5-base model), and Sentiment Analysis (gathered using the GPT add-on for Google Sheets). The dataset is designed for sentiment analysis tasks, providing a comprehensive view of sentiments expressed in financial news.
Dataset… See the full description on the dataset page: https://huggingface.co/datasets/kdave/Indian_Financial_News.
Model Card for Sentiment Analysis on Financial News
Overview
This dataset contains sentiments for financial news headlines from the perspective of a retail investor. The data is derived from the research by Malo et al. (2014), which focuses on detecting semantic orientations in economic texts.
Dataset Details
Source: Malo, P., Sinha, A., Takala, P., Korhonen, P., and Wallenius, J. (2014). “Good debt or bad debt: Detecting semantic orientations in economic… See the full description on the dataset page: https://huggingface.co/datasets/mltrev23/financial-sentiment-analysis.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw News JSON files
https://www.lseg.com/en/policies/website-disclaimer
Get access to leading financial market news coverage including exclusive access to Reuters news as well as 10,500 additional news sources and feeds.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains over 19,000 rows of financial headlines from 2008 to 2024, paired with daily closing prices of the S&P 500 index.
Columns:
- date: Trading date (YYYY-MM-DD)
- headline: Financial news headline for the day
- close: S&P 500 closing price on that date
You can use this dataset to:
- Perform sentiment analysis on news vs. market behavior
- Correlate sentiment score with price movement
- Build predictive models or NLP-based trading strategies
Combine this with financial sentiment lexicons for more accuracy.
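One way to pair sentiment with price movement, given the three columns above, is sketched here in Python. Everything beyond the column names is an assumption made for illustration: the tiny `LEXICON`, the function names, and the sample rows are made up, and a real analysis would use a proper financial sentiment lexicon or model.

```python
from collections import defaultdict

# Toy lexicon for illustration only; a real financial lexicon is far larger.
LEXICON = {"surge": 1, "gain": 1, "rally": 1, "fall": -1, "drop": -1, "slump": -1}

def headline_score(headline):
    """Sum lexicon weights over the lower-cased words of a headline."""
    words = headline.lower().split()
    return sum(LEXICON.get(w.strip(".,!?"), 0) for w in words)

def daily_sentiment_and_return(rows):
    """Pair each date's mean headline score with its close-to-close return.

    `rows` is an iterable of (date, headline, close) tuples, with dates as
    YYYY-MM-DD strings so that string order matches chronological order.
    """
    scores = defaultdict(list)
    closes = {}
    for date, headline, close in rows:
        scores[date].append(headline_score(headline))
        closes[date] = float(close)
    dates = sorted(closes)
    out = []
    # The first date has no previous close, so it yields no return.
    for prev, cur in zip(dates, dates[1:]):
        ret = closes[cur] / closes[prev] - 1.0
        mean_score = sum(scores[cur]) / len(scores[cur])
        out.append((cur, mean_score, ret))
    return out

# Made-up rows for demonstration; values are not from the dataset.
rows = [
    ("2020-01-02", "Stocks rally on trade hopes", 100.0),
    ("2020-01-03", "Markets drop as tensions rise", 98.0),
    ("2020-01-06", "Shares gain after data", 99.0),
]
for date, score, ret in daily_sentiment_and_return(rows):
    print(date, score, round(ret, 4))
```

The resulting (score, return) pairs can then be fed into a correlation measure or used as features for a predictive model.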
If you find this dataset useful, an upvote would mean a lot — it helps others discover it too!