Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains news headlines relevant to key forex pairs: AUDUSD, EURCHF, EURUSD, GBPUSD, and USDJPY. The data was extracted from reputable platforms Forex Live and FXstreet over a period of 86 days, from January to May 2023. The dataset comprises 2,291 unique news headlines. Each headline includes an associated forex pair, timestamp, source, author, URL, and the corresponding article text. Data was collected using web scraping techniques executed via a custom service on a virtual machine. This service periodically retrieves the latest news for a specified forex pair (ticker) from each platform, parsing all available information. The collected data is then processed to extract details such as the article's timestamp, author, and URL. The URL is further used to retrieve the full text of each article. This data acquisition process repeats approximately every 15 minutes.
To ensure the reliability of the dataset, we manually annotated each headline for sentiment. Instead of solely focusing on the textual content, we ascertained sentiment based on the potential short-term impact of the headline on its corresponding forex pair. This method recognizes the currency market's acute sensitivity to economic news, which significantly influences many trading strategies. As such, this dataset could serve as an invaluable resource for fine-tuning sentiment analysis models in the financial realm.
We used three categories for annotation: 'positive', 'negative', and 'neutral', which correspond to bullish, bearish, and hold sentiments, respectively, for the forex pair linked to each headline. The following Table provides examples of annotated headlines along with brief explanations of the assigned sentiment.
Examples of Annotated Headlines
Forex Pair
Headline
Sentiment
Explanation
GBPUSD
Diminishing bets for a move to 12400
Neutral
Lack of strong sentiment in either direction
GBPUSD
No reasons to dislike Cable in the very near term as long as the Dollar momentum remains soft
Positive
Positive sentiment towards GBPUSD (Cable) in the near term
GBPUSD
When are the UK jobs and how could they affect GBPUSD
Neutral
Poses a question and does not express a clear sentiment
JPYUSD
Appropriate to continue monetary easing to achieve 2% inflation target with wage growth
Positive
Monetary easing from Bank of Japan (BoJ) could lead to a weaker JPY in the short term due to increased money supply
USDJPY
Dollar rebounds despite US data. Yen gains amid lower yields
Neutral
Since both the USD and JPY are gaining, the effects on the USDJPY forex pair might offset each other
USDJPY
USDJPY to reach 124 by Q4 as the likelihood of a BoJ policy shift should accelerate Yen gains
Negative
USDJPY is expected to reach a lower value, with the USD losing value against the JPY
AUDUSD
RBA Governor Lowe’s Testimony High inflation is damaging and corrosive
Positive
Reserve Bank of Australia (RBA) expresses concerns about inflation. Typically, central banks combat high inflation with higher interest rates, which could strengthen AUD.
Moreover, the dataset includes two columns with the predicted sentiment class and score as predicted by the FinBERT model. Specifically, the FinBERT model outputs a set of probabilities for each sentiment class (positive, negative, and neutral), representing the model's confidence in associating the input headline with each sentiment category. These probabilities are used to determine the predicted class and a sentiment score for each headline. The sentiment score is computed by subtracting the negative class probability from the positive one.
Real-time price data collected by the Boston Market News Reporter. The NOAA Fisheries' "Fishery Market News" began operations in New York City on February 14, 1938. The primary function of this joint Federal/industry program is to provide accurate and unbiased reports depicting current conditions affecting the trade in fish and fishery products. The Boston and New York Market News Reports are now hosted by the Northeast Fisheries Science Center. Please navigate to the URL below for 2014 and newer data: https://www.nefsc.noaa.gov/read/socialsci/marketNews.php
https://brightdata.com/licensehttps://brightdata.com/license
Stay informed with our comprehensive Financial News Dataset, designed for investors, analysts, and businesses to track market trends, monitor financial events, and make data-driven decisions.
Dataset Features
Financial News Articles: Access structured financial news data, including headlines, summaries, full articles, publication dates, and source details. Market & Economic Indicators: Track financial reports, stock market updates, economic forecasts, and corporate earnings announcements. Sentiment & Trend Analysis: Analyze news sentiment, categorize articles by financial topics, and monitor emerging trends in global markets. Historical & Real-Time Data: Retrieve historical financial news archives or access continuously updated feeds for real-time insights.
Customizable Subsets for Specific Needs Our Financial News Dataset is fully customizable, allowing you to filter data based on publication date, region, financial topics, sentiment, or specific news sources. Whether you need broad coverage for market research or focused data for investment analysis, we tailor the dataset to your needs.
Popular Use Cases
Investment Strategy & Risk Management: Monitor financial news to assess market risks, identify investment opportunities, and optimize trading strategies. Market & Competitive Intelligence: Track industry trends, competitor financial performance, and economic developments. AI & Machine Learning Training: Use structured financial news data to train AI models for sentiment analysis, stock prediction, and automated trading. Regulatory & Compliance Monitoring: Stay updated on financial regulations, policy changes, and corporate governance news. Economic Research & Forecasting: Analyze financial news trends to predict economic shifts and market movements.
Whether you're tracking stock market trends, analyzing financial sentiment, or training AI models, our Financial News Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
https://brightdata.com/licensehttps://brightdata.com/license
Stay ahead with our comprehensive News Dataset, designed for businesses, analysts, and researchers to track global events, monitor media trends, and extract valuable insights from news sources worldwide.
Dataset Features
News Articles: Access structured news data, including headlines, summaries, full articles, publication dates, and source details. Ideal for media monitoring and sentiment analysis. Publisher & Source Information: Extract details about news publishers, including domain, region, and credibility indicators. Sentiment & Topic Classification: Analyze news sentiment, categorize articles by topic, and track emerging trends in real time. Historical & Real-Time Data: Retrieve historical archives or access continuously updated news feeds for up-to-date insights.
Customizable Subsets for Specific Needs Our News Dataset is fully customizable, allowing you to filter data based on publication date, region, topic, sentiment, or specific news sources. Whether you need broad coverage for trend analysis or focused data for competitive intelligence, we tailor the dataset to your needs.
Popular Use Cases
Media Monitoring & Reputation Management: Track brand mentions, analyze media coverage, and assess public sentiment. Market & Competitive Intelligence: Monitor industry trends, competitor activity, and emerging market opportunities. AI & Machine Learning Training: Use structured news data to train AI models for sentiment analysis, topic classification, and predictive analytics. Financial & Investment Research: Analyze news impact on stock markets, commodities, and economic indicators. Policy & Risk Analysis: Track regulatory changes, geopolitical events, and crisis developments in real time.
Whether you're analyzing market trends, monitoring brand reputation, or training AI models, our News Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The main stock market index of United States, the US500, fell to 6583 points on September 12, 2025, losing 0.06% from the previous session. Over the past month, the index has climbed 1.80% and is up 17.01% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index from United States. United States Stock Market Index - values, historical data, forecasts and news - updated on September of 2025.
This dataset provides comprehensive access to financial market data from Google Finance in real-time. Get detailed information on stocks, market quotes, trends, ETFs, international exchanges, forex, crypto, and related news. Perfect for financial applications, trading platforms, and market analysis tools. The dataset is delivered in a JSON format via REST API.
https://crawlfeeds.com/privacy_policyhttps://crawlfeeds.com/privacy_policy
Explore the "Largest News Articles Dataset from CNBC," a comprehensive collection of news articles published by CNBC, one of the leading global news sources for business, finance, and current affairs.
This dataset includes thousands of articles covering a wide range of topics, such as financial markets, economic trends, technology, politics, health, and more. Each article in the dataset provides detailed information, including headlines, publication dates, authors, article content, and categories, offering valuable insights for researchers, data analysts, and media professionals.
Key Features:
Whether you're conducting research on financial markets, analyzing media trends, or developing new content, the "Largest News Articles Dataset from CNBC" is an invaluable resource that provides detailed insights and comprehensive coverage of the latest news.
https://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/
Multi-Source Financial & General News
🚀 47 MILLION ROWS OF NEWS CONTENT — one unified corpus for market-aware AI/ML
I combined 15 public news datasets (many small on their own) into one consistent, ready-to-use layer so you don’t have to wrangle them yourself. Everything is normalized to a minimal schema (date, text, extra_fields) and shipped as Parquet shards per subset—streamable, DuckDB-friendly, and built with a trading date policy (this can be edited if folks see other use… See the full description on the dataset page: https://huggingface.co/datasets/Brianferrell787/financial-news-multisource.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Actually, I prepare this dataset for students on my Deep Learning and NLP course.
But I am also very happy to see kagglers play around with it.
Have fun!
Description:
There are two channels of data provided in this dataset:
News data: I crawled historical news headlines from Reddit WorldNews Channel (/r/worldnews). They are ranked by reddit users' votes, and only the top 25 headlines are considered for a single date. (Range: 2008-06-08 to 2016-07-01)
Stock data: Dow Jones Industrial Average (DJIA) is used to "prove the concept". (Range: 2008-08-08 to 2016-07-01)
I provided three data files in .csv format:
RedditNews.csv: two columns The first column is the "date", and second column is the "news headlines". All news are ranked from top to bottom based on how hot they are. Hence, there are 25 lines for each date.
DJIA_table.csv: Downloaded directly from Yahoo Finance: check out the web page for more info.
Combined_News_DJIA.csv: To make things easier for my students, I provide this combined dataset with 27 columns. The first column is "Date", the second is "Label", and the following ones are news headlines ranging from "Top1" to "Top25".
=========================================
To my students:
I made this a binary classification task. Hence, there are only two labels:
"1" when DJIA Adj Close value rose or stayed as the same;
"0" when DJIA Adj Close value decreased.
For task evaluation, please use data from 2008-08-08 to 2014-12-31 as Training Set, and Test Set is then the following two years data (from 2015-01-02 to 2016-07-01). This is roughly a 80%/20% split.
And, of course, use AUC as the evaluation metric.
=========================================
+++++++++++++++++++++++++++++++++++++++++
To all kagglers:
Please upvote this dataset if you like this idea for market prediction.
If you think you coded an amazing trading algorithm,
friendly advice
do play safe with your own money :)
+++++++++++++++++++++++++++++++++++++++++
Feel free to contact me if there is any question~
And, remember me when you become a millionaire :P
Note: If you'd like to cite this dataset in your publications, please use:
Sun, J. (2016, August). Daily News for Stock Market Prediction, Version 1. Retrieved [Date You Retrieved This Data] from https://www.kaggle.com/aaron7sun/stocknews.
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Dataset Description
The Twitter Financial News dataset is an English-language dataset containing an annotated corpus of finance-related tweets. This dataset is used to classify finance-related tweets for their sentiment.
The dataset holds 11,932 documents annotated with 3 labels:
sentiments = { "LABEL_0": "Bearish", "LABEL_1": "Bullish", "LABEL_2": "Neutral" }
The data was collected using the Twitter API. The current dataset supports the multi-class classification… See the full description on the dataset page: https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset consists of financial news articles collected from HuffPost, spanning from 2012 to 2022. The data is structured in JSON format, containing headlines, article links, short descriptions, authors, categories, and publication dates. This dataset supports applications in financial NLP, time-series analysis, and sentiment-based trading strategies.
Dataset Card for Dataset Name
The FinancialNewsSentiment_26000 dataset comprises 26,000 rows of financial news articles related to the Indian market. It features four columns: URL, Content (scrapped content), Summary (generated using the T5-base model), and Sentiment Analysis (gathered using the GPT add-on for Google Sheets). The dataset is designed for sentiment analysis tasks, providing a comprehensive view of sentiments expressed in financial news.
Dataset… See the full description on the dataset page: https://huggingface.co/datasets/kdave/Indian_Financial_News.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about news. It has 3 rows and is filtered where the keywords includes General Agreement on Tariffs and Trade (Organization)-History. It features 2 columns including news link.
Techsalerator’s News Event Data in Asia offers a detailed and expansive dataset designed to provide businesses, analysts, journalists, and researchers with comprehensive insights into significant news events across the Asian continent. This dataset captures and categorizes major events reported from a diverse range of news sources, including press releases, industry news sites, blogs, and PR platforms, offering valuable perspectives on regional developments, economic shifts, political changes, and cultural occurrences.
Key Features of the Dataset: Extensive Coverage:
The dataset aggregates news events from a wide range of sources such as company press releases, industry-specific news outlets, blogs, PR sites, and traditional media. This broad coverage ensures a diverse array of information from multiple reporting channels. Categorization of Events:
News events are categorized into various types including business and economic updates, political developments, technological advancements, legal and regulatory changes, and cultural events. This categorization helps users quickly find and analyze information relevant to their interests or sectors. Real-Time Updates:
The dataset is updated regularly to include the most current events, ensuring users have access to the latest news and can stay informed about recent developments as they happen. Geographic Segmentation:
Events are tagged with their respective countries and regions within Asia. This geographic segmentation allows users to filter and analyze news events based on specific locations, facilitating targeted research and analysis. Event Details:
Each event entry includes comprehensive details such as the date of occurrence, source of the news, a description of the event, and relevant keywords. This thorough detailing helps users understand the context and significance of each event. Historical Data:
The dataset includes historical news event data, enabling users to track trends and perform comparative analysis over time. This feature supports longitudinal studies and provides insights into the evolution of news events. Advanced Search and Filter Options:
Users can search and filter news events based on criteria such as date range, event type, location, and keywords. This functionality allows for precise and efficient retrieval of relevant information. Asian Countries and Territories Covered: Central Asia: Kazakhstan Kyrgyzstan Tajikistan Turkmenistan Uzbekistan East Asia: China Hong Kong (Special Administrative Region of China) Japan Mongolia North Korea South Korea Taiwan South Asia: Afghanistan Bangladesh Bhutan India Maldives Nepal Pakistan Sri Lanka Southeast Asia: Brunei Cambodia East Timor (Timor-Leste) Indonesia Laos Malaysia Myanmar (Burma) Philippines Singapore Thailand Vietnam Western Asia (Middle East): Armenia Azerbaijan Bahrain Cyprus Georgia Iraq Israel Jordan Kuwait Lebanon Oman Palestine Qatar Saudi Arabia Syria Turkey (partly in Europe, but often included in Asia contextually) United Arab Emirates Yemen Benefits of the Dataset: Strategic Insights: Businesses and analysts can use the dataset to gain insights into significant regional developments, economic conditions, and political changes, aiding in strategic decision-making and market analysis. Market and Industry Trends: The dataset provides valuable information on industry-specific trends and events, helping users understand market dynamics and identify emerging opportunities. Media and PR Monitoring: Journalists and PR professionals can track relevant news across Asia, enabling them to monitor media coverage, identify emerging stories, and manage public relations efforts effectively. Academic and Research Use: Researchers can utilize the dataset for longitudinal studies, trend analysis, and academic research on various topics related to Asian news and events. Techsalerator’s News Event Data in Asia is a crucial resource for accessing and analyzing significant news events across the continent. By offering detailed, categorized, and up-to-date information, it supports effective decision-making, research, and media monitoring across diverse sectors.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about news. It has 14 rows and is filtered where the keywords includes International trade under threat : a constructive response. It features 2 columns including news link.
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Rifath F
Released under Apache 2.0
FNSPID: A Comprehensive Financial News Dataset in Time Series
Description
FNSPID is a meticulously curated dataset designed to support research and applications in the field of financial news analysis within the context of time-series forecasting. Our dataset encompasses a wide range of financial news articles, providing a rich resource for developing and testing models aimed at understanding market trends, investor sentiment, and other critical financial… See the full description on the dataset page: https://huggingface.co/datasets/Zihan1004/FNSPID.
The primary function of the Fruit and Vegetable Market News Division of the Fruit and Vegetable Programs is to provide an exchange of information for growers, shippers, wholesalers, researchers and others on supplies, demand and prices of fresh fruit and vegetables and speciality crops.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The United States recorded a trade deficit of 78.31 USD Billion in July of 2025. This dataset provides the latest reported value for - United States Balance of Trade - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Techsalerator’s News Event Data in Latin America offers a detailed and extensive dataset designed to provide businesses, analysts, journalists, and researchers with an in-depth view of significant news events across the Latin American region. This dataset captures and categorizes key events reported from a wide array of news sources, including press releases, industry news sites, blogs, and PR platforms, offering valuable insights into regional developments, economic changes, political shifts, and cultural events.
Key Features of the Dataset: Comprehensive Coverage:
The dataset aggregates news events from numerous sources such as company press releases, industry news outlets, blogs, PR sites, and traditional news media. This broad coverage ensures a wide range of information from multiple reporting channels. Categorization of Events:
News events are categorized into various types including business and economic updates, political developments, technological advancements, legal and regulatory changes, and cultural events. This categorization helps users quickly locate and analyze information relevant to their interests or sectors. Real-Time Updates:
The dataset is updated regularly to include the most recent events, ensuring users have access to the latest news and can stay informed about current developments. Geographic Segmentation:
Events are tagged with their respective countries and regions within Latin America. This geographic segmentation allows users to filter and analyze news events based on specific locations, facilitating targeted research and analysis. Event Details:
Each event entry includes comprehensive details such as the date of occurrence, source of the news, a description of the event, and relevant keywords. This thorough detailing helps in understanding the context and significance of each event. Historical Data:
The dataset includes historical news event data, enabling users to track trends and perform comparative analysis over time. This feature supports longitudinal studies and provides insights into how news events evolve. Advanced Search and Filter Options:
Users can search and filter news events based on criteria such as date range, event type, location, and keywords. This functionality allows for precise and efficient retrieval of relevant information. Latin American Countries Covered: South America: Argentina Bolivia Brazil Chile Colombia Ecuador Guyana Paraguay Peru Suriname Uruguay Venezuela Central America: Belize Costa Rica El Salvador Guatemala Honduras Nicaragua Panama Caribbean: Cuba Dominican Republic Haiti (Note: Primarily French-speaking but included due to geographic and cultural ties) Jamaica Trinidad and Tobago Benefits of the Dataset: Strategic Insights: Businesses and analysts can use the dataset to gain insights into significant regional developments, economic conditions, and political changes, aiding in strategic decision-making and market analysis. Market and Industry Trends: The dataset provides valuable information on industry-specific trends and events, helping users understand market dynamics and emerging opportunities. Media and PR Monitoring: Journalists and PR professionals can track relevant news across Latin America, enabling them to monitor media coverage, identify emerging stories, and manage public relations efforts effectively. Academic and Research Use: Researchers can utilize the dataset for longitudinal studies, trend analysis, and academic research on various topics related to Latin American news and events. Techsalerator’s News Event Data in Latin America is a crucial resource for accessing and analyzing significant news events across the region. By providing detailed, categorized, and up-to-date information, it supports effective decision-making, research, and media monitoring across diverse sectors.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains news headlines relevant to key forex pairs: AUDUSD, EURCHF, EURUSD, GBPUSD, and USDJPY. The data was extracted from reputable platforms Forex Live and FXstreet over a period of 86 days, from January to May 2023. The dataset comprises 2,291 unique news headlines. Each headline includes an associated forex pair, timestamp, source, author, URL, and the corresponding article text. Data was collected using web scraping techniques executed via a custom service on a virtual machine. This service periodically retrieves the latest news for a specified forex pair (ticker) from each platform, parsing all available information. The collected data is then processed to extract details such as the article's timestamp, author, and URL. The URL is further used to retrieve the full text of each article. This data acquisition process repeats approximately every 15 minutes.
To ensure the reliability of the dataset, we manually annotated each headline for sentiment. Instead of solely focusing on the textual content, we ascertained sentiment based on the potential short-term impact of the headline on its corresponding forex pair. This method recognizes the currency market's acute sensitivity to economic news, which significantly influences many trading strategies. As such, this dataset could serve as an invaluable resource for fine-tuning sentiment analysis models in the financial realm.
We used three categories for annotation: 'positive', 'negative', and 'neutral', which correspond to bullish, bearish, and hold sentiments, respectively, for the forex pair linked to each headline. The following Table provides examples of annotated headlines along with brief explanations of the assigned sentiment.
Examples of Annotated Headlines
Forex Pair
Headline
Sentiment
Explanation
GBPUSD
Diminishing bets for a move to 12400
Neutral
Lack of strong sentiment in either direction
GBPUSD
No reasons to dislike Cable in the very near term as long as the Dollar momentum remains soft
Positive
Positive sentiment towards GBPUSD (Cable) in the near term
GBPUSD
When are the UK jobs and how could they affect GBPUSD
Neutral
Poses a question and does not express a clear sentiment
JPYUSD
Appropriate to continue monetary easing to achieve 2% inflation target with wage growth
Positive
Monetary easing from Bank of Japan (BoJ) could lead to a weaker JPY in the short term due to increased money supply
USDJPY
Dollar rebounds despite US data. Yen gains amid lower yields
Neutral
Since both the USD and JPY are gaining, the effects on the USDJPY forex pair might offset each other
USDJPY
USDJPY to reach 124 by Q4 as the likelihood of a BoJ policy shift should accelerate Yen gains
Negative
USDJPY is expected to reach a lower value, with the USD losing value against the JPY
AUDUSD
RBA Governor Lowe’s Testimony High inflation is damaging and corrosive
Positive
Reserve Bank of Australia (RBA) expresses concerns about inflation. Typically, central banks combat high inflation with higher interest rates, which could strengthen AUD.
Moreover, the dataset includes two columns with the predicted sentiment class and score as predicted by the FinBERT model. Specifically, the FinBERT model outputs a set of probabilities for each sentiment class (positive, negative, and neutral), representing the model's confidence in associating the input headline with each sentiment category. These probabilities are used to determine the predicted class and a sentiment score for each headline. The sentiment score is computed by subtracting the negative class probability from the positive one.