68 datasets found
  1. News Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Cite
    Bright Data, News Datasets [Dataset]. https://brightdata.com/products/datasets/news
    Available download formats: .json, .csv, .xlsx
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Stay ahead with our comprehensive News Dataset, designed for businesses, analysts, and researchers to track global events, monitor media trends, and extract valuable insights from news sources worldwide.

    Dataset Features

    • News Articles: Access structured news data, including headlines, summaries, full articles, publication dates, and source details. Ideal for media monitoring and sentiment analysis.
    • Publisher & Source Information: Extract details about news publishers, including domain, region, and credibility indicators.
    • Sentiment & Topic Classification: Analyze news sentiment, categorize articles by topic, and track emerging trends in real time.
    • Historical & Real-Time Data: Retrieve historical archives or access continuously updated news feeds for up-to-date insights.

    Customizable Subsets for Specific Needs

    Our News Dataset is fully customizable, allowing you to filter data based on publication date, region, topic, sentiment, or specific news sources. Whether you need broad coverage for trend analysis or focused data for competitive intelligence, we tailor the dataset to your needs.

    Popular Use Cases

    • Media Monitoring & Reputation Management: Track brand mentions, analyze media coverage, and assess public sentiment.
    • Market & Competitive Intelligence: Monitor industry trends, competitor activity, and emerging market opportunities.
    • AI & Machine Learning Training: Use structured news data to train AI models for sentiment analysis, topic classification, and predictive analytics.
    • Financial & Investment Research: Analyze news impact on stock markets, commodities, and economic indicators.
    • Policy & Risk Analysis: Track regulatory changes, geopolitical events, and crisis developments in real time.

    Whether you're analyzing market trends, monitoring brand reputation, or training AI models, our News Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.

  2. tech-company-news-data-dump

    • huggingface.co
    Updated Jan 16, 2024
    Cite
    HackerNoon (2024). tech-company-news-data-dump [Dataset]. https://huggingface.co/datasets/HackerNoon/tech-company-news-data-dump
    Dataset updated
    Jan 16, 2024
    Dataset authored and provided by
    HackerNoon (https://hackernoon.com/)
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    HackerNoon curated the internet's most cited 7M+ tech company news articles and blog posts about the 3k+ most valuable tech companies in 2022 and 2023. These stories were curated to power HackerNoon.com/Companies, where we update daily news on top technology companies like Microsoft, Google, and HuggingFace. Please use this news data freely for your project, and as always anyone is welcome to publish on HackerNoon.

  3. Financial Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Dec 5, 2023
    Cite
    Bright Data (2023). Financial Datasets [Dataset]. https://brightdata.com/products/datasets/news/financial
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Dec 5, 2023
    Dataset authored and provided by
    Bright Data
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Stay informed with our comprehensive Financial News Dataset, designed for investors, analysts, and businesses to track market trends, monitor financial events, and make data-driven decisions.

    Dataset Features

    • Financial News Articles: Access structured financial news data, including headlines, summaries, full articles, publication dates, and source details.
    • Market & Economic Indicators: Track financial reports, stock market updates, economic forecasts, and corporate earnings announcements.
    • Sentiment & Trend Analysis: Analyze news sentiment, categorize articles by financial topics, and monitor emerging trends in global markets.
    • Historical & Real-Time Data: Retrieve historical financial news archives or access continuously updated feeds for real-time insights.

    Customizable Subsets for Specific Needs

    Our Financial News Dataset is fully customizable, allowing you to filter data based on publication date, region, financial topics, sentiment, or specific news sources. Whether you need broad coverage for market research or focused data for investment analysis, we tailor the dataset to your needs.

    Popular Use Cases

    • Investment Strategy & Risk Management: Monitor financial news to assess market risks, identify investment opportunities, and optimize trading strategies.
    • Market & Competitive Intelligence: Track industry trends, competitor financial performance, and economic developments.
    • AI & Machine Learning Training: Use structured financial news data to train AI models for sentiment analysis, stock prediction, and automated trading.
    • Regulatory & Compliance Monitoring: Stay updated on financial regulations, policy changes, and corporate governance news.
    • Economic Research & Forecasting: Analyze financial news trends to predict economic shifts and market movements.

    Whether you're tracking stock market trends, analyzing financial sentiment, or training AI models, our Financial News Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.

  4. Techcrunch news dataset

    • crawlfeeds.com
    csv, zip
    Updated May 16, 2025
    Cite
    Crawl Feeds (2025). Techcrunch news dataset [Dataset]. https://crawlfeeds.com/datasets/techcrunch-news-dataset
    Available download formats: csv, zip
    Dataset updated
    May 16, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Get access to a structured dataset of articles from TechCrunch, a top source for startup, technology, and business news. This dataset includes thousands of articles covering topics like venture funding, product launches, AI, crypto, and more.

    Perfect for use in:

    • News aggregation and monitoring

    • Sentiment or trend analysis

    • NLP model training

    • Startup or tech sector research

    The data is available in CSV and JSON formats and can be customized by date or topic on request.

    👉 Contact us for full access or a filtered sample.

  5. Office for National Statistics - Number of Businesses by Detailed Industry,...

    • gimi9.com
    Cite
    Office for National Statistics - Number of Businesses by Detailed Industry, 2, 3 and 4 Digit SIC [Dataset]. https://gimi9.com/dataset/london_number-businesses-detailed-industry-2-3-and-4-digit-sic/
    Description

    Estimates of total businesses broken down by industry (2, 3, 4 digit SIC 2007 codes and industry section). Workplace data units from Annual Business Inquiry (ABI) for London and Great Britain. Data rounded to the nearest 100. Percentages calculated on unrounded data.

    An extract compiled from the Inter Departmental Business Register (IDBR) recording the number of local units that were live at a reference date in March. Estimates can be broken down by employment size band, detailed industry (5 digit SIC2007) and legal status. Available from country down to mid layer super output area and Scottish intermediate zones. A local unit is an individual site (for example a factory or shop) associated with an enterprise. It can also be referred to as a workplace.

    Industry is broken down using SIC 2007 codes. Read more about SIC here: http://www.statistics.gov.uk/methods_quality/sic/downloads/SIC2007explanatorynotes.pdf

    The ABI is a business survey which collects both employment and financial information. Only employment information for the location of an employee's workplace is available from Nomis. The ABI is based on a sample of approximately 78,000 businesses and is used to provide an estimate of the number of employees.

    The difference between the estimate and its true value is known as the sampling error. The actual sampling error for any estimate is unknown but we can estimate, from the sample, a typical error, known as the standard error. This provides a means of assessing the precision of the estimate; the lower the standard error, the more confident we can be the estimate is close to the true value. https://www.nomisweb.co.uk/articles/showArticle.asp?title=Information&article=news/071212_abi-stderrors.htm

    This dataset excludes farm based agriculture data contained in SIC class 0100. Relevant link: https://www.nomisweb.co.uk/Default.asp

  6. GaiaLens News Data: real-time (refreshed daily), covers c.17,000 global...

    • datarade.ai
    Cite
    GaiaLens, GaiaLens News Data: real-time (refreshed daily), covers c.17,000 global publicly traded companies, tracks 50 ESG themes [Dataset]. https://datarade.ai/data-products/gaialens-news-data-real-time-refreshed-daily-covers-c-17-gaialens
    Available download formats: .json, .xml, .csv, .xls, .txt
    Dataset authored and provided by
    GaiaLens
    Area covered
    Georgia, Nigeria, Indonesia, Slovenia, Togo, Bahamas, Pakistan, Norway, Croatia, New Zealand
    Description

    We can offer the news data in two formats: 1) News flow: all news flow for our company coverage including articles and tweets. 2) ESG Incidents: highlights any pressing issues that companies are facing in the news.

    1) News flow

    Our system executes around 100,000 searches per day across the internet. We search specific websites deemed to be high-quality and informationally additive for news about our whole company coverage.

    These include:

    • Mainstream publications like Reuters, CNN, CNBC, NBC News etc.
    • NGO websites such as Ethical Consumer and Anti-Slavery International
    • Investigative journalist websites like MLex
    • National papers like the Japan Times
    • Trade publications like Insurance Journal
    • Sustainability publications like Edie.net

    Each article that we download goes through rigorous processing. This includes cleaning the body of the article and adding its metadata e.g., the date that it was published.

    We then calculate our proprietary “relevance” scores. This is a calculation to determine how relevant the article is to the company, CEO, biggest Insider and biggest Outsider.

    Natural Language Processing (NLP) techniques are used to calculate the similarity and sentiment scores for each article for each news topic.

    We use Twitter’s API to download the latest tweets from Thought Leader Accounts. We track over 100 Thought Leaders such as Ceres and Science Based Targets.

    These tweets are then searched to see if any of our company coverage is mentioned.

    Afterwards, the same processing and calculation steps are followed as for the news articles.

    2) ESG Incidents

    ESG Incidents is the second news feed that we display for users. It is designed to show any pressing issues that a company is facing in the news in real-time.

    To get ESG Incidents outputs we follow these steps:

    1. Choose a time period of news to look at, e.g., 3 months.
    2. For each news topic (we have around 50), pick out the article(s) that have the highest relevance to a company and the highest similarity score over that time period. We multiply these two scores together to calculate an “Incidence Score”.
    3. Calculate how many times that news topic has come up in the news over the chosen time period as a proportion of the total articles for that company.

    We are then able to see emerging trends and incidents for a particular company over a time period and also have the ability to see the most relevant articles for each news topic. This allows investors to see any emerging incidents or scandals for a company in real-time.
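
    The ESG Incidents steps above can be sketched roughly as follows. This is only an illustrative reading, not GaiaLens's implementation: the `Article` class, its field names, and the score scales are hypothetical, and the sketch assumes the "Incidence Score" for a topic is the largest relevance × similarity product among that topic's articles in the chosen period.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Article:
    # Hypothetical record for one article about a single company,
    # already restricted to the chosen time period (step 1).
    topic: str         # one of the ~50 news topics
    relevance: float   # assumed scale, e.g. 0..1
    similarity: float  # assumed scale, e.g. 0..1


def incidence_summary(articles):
    """Return {topic: (incidence_score, share_of_articles)} for one company."""
    if not articles:
        return {}
    counts = Counter(a.topic for a in articles)
    best = {}
    for a in articles:
        # Step 2: multiply relevance by similarity; keep the top product per topic.
        score = a.relevance * a.similarity
        best[a.topic] = max(best.get(a.topic, 0.0), score)
    total = len(articles)
    # Step 3: topic frequency as a proportion of all articles for the company.
    return {topic: (best[topic], counts[topic] / total) for topic in counts}
```

    Sorting the result by score or by share would then surface the topics that look like emerging incidents for that company.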

  7. Number of Businesses by Detailed Industry, 2, 3 and 4 Digit SIC

    • data.wu.ac.at
    • gimi9.com
    csv, xls
    Updated Sep 26, 2015
    Cite
    London Datastore Archive (2015). Number of Businesses by Detailed Industry, 2, 3 and 4 Digit SIC [Dataset]. https://data.wu.ac.at/schema/datahub_io/YjhkN2NlMmYtNDkwMC00ZGJiLWIyMmItZDNkZDU1Yzk5NGYy
    Available download formats: csv (91064.0), xls (287744.0)
    Dataset updated
    Sep 26, 2015
    Dataset provided by
    London Datastore Archive
    License

    http://reference.data.gov.uk/id/open-government-licence

    Description

    Estimates of total businesses broken down by industry (2, 3, 4 digit SIC 2007 codes and industry section). Workplace data units from Annual Business Inquiry (ABI) for London and Great Britain.

    Data rounded to the nearest 100. Percentages calculated on unrounded data.

    Industry is broken down using SIC 2007 codes. Read more about SIC here http://www.statistics.gov.uk/methods_quality/sic/downloads/SIC2007explanatorynotes.pdf
    The ABI is a business survey which collects both employment and financial information. Only employment information for the location of an employee's workplace is available from Nomis.
    The ABI is based on a sample of approximately 78,000 businesses and is used to provide an estimate of the number of employees.
    The difference between the estimate and its true value is known as the sampling error. The actual sampling error for any estimate is unknown but we can estimate, from the sample, a typical error, known as the standard error. This provides a means of assessing the precision of the estimate; the lower the standard error, the more confident we can be the estimate is close to the true value. https://www.nomisweb.co.uk/articles/showArticle.asp?title=Information&article=news/071212_abi-stderrors.htm
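
    As a generic illustration of the standard error idea described above, the sketch below estimates the standard error of a sample mean as s / sqrt(n). This is a textbook simple-random-sample formula used here as an assumption; it is not the ABI's actual survey methodology, which the description does not spell out, and the function name is hypothetical.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Estimate the standard error of the sample mean as s / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    # statistics.stdev uses the n - 1 (sample) denominator.
    return statistics.stdev(sample) / math.sqrt(n)
```

    A smaller value means the sample-based estimate is likely to be closer to the true value, which matches the interpretation given in the description.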

    This dataset excludes farm based agriculture data contained in SIC class 0100.

    Relevant link: https://www.nomisweb.co.uk/Default.asp

  8. Web Scraping News Data | B2B Sentiment Data | Categorized News Events | 19M...

    • datarade.ai
    .json
    Updated Jun 27, 2024
    Cite
    PredictLeads (2024). Web Scraping News Data | B2B Sentiment Data | Categorized News Events | 19M Blogs, PR Sites and News Sites | 8.3M+ Records [Dataset]. https://datarade.ai/data-products/predictleads-web-scraping-data-news-data-categorited-new-predictleads
    Available download formats: .json
    Dataset updated
    Jun 27, 2024
    Dataset authored and provided by
    PredictLeads
    Area covered
    Namibia, Svalbard and Jan Mayen, Canada, Northern Mariana Islands, Gabon, Sweden, South Africa, Vietnam, Italy, Niger
    Description

    PredictLeads News Events Data provides real-time market intelligence by capturing business-critical news events, categorizing them for sentiment analysis, company profiling, and competitive tracking. Our dataset leverages advanced web scraping and AI-driven classification, ensuring access to highly relevant insights that help businesses monitor competitors, assess risks, and refine growth strategies.

    Use Cases:

    ✅ Sentiment Analysis – Gauge public perception and market sentiment to refine brand positioning.
    ✅ Account Profiling – Enrich CRM systems with real-time company event tracking.
    ✅ Competitive Intelligence – Monitor industry news, mergers, and expansions to anticipate market shifts.
    ✅ Market Research – Analyze business website updates and categorized news data for trend forecasting.
    ✅ Risk Assessment – Detect negative sentiment or financial distress indicators in key market players.

    Key API Attributes:

    - id (string, UUID) – Unique identifier for the news event.
    - category (string) – Categorization of the event (e.g., funding, acquisition, leadership change).
    - summary (string) – A brief overview of the detected event.
    - sentiment_score (float, nullable) – Positive, neutral, or negative sentiment rating for the event.
    - found_at (ISO 8601 date-time) – Timestamp when the news event was detected.
    - article_sentence (string, nullable) – Extracted key sentence from the news article.
    - location (string, nullable) – Geographic relevance of the event (e.g., company HQ, expansion region).
    - company (object) – The company associated with the event, including:
        - domain (string) – Company’s website domain.
        - company_name (string) – Official company name.
        - ticker (string, nullable) – Stock ticker (if publicly traded).
    - source_url (string, URL) – Link to the original news article or company update.
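
    To make the attribute list above concrete, here is a minimal sketch of parsing one such record in Python. It assumes a flat JSON object whose keys match the attribute names listed, with `source_url` at the top level and `company` as a nested object. The real PredictLeads response envelope may differ, so check the linked docs; `NewsEvent` and `parse_event` are hypothetical names.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class NewsEvent:
    id: str
    category: str
    summary: str
    sentiment_score: Optional[float]   # nullable in the attribute list
    found_at: datetime
    article_sentence: Optional[str]    # nullable
    location: Optional[str]            # nullable
    company_domain: str
    company_name: str
    ticker: Optional[str]              # nullable
    source_url: str


def parse_event(raw: str) -> NewsEvent:
    """Parse one JSON record (assumed layout) into a NewsEvent."""
    data = json.loads(raw)
    company = data["company"]
    # Normalise a trailing "Z" so datetime.fromisoformat accepts it on older Pythons.
    found_at = datetime.fromisoformat(data["found_at"].replace("Z", "+00:00"))
    return NewsEvent(
        id=data["id"],
        category=data["category"],
        summary=data["summary"],
        sentiment_score=data.get("sentiment_score"),
        found_at=found_at,
        article_sentence=data.get("article_sentence"),
        location=data.get("location"),
        company_domain=company["domain"],
        company_name=company["company_name"],
        ticker=company.get("ticker"),
        source_url=data["source_url"],
    )
```

    Nullable fields are read with `.get` so that a missing or null value becomes `None` rather than raising an error.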

    📌 PredictLeads News Events Data is trusted by market leaders for real-time competitive intelligence, ensuring faster, data-driven decision-making in sales, finance, and strategic planning.

    PredictLeads News Events Dataset Docs: https://docs.predictleads.com/v3/guide/news_events_dataset

  9. Complete News Data Extracted from CNBC in JSON Format: Covering Business,...

    • crawlfeeds.com
    json, zip
    Updated Jul 6, 2025
    Cite
    Crawl Feeds (2025). Complete News Data Extracted from CNBC in JSON Format: Covering Business, Finance, Technology, and Global Trends for Europe, US, and UK Audiences [Dataset]. https://crawlfeeds.com/datasets/complete-news-data-extracted-from-cnbc-in-json-format-covering-business-finance-technology-and-global-trends-for-europe-us-and-uk-audiences
    Available download formats: zip, json
    Dataset updated
    Jul 6, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Area covered
    United States, United Kingdom
    Description

    We have successfully extracted a comprehensive news dataset from CNBC, covering not only financial updates but also an extensive range of news categories relevant to diverse audiences in Europe, the US, and the UK. This dataset includes over 500,000 records, meticulously structured in JSON format for seamless integration and analysis.

    Diverse News Segments for In-Depth Analysis

    This extensive extraction spans multiple segments, such as:

    • Business and Market Analysis: Stay updated on major companies, mergers, and acquisitions.
    • Technology and Innovation: Explore developments in AI, cybersecurity, and digital transformation.
    • Economic Forecasts: Access insights into GDP, employment rates, inflation, and other economic indicators.
    • Geopolitical Developments: Understand the impact of political events and global trade dynamics on markets.
    • Personal Finance: Learn about saving strategies, investment tips, and real estate trends.

    Each record in the dataset is enriched with metadata tags, enabling precise filtering by region, sector, topic, and publication date.

    Why Choose This Dataset?

    The comprehensive news dataset provides real-time insights into global developments, corporate strategies, leadership changes, and sector-specific trends. Designed for media analysts, research firms, and businesses, it empowers users to perform:

    • Trend Analysis
    • Sentiment Analysis
    • Predictive Modeling

    Additionally, the JSON format ensures easy integration with analytics platforms for advanced processing.

    Access More News Datasets

    Looking for a rich repository of structured news data? Visit our news dataset collection to explore additional offerings tailored to your analysis needs.

    Sample Dataset Available

    To get a preview, check out the CSV sample of the CNBC economy articles dataset.

  10. AG News Classification Dataset

    • cubig.ai
    Updated Aug 1, 2024
    Cite
    CUBIG (2024). AG News Classification Dataset [Dataset]. https://cubig.ai/store/products/35/ag-news-classification-dataset
    Dataset updated
    Aug 1, 2024
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction

    • The news topic dataset consists of news articles classified into four major categories: World, Sports, Business, and Science/Technology. It is a subset of AG's corpus of news articles, providing a structured dataset for NLP-based text classification tasks.

    2) Data Utilization

    (1) News topic data has characteristics that:
    • The dataset includes descriptions of articles.

    (2) News topic data can be used to:
    • Media Monitoring: Helps media companies and news aggregators categorize articles automatically, improving content management and recommendations.
    • Academic Research: Provides data for studies on automatic text classification, topic discovery, and machine learning model performance.

  11. Fox News dataset is for analyzing media trends and narratives

    • crawlfeeds.com
    csv, zip
    Updated May 19, 2025
    Cite
    Crawl Feeds (2025). Fox News dataset is for analyzing media trends and narratives [Dataset]. https://crawlfeeds.com/datasets/fox-news-dataset
    Available download formats: zip, csv
    Dataset updated
    May 19, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    The Fox News Dataset is a comprehensive collection of over 1 million news articles, offering an unparalleled resource for analyzing media narratives, public discourse, and political trends. Covering articles up to the year 2023, this dataset is a treasure trove for researchers, analysts, and businesses interested in gaining deeper insights into the topics and trends covered by Fox News.

    Key Features of the Fox News Dataset

    • Extensive Coverage: Contains more than 1 million articles spanning various topics and events up to 2023.
    • Research-Ready: Perfect for text classification, natural language processing (NLP), and other research purposes.
    • Format: Provided in CSV format for seamless integration into analytical and research tools.

    Why Use This Dataset?

    This large dataset is ideal for:

    • Text Classification: Develop machine learning models to classify and categorize news content.
    • Natural Language Processing (NLP): Conduct sentiment analysis, keyword extraction, or topic modeling.
    • Media and Political Research: Analyze media narratives, public opinion, and political trends reflected in Fox News articles.
    • Trend Analysis: Identify shifts in public discourse and media focus over time.

    Explore More News Datasets

    Discover additional resources for your research needs by visiting our news dataset collection. These datasets are tailored to support diverse analytical applications, including sentiment analysis and trend modeling.

    The Fox News Dataset is a must-have for anyone interested in exploring large-scale media data and leveraging it for advanced analysis. Ready to dive into this wealth of information? Download the dataset now in CSV format and start uncovering the stories behind the headlines.

  12. AI use in newsrooms worldwide 2023

    • statista.com
    Updated Nov 28, 2024
    Cite
    Statista (2024). AI use in newsrooms worldwide 2023 [Dataset]. https://www.statista.com/statistics/1119232/predictions-ai-initiatives-for-publishers/
    Dataset updated
    Nov 28, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Nov 27, 2023 - Dec 20, 2023
    Area covered
    Worldwide
    Description

    According to 56 percent of industry leaders surveyed in December 2023, back-end automation would be the most important use of artificial intelligence in newsrooms in 2024. Additionally, utilizing AI for distribution and recommendations such as personalized home pages and alerts was considered necessary for future business operations. News gathering was ranked as the least important use of AI, with only 22 percent of publishers feeling this would be important for their company in 2024.

    The ethics of AI in the newsroom

    Data from news and media organizations around the world revealed concerns about the ethical implications of AI in the newsroom. More than 80 percent of respondents said they were concerned about the ethics of AI when it came to editorial quality and the industry in general – but readers’ perceptions were less of a worry. Readers themselves, on the other hand, have priorities of their own – a UK study found that the majority of adults believed that media organizations should be required to display the ways AI was used to create a news article.

    The issue of trust

    News organizations should be mindful of how their readers feel about the use of AI in news – UK consumers are especially skeptical about the idea of an AI journalist and AI editor working on online news without human assistance. At a time when trust in human journalists is already relatively low, introducing AI into the mix could further damage public trust in the news and those reporting on it.

  13. Replication Data for: Investigating positive/negative bias in Canadian...

    • search.dataone.org
    • borealisdata.ca
    Updated Dec 28, 2023
    Cite
    Gagnon, Chantal; Boulanger, Pier-Pascale (2023). Replication Data for: Investigating positive/negative bias in Canadian newspapers through translation: A study of “confidence” in a corpus of business news [Dataset]. http://doi.org/10.5683/SP3/JBLRTS
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    Borealis
    Authors
    Gagnon, Chantal; Boulanger, Pier-Pascale
    Area covered
    Canada
    Description

    This data was used in our article Investigating positive/negative bias in Canadian newspapers through translation. To conduct our research, we used a subset of the Canadian Press Corpus in Finance (CAPCOF), composed of news items covering the 2007-2008 financial crisis and the years that led up to it (2001-2006). CAPCOF is a bilingual comparable corpus, containing texts in English and in French. The 2008 CAPCOF data subset contains 1,357,088 words in French and 1,403,907 words in English. The present Excel file was obtained using WordSmith 8.0 concordancer tool (Scott, 2020) and the 2008 CAPCOF data subset, and extracting occurrences of “confidence” and “confiance”.

  14. Data from: Uneasy Bedfellows: AI in the News, Platform Companies and the...

    • tandf.figshare.com
    docx
    Updated Jun 3, 2023
    Cite
    Felix M. Simon (2023). Uneasy Bedfellows: AI in the News, Platform Companies and the Issue of Journalistic Autonomy [Dataset]. http://doi.org/10.6084/m9.figshare.19803504.v1
    Available download formats: docx
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Felix M. Simon
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Platform companies play an important role in the production and distribution of news. This article analyses this role and questions of control, dependence and autonomy in the light of the ‘AI goldrush’ in the news. I argue that the introduction of AI in the news risks shifting even more control to and increasing the news industry’s dependence on platform companies. While platform companies’ power over news organisations has to date mainly flown from their control over the channels of distribution, AI potentially allows them to extend this control to the means of production as the technology increasingly permeates all stages of the news-making process. As a result, news organisations risk becoming even more tethered to platform companies in the long-run, potentially limiting their autonomy and, by extension, contributing to a restructuring of the public arena as news organisations are re-shaped according to the logics of platform businesses. I conclude by mapping a research agenda that highlights potential implications and spells out areas in need of further exploration.

  15. Economic Relevant News from The Guardian

    • data.mendeley.com
    • datosdeinvestigacion.conicet.gov.ar
    Updated Dec 10, 2019
    Cite
    Mariano Maisonnave (2019). Economic Relevant News from The Guardian [Dataset]. http://doi.org/10.17632/yt8j2f3hpp.2
    Dataset updated
    Dec 10, 2019
    Authors
    Mariano Maisonnave
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Description

    The news: The present dataset consists of 1789 news articles from the British daily newspaper The Guardian, extracted using the content endpoint of The Guardian Open Platform. At the time of extraction, these were all the articles in the sections business, politics, society and world news for the entire month of January 2013 (1689 articles in total), plus an extra set of 100 articles randomly selected from the period February 2013 to December 2015. The first set of 1689 news articles was used for training and the second set of 100 news articles was used for testing in two publications:

    * Maisonnave, M., Delbianco, F., Tohmé, F.A. and Maguitman, A.G., 2018, November. A Supervised Term-Weighting Method and its Application to Variable Extraction from Digital Media. In XIX Simposio Argentino de Inteligencia Artificial (ASAI)-JAIIO 47 (CABA, 2018).
    * Maisonnave, M., Delbianco, F., Tohmé, F.A. and Maguitman, A.G., 2019. A Flexible Supervised Term-Weighting Technique and its Application to Variable Extraction and Information Retrieval. Inteligencia Artificial, 22(63), pp.61-80.
    

    The labels: The entire dataset was manually classified into two categories: economically relevant and economically irrelevant. The labelling was carried out by two experts in economics working in collaboration. For each news article, the full text was analyzed to determine the category.

    The format: There are two different versions for this dataset: the reduced and the full versions. The former consists of a CSV and a readme file. The CSV file has five columns: "Instance No.", "Title", "Web Publication Date", "web URL" and "Economically Relevant". This version is reduced in columns as it does not include the full article texts; however, it does include all the 1789 instances.
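    As a rough illustration of working with the reduced version, the sketch below reads a CSV with the five columns listed above and counts the rows per label. The sample rows, the example URLs and the "1"/"0" label encoding are made up for illustration; the description does not say how the "Economically Relevant" column is encoded.

```python
import csv
import io
from collections import Counter

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "Instance No.",
    "Title",
    "Web Publication Date",
    "web URL",
    "Economically Relevant",
]


def count_labels(csv_file):
    """Count rows per value of the "Economically Relevant" column."""
    reader = csv.DictReader(csv_file)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return Counter(row["Economically Relevant"].strip() for row in reader)


# Hypothetical rows; the real label encoding may differ (assumption).
SAMPLE = (
    "Instance No.,Title,Web Publication Date,web URL,Economically Relevant\n"
    "1,Example headline A,2013-01-02T00:00:00Z,https://example.com/a,1\n"
    "2,Example headline B,2013-01-03T00:00:00Z,https://example.com/b,0\n"
    "3,Example headline C,2013-01-04T00:00:00Z,https://example.com/c,1\n"
)

sample_counts = count_labels(io.StringIO(SAMPLE))
print(sample_counts)
```

    For a real file, the same function could be called on an open file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` points to wherever the reduced CSV was saved.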

    Requesting the full dataset: To gain access to the full version of the dataset (which includes the body of the news articles), please send an email to mariano.maisonnave@cs.uns.edu.ar with a copy to openplatform@theguardian.com requesting authorization and making it clear that the data set will not be used for commercial purposes.

  16. COVID-19 INDIA

    • kaggle.com
    Updated Mar 29, 2020
    Cite
    Aditya Kyatham (2020). COVID-19 INDIA [Dataset]. https://www.kaggle.com/adityakyatham/covid19/activity
    Dataset updated
    Mar 29, 2020
    Dataset provided by
    Kaggle
    Authors
    Aditya Kyatham
    Area covered
    India
    Description

    The dataset contains some data that is officially available for COVID-19 research and some that I have added randomly by referring to various common facts and news articles, because the required data for every feature of my project is not available yet.

  17. Fake-News-Dataset

    • kaggle.com
    Updated Apr 19, 2019
    Cite
    sumanthvrao (2019). Fake-News-Dataset [Dataset]. https://www.kaggle.com/sumanthvrao/fakenewsdataset/metadata
    Dataset updated
    Apr 19, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    sumanthvrao
    Description

    Introduction

    This describes two fake news datasets covering seven different news domains. One of the datasets is collected by combining manual and crowdsourced annotation approaches (FakeNewsAMT), while the second is collected directly from the web (Celebrity).

    Data collection

    The FakeNewsDatabase dataset contains news in six different domains: technology, education, business, sports, politics, and entertainment. The legitimate news included in the dataset were collected from a variety of mainstream news websites predominantly in the US such as the ABCNews, CNN, USAToday, NewYorkTimes, FoxNews, Bloomberg, and CNET among others. The fake news included in this dataset consist of fake versions of the legitimate news in the dataset, written using Mechanical Turk. More details on the data collection are provided in section 3 of the paper.

    The Celebrity dataset contains news about celebrities (actors, singers, socialites, and politicians). The legitimate news in the dataset were obtained from the entertainment, fashion and style sections of mainstream news websites and from entertainment magazine websites. The fake news were obtained from gossip websites such as Entertainment Weekly, People Magazine, RadarOnline, and other tabloid and entertainment-oriented publications. The news articles were collected in pairs, with one article being legitimate and the other fake (rumors and false reports). The articles were manually verified using gossip-checking sites such as "GossipCop.com", and also cross-referenced with information from other entertainment news sources on the web.

    The data directory contains two fake news datasets:

    • Celebrity The fake and legitimate news are provided in two separate folders. The fake and legitimate labels are also provided as part of the filename.

    • FakeNewsAMT The fake and legitimate news are provided in two separate folders. Each folder contains 40 news from six different domains: technology, education, business, sports, politics, and entertainment. The file names indicate the news domain: business (biz), education (edu), entertainment (entmt), politics (polit), sports (sports) and technology (tech). The fake and legitimate labels are also provided as part of the filename.
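    A minimal sketch of how the file-name convention above might be decoded is shown below. The domain abbreviations come from the description; the exact file-name layout and the "fake"/"legit" label tokens are assumptions made for illustration, as are the example file names, so the matching rules should be checked against the actual files.

```python
from pathlib import PurePath

# Domain abbreviations as given in the dataset description.
DOMAIN_ABBREVIATIONS = {
    "biz": "business",
    "edu": "education",
    "entmt": "entertainment",
    "polit": "politics",
    "sports": "sports",
    "tech": "technology",
}

# Assumed label tokens; the description only says labels are part of the file name.
LABEL_TOKENS = {"fake": "fake", "legit": "legitimate"}


def parse_news_filename(filename):
    """Return (domain, label) guessed from a file name, or None where unknown."""
    stem = PurePath(filename).stem.lower()
    domain = next(
        (full for abbr, full in DOMAIN_ABBREVIATIONS.items() if abbr in stem),
        None,
    )
    label = next(
        (full for token, full in LABEL_TOKENS.items() if token in stem),
        None,
    )
    return domain, label


# Hypothetical file names, not taken from the dataset.
print(parse_news_filename("biz01.fake.txt"))
print(parse_news_filename("tech12.legit.txt"))
```

    Substring matching is deliberately loose here because the precise naming pattern is not stated; a stricter pattern would be preferable once the real file names are known.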

    Dataset citation :

    @article{Perez-Rosas18Automatic, author = {Ver\'{o}nica P\'{e}rez-Rosas and Bennett Kleinberg and Alexandra Lefevre and Rada Mihalcea}, title = {Automatic Detection of Fake News}, journal = {International Conference on Computational Linguistics (COLING)}, year = {2018} }

  18. News Intensity data in "Indirect News Coverage and Economic Policy...

    • brunel.figshare.com
    xlsx
    Updated Dec 5, 2024
    Cite
    Fang Xu; Jiaying Wu (2024). News Intensity data in "Indirect News Coverage and Economic Policy Uncertainty" [Dataset]. http://doi.org/10.17633/rd.brunel.27854760.v1
    Available download formats
    xlsx
    Dataset updated
    Dec 5, 2024
    Dataset provided by
    Brunel University London
    Authors
    Fang Xu; Jiaying Wu
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data file contains news intensity measures for the UK and US, based on semantic fingerprints of news articles from the New York Times and the respective country. News articles in the following categories are used: Business Day, New York, U.S., World, Technology, Travel, Health, Real Estate, Science, Education, Automobiles, Your Money, Washington, Climate.

  19. Business Demography, UK: 2021

    • gov.uk
    Updated Nov 17, 2022
    Cite
    Office for National Statistics (2022). Business Demography, UK: 2021 [Dataset]. https://www.gov.uk/government/statistics/business-demography-uk-2021
    Dataset updated
    Nov 17, 2022
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Office for National Statistics
    Area covered
    United Kingdom
    Description

    Official statistics are produced impartially and free from political influence.

  20. Data from: From Industry Hype to Emerging Criticism: Analysing Chilean News...

    • figshare.com
    • tandf.figshare.com
    xlsx
    Updated Jul 10, 2025
    Cite
    Matías Valderrama Barragán; Martin Tironi; Dusan Cotoras; Teresa Correa; Mónica Humeres; Claudia López (2025). From Industry Hype to Emerging Criticism: Analysing Chilean News Media Coverage of Artificial Intelligence [Dataset]. http://doi.org/10.6084/m9.figshare.28247879.v1
    Available download formats
    xlsx
    Dataset updated
    Jul 10, 2025
    Dataset provided by
    Taylor & Francis
    Authors
    Matías Valderrama Barragán; Martin Tironi; Dusan Cotoras; Teresa Correa; Mónica Humeres; Claudia López
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    AI has become (again) a matter of public interest, and it is crucial to investigate how the news media intervenes in the hype and publicity around AI in different countries. At the intersection between Media Studies and Science and Technology Studies (STS), this article examines portrayals of AI and related technologies in the Chilean news media. We curated a corpus of nearly 7000 AI-related news articles from 2008 to 2023 from four Chilean newspapers. We combined LDA topic modelling with a dictionary-based analysis of the key actors and critical issues discussed around AI. The analysis shows the explosive growth of the media coverage of AI in recent years, as well as the diversity of topics associated with AI in Chile. We found a high prominence of topics related to industry and technology, a high visibility of international actors, mostly U.S. tech companies, and a low level of mentions of critical issues around AI. Moreover, we also discuss the low coverage of the State’s AI use, the emergent use of generative AI in tech journalism, and the prominence of topics such as the arts and humanities that appear as emerging spaces for the problematisation of AI in Chile.

