100+ datasets found
  1. Forex News Annotated Dataset for Sentiment Analysis

    • zenodo.org
    • data.niaid.nih.gov
    csv
    Updated Nov 11, 2023
    Cite
    Georgios Fatouros; Kalliopi Kouroumali (2023). Forex News Annotated Dataset for Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.7976208
    Explore at:
    Available download formats: csv
    Dataset updated
    Nov 11, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Georgios Fatouros; Kalliopi Kouroumali
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains news headlines relevant to key forex pairs: AUDUSD, EURCHF, EURUSD, GBPUSD, and USDJPY. The data was extracted from reputable platforms Forex Live and FXstreet over a period of 86 days, from January to May 2023. The dataset comprises 2,291 unique news headlines. Each headline includes an associated forex pair, timestamp, source, author, URL, and the corresponding article text. Data was collected using web scraping techniques executed via a custom service on a virtual machine. This service periodically retrieves the latest news for a specified forex pair (ticker) from each platform, parsing all available information. The collected data is then processed to extract details such as the article's timestamp, author, and URL. The URL is further used to retrieve the full text of each article. This data acquisition process repeats approximately every 15 minutes.

    To ensure the reliability of the dataset, we manually annotated each headline for sentiment. Instead of solely focusing on the textual content, we ascertained sentiment based on the potential short-term impact of the headline on its corresponding forex pair. This method recognizes the currency market's acute sensitivity to economic news, which significantly influences many trading strategies. As such, this dataset could serve as an invaluable resource for fine-tuning sentiment analysis models in the financial realm.

    We used three categories for annotation: 'positive', 'negative', and 'neutral', which correspond to bullish, bearish, and hold sentiments, respectively, for the forex pair linked to each headline. The following Table provides examples of annotated headlines along with brief explanations of the assigned sentiment.

    Examples of Annotated Headlines

    Forex Pair | Headline | Sentiment | Explanation
    GBPUSD | Diminishing bets for a move to 12400 | Neutral | Lack of strong sentiment in either direction
    GBPUSD | No reasons to dislike Cable in the very near term as long as the Dollar momentum remains soft | Positive | Positive sentiment towards GBPUSD (Cable) in the near term
    GBPUSD | When are the UK jobs and how could they affect GBPUSD | Neutral | Poses a question and does not express a clear sentiment
    JPYUSD | Appropriate to continue monetary easing to achieve 2% inflation target with wage growth | Positive | Monetary easing from Bank of Japan (BoJ) could lead to a weaker JPY in the short term due to increased money supply
    USDJPY | Dollar rebounds despite US data. Yen gains amid lower yields | Neutral | Since both the USD and JPY are gaining, the effects on the USDJPY forex pair might offset each other
    USDJPY | USDJPY to reach 124 by Q4 as the likelihood of a BoJ policy shift should accelerate Yen gains | Negative | USDJPY is expected to reach a lower value, with the USD losing value against the JPY
    AUDUSD | RBA Governor Lowe’s Testimony High inflation is damaging and corrosive | Positive | Reserve Bank of Australia (RBA) expresses concerns about inflation. Typically, central banks combat high inflation with higher interest rates, which could strengthen AUD.

    Moreover, the dataset includes two columns with the predicted sentiment class and score as predicted by the FinBERT model. Specifically, the FinBERT model outputs a set of probabilities for each sentiment class (positive, negative, and neutral), representing the model's confidence in associating the input headline with each sentiment category. These probabilities are used to determine the predicted class and a sentiment score for each headline. The sentiment score is computed by subtracting the negative class probability from the positive one.
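
    The score calculation described above can be sketched in a few lines. This is an illustration only: the function name and the example probabilities are hypothetical, and taking the highest-probability class as the predicted class is an assumption, since the description does not spell out that step.

```python
# Hypothetical helper; not part of the dataset or the FinBERT library.
def summarize_finbert(probs: dict[str, float]) -> tuple[str, float]:
    """Return (predicted class, sentiment score) from class probabilities."""
    # Assumption: the predicted class is the one with the highest probability.
    predicted = max(probs, key=probs.get)
    # As described in the text: positive probability minus negative probability.
    score = probs["positive"] - probs["negative"]
    return predicted, score


# Made-up probabilities for a single headline.
example = {"positive": 0.70, "negative": 0.10, "neutral": 0.20}
label, score = summarize_finbert(example)
print(label, round(score, 2))
```

    Under this definition the score lies between -1 and 1, and a confidently neutral headline scores close to 0.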

  2. United States Fed Funds Interest Rate

    • tradingeconomics.com
    • ko.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Updated Jul 30, 2025
    Cite
    TRADING ECONOMICS (2025). United States Fed Funds Interest Rate [Dataset]. https://tradingeconomics.com/united-states/interest-rate
    Explore at:
    Available download formats: xml, excel, json, csv
    Dataset updated
    Jul 30, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Aug 4, 1971 - Jul 30, 2025
    Area covered
    United States
    Description

    The benchmark interest rate in the United States was last recorded at 4.50 percent. This dataset provides the latest reported value for the United States Fed Funds Rate, plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.

  3. Ten Thousand German News Articles Dataset

    • kaggle.com
    • tblock.github.io
    zip
    Updated Jan 20, 2022
    Cite
    Timo Block (2022). Ten Thousand German News Articles Dataset [Dataset]. https://www.kaggle.com/tblock/10kgnad
    Explore at:
    Available download formats: zip (21144764 bytes)
    Dataset updated
    Jan 20, 2022
    Authors
    Timo Block
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    (see https://tblock.github.io/10kGNAD/ for the original dataset page)

    This page introduces the 10k German News Articles Dataset (10kGNAD), a German topic classification dataset. The 10kGNAD is based on the One Million Posts Corpus and available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. You can download the dataset here.

    Why a German dataset?

    English text classification datasets are common. Examples are the big AG News, the class-rich 20 Newsgroups and the large-scale DBpedia ontology datasets for topic classification, and the commonly used IMDb and Yelp datasets for sentiment analysis. Non-English datasets, especially German datasets, are less common. There is a collection of sentiment analysis datasets assembled by the Interest Group on German Sentiment Analysis. However, to my knowledge, no German topic classification dataset is available to the public.

    Due to grammatical differences between English and German, a classifier might be effective on an English dataset but not as effective on a German one. German is more highly inflected, and long compound words are quite common compared to English. One would need to evaluate a classifier on multiple German datasets to get a sense of its effectiveness.

    The dataset

    The 10kGNAD dataset is intended to solve part of this problem as the first German topic classification dataset. It consists of 10273 German-language news articles from an Austrian online newspaper categorized into nine topics. These articles are a previously unused part of the One Million Posts Corpus.

    In the One Million Posts Corpus, each article has a topic path, for example Newsroom/Wirtschaft/Wirtschaftpolitik/Finanzmaerkte/Griechenlandkrise. The 10kGNAD uses the second part of the topic path, here Wirtschaft, as the class label. As a result, the dataset can be used for multi-class classification.
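
    The label extraction described above amounts to splitting the topic path on "/" and taking the second component. A minimal sketch, assuming paths always use "/" as the separator; the function name is hypothetical and this is not the project's own extraction script:

```python
def topic_label(topic_path: str) -> str:
    """Return the second component of a slash-separated topic path."""
    parts = topic_path.split("/")
    if len(parts) < 2:
        raise ValueError(f"topic path has no second component: {topic_path!r}")
    return parts[1]


# Example path taken from the description above.
path = "Newsroom/Wirtschaft/Wirtschaftpolitik/Finanzmaerkte/Griechenlandkrise"
print(topic_label(path))  # Wirtschaft
```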

    I created and used this dataset in my thesis to train and evaluate four text classifiers on the German language. By publishing the dataset, I hope to support the advancement of tools and models for the German language. Additionally, this dataset can be used as a benchmark dataset for German topic classification.

    Numbers and statistics

    As in most real-world datasets, the class distribution of the 10kGNAD is not balanced. The biggest class, Web, consists of 1678 articles, while the smallest class, Kultur, contains only 539. However, articles from the Web class have on average the fewest words, while articles from the Kultur (culture) class have the second most words.

    Splitting into train and test

    I propose a stratified split of 10% for testing and the remaining articles for training. To use the dataset as a benchmark dataset, please use the train.csv and test.csv files located in the project root.
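
    For readers unfamiliar with the term, a stratified split samples the same fraction from every class so that the test set keeps the class proportions. The sketch below uses only the standard library and toy labels; the function name, seed and rounding rule are assumptions, and it is not the project's split script. For benchmarking, the published train.csv and test.csv remain the files to use.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.1, seed=0):
    """Return (train_indices, test_indices), sampling test_fraction per class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Assumption: round the per-class test size to the nearest integer.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy labels only; the real class sizes differ.
labels = ["Web"] * 30 + ["Kultur"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```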

    Code

    Python scripts to extract the articles and split them into a train and a test set are available in the code directory of this project. Make sure to install the requirements. The original corpus.sqlite3 is required to extract the articles (download here (compressed) or here (uncompressed)).

    License

    Creative Commons License

    This dataset is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Please consider citing the authors of the One Million Posts Corpus if you use the dataset.

  4. Newport News, VA Population Breakdown by Gender and Age Dataset: Male and...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Newport News, VA Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e1f4e8b5-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Virginia, Newport News
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Newport News by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Newport News. The dataset can be utilized to understand the population distribution of Newport News by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Newport News. Additionally, it can be used to see how the gender ratio changes from birth to the most senior age group, and how the male-to-female ratio varies across age groups in Newport News.

    Key observations

    Largest age group (population): Male # 20-24 years (8,018) | Female # 30-34 years (7,684). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: The age group for the Newport News population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): The male population of Newport News in each age group.
    • Population (Female): The female population of Newport News in each age group.
    • Gender Ratio: Also known as the sex ratio, the number of males per 100 females in Newport News for each age group.
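
    The Gender Ratio column follows directly from the two population columns. A small sketch of the arithmetic, with a hypothetical function name and made-up counts rather than figures from the dataset:

```python
def gender_ratio(male: int, female: int) -> float:
    """Number of males per 100 females."""
    if female == 0:
        raise ValueError("female population is zero; ratio is undefined")
    return 100 * male / female


# Made-up counts for one age group, for illustration only.
print(gender_ratio(1050, 1000))  # 105.0
```

    A value above 100 means more males than females in that age group, and a value below 100 means the opposite.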

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Newport News Population by Gender. You can refer to it here

  5. Newport News, VA Population Pyramid Dataset: Age Groups, Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 22, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Newport News, VA Population Pyramid Dataset: Age Groups, Male and Female Population, and Total Population for Demographics Analysis // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/5262e080-f122-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Feb 22, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Virginia, Newport News
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Total Population for Age Groups, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, and 9 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) male population, (b) female population and (c) total population, we initially analyzed and categorized the data for each of the age groups. We divided ages between 0 and 85 into roughly 5-year buckets. For over 85, we aggregated data into a single group for all ages. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the data for the Newport News, VA population pyramid, which represents the Newport News population distribution across age and gender, using estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It lists the male and female population for each age group, along with the total population for those age groups. Higher numbers at the bottom of the table suggest population growth, whereas higher numbers at the top indicate declining birth rates. Furthermore, the dataset can be utilized to understand the youth dependency ratio, old-age dependency ratio, total dependency ratio, and potential support ratio.

    Key observations

    • Youth dependency ratio, which is the number of children aged 0-14 per 100 persons aged 15-64, for Newport News, VA, is 29.6.
    • Old-age dependency ratio, which is the number of persons aged 65 or over per 100 persons aged 15-64, for Newport News, VA, is 20.5.
    • Total dependency ratio for Newport News, VA is 50.1.
    • Potential support ratio, which is the number of working-age persons (aged 15-64) per person aged 65 or over, for Newport News, VA, is 4.9.
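
    The four ratios above are simple functions of three population counts. The sketch below shows the arithmetic; the function name is hypothetical and the counts are made up, chosen only so that the results line up with the ratios quoted above, and are not the actual Newport News populations.

```python
def dependency_ratios(children: int, working_age: int, elderly: int) -> dict[str, float]:
    """Compute dependency ratios from counts aged 0-14, 15-64 and 65 or over."""
    if working_age == 0 or elderly == 0:
        raise ValueError("working-age and elderly counts must be non-zero")
    return {
        "youth": 100 * children / working_age,
        "old_age": 100 * elderly / working_age,
        "total": 100 * (children + elderly) / working_age,
        # Working-age persons per person aged 65 or over.
        "potential_support": working_age / elderly,
    }


# Hypothetical counts, not real data.
ratios = dependency_ratios(children=296, working_age=1000, elderly=205)
for name, value in ratios.items():
    print(name, round(value, 1))
```

    The total dependency ratio is the sum of the youth and old-age ratios, which matches the figures listed above (29.6 + 20.5 = 50.1).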
    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Variables / Data Columns

    • Age Group: The age group for the Newport News population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): The male population of Newport News for the selected age group.
    • Population (Female): The female population of Newport News for the selected age group.
    • Total Population: The total population of Newport News for the selected age group.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Newport News Population by Age. You can refer to it here

  6. National Universities Rankings

    • kaggle.com
    Updated Dec 3, 2022
    Cite
    The Devastator (2022). National Universities Rankings [Dataset]. https://www.kaggle.com/datasets/thedevastator/national-universities-rankings-explore-quality-t/suggestions
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 3, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Description

    National Universities Rankings

    Analyze 1,800 U.S. Universities and their Academic Performance

    By Education [source]

    About this dataset

    Welcome to the U.S. News & World Report's 2017 National Universities Rankings, a comprehensive dataset of over 1,800 schools across the United States providing quality data on admissions criteria, cost of tuition and fees, enrollment numbers, and overall rankings! Here you'll find up-to-date information on institutes of higher learning from Princeton University at the top spot in Best National Universities to Williams College at No. 1 on the Best National Liberal Arts Colleges list.

    This collection of data is all that's needed for potential students - parents, counselors and more - to evaluate their choices in selecting a college or university that perfectly meets their needs. For instance: what is the total tuition & fees cost? What are student enrollment numbers? How have students rated this school? Which universities have been recognized as top institutions in academics by U.S. News & World Report? What admissions criteria do these schools evaluate when considering an applicant's profile? The answers lie within this dataset!

    Explore each category on its own or alongside others, using visuals such as a scatter plot of enrollment patterns charted against yearly expenses, including room and board charges. Factors such as six-year graduation rates and freshman retention rates are also mentioned for the universities included here, so that you can compare and assess your options beforehand.

    How to use the dataset

    This dataset contains information on the quality, tuition, and enrollment data of 1,800 U.S.-based universities ranked by U.S. News & World Report from 2017. It includes rankings from the National University and Liberal Arts College lists in addition to relevant data points like tuition fees and undergraduate enrollments for each school.

    Users can take advantage of this dataset to build models that predict ranking or cost-benefit results for students, using cost-related (tuition) metrics along with quality metrics (rankings). Alternatively, users can analyze trends between investments in higher education and outcomes (ranking), or explore the relationship between enrollments for schools of varying rank tiers.

    For more information on how rankings are calculated, please refer to the methodology explainer on the U.S. News website.

    Here is an overview of all columns included in this dataset:

    Columns:

    • Name: Institution name
    • Location: City and state where located
    • Rank: Ranking according to U.S. News & World Report
    • Description: Snippet of text overview from U.S. News
    • Tuition and fees: Combined tuition and fees for out-of-state students
    • In–state: Tuition and fees for in-state students
    • Undergraduate Enrollment: Number of enrolled undergraduate students

    Using these column details as a guide, we can answer questions like ‘Which colleges give the highest ROI?’ or ‘Which college has the highest number of undergraduates?’. For statistical analysis such as correlation, visual representations such as scatter plots or bar graphs make it easier to analyze trends within the dataset and to explore relationships between factors such as tuition and rank.
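
    As a rough illustration of the correlation analysis mentioned above, the sketch below computes a Pearson correlation coefficient between rank and tuition using only the standard library. The function name is hypothetical and the five rows are made up for illustration; they are not taken from the dataset.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up rows: rank (1 is best) and tuition and fees in dollars.
ranks = [1, 2, 3, 4, 5]
tuition = [50000, 48000, 47000, 45000, 40000]
r = pearson(ranks, tuition)
print(round(r, 3))
```

    In this toy sample the coefficient is negative, meaning higher-ranked (lower-numbered) schools charge more; whether that holds in the real data would need to be checked against the actual columns.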

    Research Ideas

    • Developing a searchable database to help high school students identify colleges that match their criteria in terms of tuition, graduation rate, location, and rank.
    • Identifying correlations between enrollment numbers and university rank in order to better understand how the number of enrolled students affects the overall ranking of a university.
    • Comparing universities with similar rankings in order to highlight differences between programs’ tuition and fees as well as retention rates

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors - You are free to: - Share - copy and redistribute the material in any medium or format for any purpose, even commercially. - Adapt - remix, ...

  7. United States Personal Savings Rate

    • tradingeconomics.com
    • tr.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Cite
    TRADING ECONOMICS, United States Personal Savings Rate [Dataset]. https://tradingeconomics.com/united-states/personal-savings
    Explore at:
    Available download formats: xml, excel, json, csv
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 31, 1959 - Jun 30, 2025
    Area covered
    United States
    Description

    The Household Saving Rate in the United States remained unchanged at 4.50 percent in June from 4.50 percent in May of 2025. This dataset provides actual values, historical data, forecast, chart, statistics, economic calendar and news for the United States Personal Savings Rate.

  8. Johns Hopkins COVID-19 Case Tracker

    • data.world
    csv, zip
    Updated Aug 13, 2025
    Cite
    The Associated Press (2025). Johns Hopkins COVID-19 Case Tracker [Dataset]. https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Aug 13, 2025
    Authors
    The Associated Press
    Description

    Updates

    • Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.

    • April 9, 2020

      • The population estimate data for New York County, NY has been updated to include all five New York City counties (Kings County, Queens County, Bronx County, Richmond County and New York County). This has been done to match the Johns Hopkins COVID-19 data, which aggregates counts for the five New York City counties to New York County.
    • April 20, 2020

      • Johns Hopkins death totals in the US now include confirmed and probable deaths in accordance with CDC guidelines as of April 14. One significant result of this change was an increase of more than 3,700 deaths in the New York City count. This change will likely result in increases for death counts elsewhere as well. The AP does not alter the Johns Hopkins source data, so probable deaths are included in this dataset as well.
    • April 29, 2020

      • The AP is now providing timeseries data for counts of COVID-19 cases and deaths. The raw counts are provided here unaltered, along with a population column with Census ACS-5 estimates and calculated daily case and death rates per 100,000 people. Please read the updated caveats section for more information.
    • September 1st, 2020

      • Johns Hopkins is now providing counts for the five New York City counties individually.
    • February 12, 2021

      • The Ohio Department of Health recently announced that as many as 4,000 COVID-19 deaths may have been underreported through the state’s reporting system, and that the "daily reported death counts will be high for a two to three-day period."
      • Because deaths data will be anomalous for consecutive days, we have chosen to freeze Ohio's rolling average for daily deaths at the last valid measure until Johns Hopkins is able to back-distribute the data. The raw daily death counts, as reported by Johns Hopkins and including the backlogged death data, will still be present in the new_deaths column.
    • February 16, 2021

      • Johns Hopkins has reconciled Ohio's historical deaths data with the state.

    Overview

    The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.

    The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
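
    The per-100,000 rates mentioned above are a straightforward normalization of a raw count by population. A minimal sketch with a hypothetical function name and made-up county figures; it is not the AP's own calculation code:

```python
def rate_per_100k(count: int, population: int) -> float:
    """Convert a raw count into a rate per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Made-up county: 250 cases among 50,000 residents.
print(rate_per_100k(250, 50_000))  # 500.0
```

    Normalizing this way lets counties of very different sizes be compared, though the caveat above about test availability still applies to the underlying counts.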

    This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.

    The AP is updating this dataset hourly at 45 minutes past the hour.

    To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

    Queries

    Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic

    Interactive

    The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.

    https://datawrapper.dwcdn.net/nRyaf/15/

    Interactive Embed Code

    <iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>
    

    Caveats

    • This data represents the number of cases and deaths reported by each state and has been collected by Johns Hopkins from a number of sources cited on their website.
    • In some cases, deaths or cases of people who've crossed state lines -- either to receive treatment or because they became sick and couldn't return home while traveling -- are reported in a state they aren't currently in, because of state reporting rules.
    • In some states, there are a number of cases not assigned to a specific county -- for those cases, the county name is "unassigned to a single county".
    • This data should be credited to Johns Hopkins University's COVID-19 tracking project. The AP is simply making it available here for ease of use for reporters and members.
    • Caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
    • Population estimates at the county level are drawn from 2014-18 5-year estimates from the American Community Survey.
    • The Urban/Rural classification scheme is from the Centers for Disease Control and Prevention's National Center for Health Statistics. It puts each county into one of six categories -- from Large Central Metro to Non-Core -- according to population and other characteristics. More details about the classifications can be found here.

    Johns Hopkins timeseries data

    • Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
    • Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here
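Because the timeseries holds cumulative snapshots, a downward revision by a source shows up as a negative day-over-day change. The sketch below shows one way to spot that. The function names and the sample series are hypothetical, and this is not part of the Hopkins or AP tooling.

```python
def daily_changes(cumulative):
    """Day-over-day differences of a cumulative series."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def revised_days(cumulative):
    """0-based positions where the cumulative count is lower than the day before."""
    return [i + 1 for i, change in enumerate(daily_changes(cumulative)) if change < 0]


# Hypothetical cumulative case counts for one county over five days.
counts = [100, 112, 130, 127, 140]
print(daily_changes(counts))  # [12, 18, -3, 13]
print(revised_days(counts))   # [3]
```

A negative change does not say which earlier day was corrected, only that the source lowered its running total.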

    Attribution

    This data should be credited to Johns Hopkins University's COVID-19 tracking project.

  9. United States MBA 30-Yr Mortgage Rate

    • tradingeconomics.com
    • zh.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Updated Aug 8, 2025
    Cite
    TRADING ECONOMICS (2025). United States MBA 30-Yr Mortgage Rate [Dataset]. https://tradingeconomics.com/united-states/mortgage-rate
    Explore at:
    Available download formats: xml, excel, json, csv
    Dataset updated
    Aug 8, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 5, 1990 - Aug 8, 2025
    Area covered
    United States
    Description

    Fixed 30-year mortgage rates in the United States averaged 6.67 percent in the week ending August 8 of 2025. This dataset provides the latest reported value for - United States MBA 30-Yr Mortgage Rate - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.

  10. Replication Data for: The words have power: the impact of news on exchange...

    • dataone.org
    • dataverse.harvard.edu
    Updated Dec 16, 2023
    Cite
    Shugliashvili, Teona (2023). Replication Data for: The words have power: the impact of news on exchange rates [Dataset]. http://doi.org/10.7910/DVN/AXVTYQ
    Explore at:
    Dataset updated
    Dec 16, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Shugliashvili, Teona
    Description

    I am sharing the data used in the paper "The words have power: the impact of news on exchange rates". The dataset includes:

    Taylor Rule Fundamentals:
    • inflation
    • industrial production index (as a high-frequency proxy of GDP)
    • money market rate from 2000 until 2018

    Textual information:
    • entropies of news items about the U.S. Dollar from the Nexis-Uni database
    • economic policy uncertainty index from https://www.policyuncertainty.com/index.html

    This is how we get the textual data from the Nexis-Uni database: we enter “U.S. Dollar” as a search keyword, which returns over 15 million non-duplicate news items. Next, we clean the news data and select the relevant news items using the following criteria: (i) the U.S. Dollar appears in the title of the news item, (ii) U.S. Dollar is repeated several times in the news item, (iii) the first paragraph of the news item contains the term “U.S. Dollar”, (iv) U.S. Dollar is the subject of the news item, as automatically assigned by the Nexis-Uni database.

  11. Quantifying underreporting of law-enforcement-related deaths in United...

    • plos.figshare.com
    pdf
    Updated Jun 4, 2023
    Cite
    Justin M. Feldman; Sofia Gruskin; Brent A. Coull; Nancy Krieger (2023). Quantifying underreporting of law-enforcement-related deaths in United States vital statistics and news-media-based data sources: A capture–recapture analysis [Dataset]. http://doi.org/10.1371/journal.pmed.1002399
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    PLOS Medicine
    Authors
    Justin M. Feldman; Sofia Gruskin; Brent A. Coull; Nancy Krieger
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    Background

    Prior research suggests that United States governmental sources documenting the number of law-enforcement-related deaths (i.e., fatalities due to injuries inflicted by law enforcement officers) undercount these incidents. The National Vital Statistics System (NVSS), administered by the federal government and based on state death certificate data, identifies such deaths by assigning them diagnostic codes corresponding to “legal intervention” in accordance with the International Classification of Diseases–10th Revision (ICD-10). Newer, nongovernmental databases track law-enforcement-related deaths by compiling news media reports and provide an opportunity to assess the magnitude and determinants of suspected NVSS underreporting. Our a priori hypotheses were that underreporting by the NVSS would exceed that by the news media sources, and that underreporting rates would be higher for decedents of color versus white, decedents in lower versus higher income counties, decedents killed by non-firearm (e.g., Taser) versus firearm mechanisms, and deaths recorded by a medical examiner versus coroner.

    Methods and findings

    We created a new US-wide dataset by matching cases reported in a nongovernmental, news-media-based dataset produced by the newspaper The Guardian, The Counted, to identifiable NVSS mortality records for 2015. We conducted 2 main analyses for this cross-sectional study: (1) an estimate of the total number of deaths and the proportion unreported by each source using capture–recapture analysis and (2) an assessment of correlates of underreporting of law-enforcement-related deaths (demographic characteristics of the decedent, mechanism of death, death investigator type [medical examiner versus coroner], county median income, and county urbanicity) in the NVSS using multilevel logistic regression. We estimated that the total number of law-enforcement-related deaths in 2015 was 1,166 (95% CI: 1,153, 1,184). There were 599 deaths reported in The Counted only, 36 reported in the NVSS only, 487 reported in both lists, and an estimated 44 (95% CI: 31, 62) not reported in either source. The NVSS documented 44.9% (95% CI: 44.2%, 45.4%) of the total number of deaths, and The Counted documented 93.1% (95% CI: 91.7%, 94.2%). In a multivariable mixed-effects logistic model that controlled for all individual- and county-level covariates, decedents injured by non-firearm mechanisms had higher odds of underreporting in the NVSS than those injured by firearms (odds ratio [OR]: 68.2; 95% CI: 15.7, 297.5; p < 0.01), and underreporting was also more likely outside of the highest-income-quintile counties (OR for the lowest versus highest income quintile: 10.1; 95% CI: 2.4, 42.8; p < 0.01). There was no statistically significant difference in the odds of underreporting in the NVSS for deaths certified by coroners compared to medical examiners, and the odds of underreporting did not vary by race/ethnicity. One limitation of our analyses is that we were unable to examine the characteristics of cases that were unreported in The Counted.

    Conclusions

    The media-based source, The Counted, reported a considerably higher proportion of law-enforcement-related deaths than the NVSS, which failed to report a majority of these incidents. For the NVSS, rates of underreporting were higher in lower income counties and for decedents killed by non-firearm mechanisms. There was no evidence suggesting that underreporting varied by death investigator type (medical examiner versus coroner) or race/ethnicity.
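The abstract does not say which capture–recapture estimator the authors used. As an illustration only, the textbook two-source Lincoln–Petersen estimator, applied to the three counts reported above, gives a total and a missed-by-both figure that round to the reported 1,166 and 44. The function below is a sketch under that assumption. It does not reproduce the confidence intervals, and it is not the study's code.

```python
def lincoln_petersen(only_a: int, only_b: int, both: int):
    """Two-source capture-recapture (Lincoln-Petersen) estimate.

    Returns (estimated total, estimated number missed by both sources).
    Assumes the two sources capture cases independently.
    """
    if both <= 0:
        raise ValueError("the overlap between sources must be positive")
    n_a = only_a + both          # all cases found by source A
    n_b = only_b + both          # all cases found by source B
    total = n_a * n_b / both     # estimated population size
    missed = only_a * only_b / both  # estimated cases in neither source
    return total, missed


# Counts from the abstract: The Counted only, NVSS only, both lists.
total, missed = lincoln_petersen(only_a=599, only_b=36, both=487)
print(round(total), round(missed))  # 1166 44
```

The independence assumption is strong here. A death covered by the news media may also be more likely to be coded correctly on a death certificate, which would bias this simple estimate.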

  12. United States Unemployment Rate

    • tradingeconomics.com
    • pt.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Updated Aug 1, 2025
    Cite
    TRADING ECONOMICS (2025). United States Unemployment Rate [Dataset]. https://tradingeconomics.com/united-states/unemployment-rate
    Explore at:
    Available download formats: excel, xml, csv, json
    Dataset updated
    Aug 1, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 31, 1948 - Jul 31, 2025
    Area covered
    United States
    Description

    Unemployment Rate in the United States increased to 4.20 percent in July from 4.10 percent in June of 2025. This dataset provides the latest reported value for - United States Unemployment Rate - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.

  13. Applying machine learning to study correlations, if any, between news...

    • dataverse.harvard.edu
    Updated Aug 14, 2020
    Cite
    Amey Purandare (2020). Applying machine learning to study correlations, if any, between news content and stock price movements [Dataset]. http://doi.org/10.7910/DVN/HUK9TF
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 14, 2020
    Dataset provided by
    Harvard Dataverse
    Authors
    Amey Purandare
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Text classification problems are quite successfully solved by current machine learning techniques. Text content such as consumer reviews, email content etc. can be classified as favorable/unfavorable, spam/not-spam, etc. with a high success rate. News content too is known to affect human sentiment, leading to sharp, short-term price movements in stocks following positive/negative news. The attached sample dataset may be used to train a machine learning model to classify news text and predict its influence on stock price, and subsequently to deduce buy/sell recommendations. A predicted downward price movement may also help institutions engaged in lombard lending (securities lending) employ proactive risk mitigation.

    The dataset contains news articles and the empirical stock price movements following the news publication date. To attribute the stock price move to a specific news incident alone is difficult, as there are several factors influencing the stock price. However, we have selected stocks and incident dates where the stock has significantly outperformed or underperformed its industry peers. Thus, the effects of broader market and industry factors can be assumed to have less significance, because such factors would cause all industry peers to rise/fall in tandem, if any cause-effect relationship exists at all. In other words, data points are taken into consideration only if the company's stock price showed a statistically significant up/downward change relative to its industry peers in the reference time period. Secondly, earnings-related news content (a fundamental factor in the attractiveness of a stock) is omitted from consideration, to keep the analysis limited in scope to incident news alone. The reference time period for evaluating the under/outperformance is kept to a maximum of 10 days, to capture only "short-term" price movements. This helps omit the scenarios where the stock price was affected by business operational realities of the company, e.g. actual (not reported) success/failure of its product/service, as such events are relatively long term. In short, due care (feature engineering) has been employed to curate this dataset to serve its intended application.

    Please note that this is only a sample dataset of roughly 100 records. The full dataset can be requested for non-commercial use. Please contact me via this platform or via LinkedIn.

  14. Newport News, VA Population Pyramid Dataset: Age Groups, Male and Female...

    • neilsberg.com
    csv, json
    Updated Jul 24, 2024
    + more versions
    Cite
    Neilsberg Research (2024). Newport News, VA Population Pyramid Dataset: Age Groups, Male and Female Population, and Total Population for Demographics Analysis // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/f03e03fd-4983-11ef-ae5d-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Jul 24, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Virginia, Newport News
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Total Population for Age Groups, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, and 9 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. To measure the three variables, namely (a) male population, (b) female population and (c) total population, we initially analyzed and categorized the data for each of the age groups. We divided ages between 0 and 85 into buckets of roughly 5 years each. For ages over 85, we aggregated the data into a single group. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the data for the Newport News, VA population pyramid, which represents the Newport News population distribution across age and gender, using estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. It lists the male and female population for each age group, along with the total population for those age groups. Higher numbers at the bottom of the table suggest population growth, whereas higher numbers at the top indicate declining birth rates. Furthermore, the dataset can be utilized to understand the youth dependency ratio, old-age dependency ratio, total dependency ratio, and potential support ratio.

    Key observations

    • Youth dependency ratio, which is the number of children aged 0-14 per 100 persons aged 15-64, for Newport News, VA, is 29.3.
    • Old-age dependency ratio, which is the number of persons aged 65 or over per 100 persons aged 15-64, for Newport News, VA, is 20.0.
    • Total dependency ratio for Newport News, VA is 49.3.
    • Potential support ratio, which is the number of youth (working age population) per elderly, for Newport News, VA is 5.0.
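The four ratios above follow directly from three age-band totals. The sketch below uses the definitions given in the list. The function name is hypothetical, and the input counts are invented so that they reproduce the reported ratios. They are not the actual Newport News population figures.

```python
def dependency_ratios(age_0_14: float, age_15_64: float, age_65_plus: float) -> dict:
    """Compute dependency ratios from three age-band population totals."""
    if age_15_64 <= 0 or age_65_plus <= 0:
        raise ValueError("working-age and elderly populations must be positive")
    youth = 100 * age_0_14 / age_15_64       # children per 100 working-age persons
    old_age = 100 * age_65_plus / age_15_64  # elderly per 100 working-age persons
    return {
        "youth": youth,
        "old_age": old_age,
        "total": youth + old_age,
        "potential_support": age_15_64 / age_65_plus,  # working-age persons per elderly person
    }


# Hypothetical counts chosen to match the ratios listed above.
ratios = dependency_ratios(age_0_14=29_300, age_15_64=100_000, age_65_plus=20_000)
print({name: round(value, 1) for name, value in ratios.items()})
```

The total dependency ratio is the sum of the youth and old-age ratios (29.3 + 20.0 = 49.3). The potential support ratio is 100 divided by the old-age ratio (100 / 20.0 = 5.0).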
    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Variables / Data Columns

    • Age Group: The age group for the Newport News population analysis. There are 18 expected values, defined in the age groups section above.
    • Population (Male): The male population of Newport News for the selected age group.
    • Population (Female): The female population of Newport News for the selected age group.
    • Total Population: The total population of Newport News for the selected age group.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Newport News Population by Age. You can refer to it here

  15. Replication Data for: 'Beyond Tradition: A Hybrid Model Unveiling News...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Apr 1, 2024
    Cite
    The Author (2024). Replication Data for: 'Beyond Tradition: A Hybrid Model Unveiling News Impact on Exchange Rates'. [Dataset]. http://doi.org/10.7910/DVN/0IIZJO
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 1, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    The Author
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    I am hereby sharing the dataset used in the paper titled 'Beyond Tradition: A Hybrid Model Unveiling News Impact on Exchange Rates'. The dataset comprises the following components:

    Taylor Rule Fundamentals:
    • Inflation
    • Industrial production index (as a high-frequency proxy of GDP)
    • Money market rate spanning from 2000 to 2018.

    Textual Information:
    • Economic Policy Uncertainty Index from https://www.policyuncertainty.com/index.html (as of November 9, 2023).
    • Time series of entropies calculated for U.S. Dollar-related news topics extracted from the Nexis-Uni database.

    Note: To acquire the textual data from the Nexis-Uni database, we conducted the following steps: We entered "U.S. Dollar" as a keyword in the search for news, resulting in over 15 million non-duplicate news items. Subsequently, we cleaned the news data and selected relevant news items using the following criteria: (i) The U.S. Dollar appears in the title of news items, (ii) The term "U.S. Dollar" is repeated several times in the news, (iii) The first paragraph of the news contains the word "U.S. Dollar", (iv) The news items are automatically selected by the Nexis-Uni database with the U.S. Dollar as the subject. Subsequently, we identified the topics related to the US Dollar from the news using LDA and calculated the Shannon entropies over time for each topic.
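The description does not state which distribution the entropy is taken over for each topic and period, so the following is only a generic sketch of the Shannon entropy formula, H = -Σ p_i log2(p_i), applied to a vector of non-negative weights. The function name and the sample weights are hypothetical.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy (in bits) of a non-negative weight vector.

    The weights are normalised to probabilities first; zero weights
    contribute nothing to the sum.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical weights for one period: an even spread has maximal entropy,
# a fully concentrated one has zero entropy.
print(shannon_entropy([1, 1, 1, 1]))  # 2.0
print(shannon_entropy([0.7, 0.2, 0.1]))
```

Computing this once per time window gives a time series of entropies. How the authors windowed the news and which weights they used would need to be checked against the paper.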

  16. United States Inflation Rate

    • tradingeconomics.com
    • fa.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Updated Aug 12, 2025
    Cite
    TRADING ECONOMICS (2025). United States Inflation Rate [Dataset]. https://tradingeconomics.com/united-states/inflation-cpi
    Explore at:
    Available download formats: json, excel, xml, csv
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 31, 1914 - Jul 31, 2025
    Area covered
    United States
    Description

    Inflation Rate in the United States remained unchanged at 2.70 percent in July. This dataset provides - United States Inflation Rate - actual values, historical data, forecast, chart, statistics, economic calendar and news.

  17. Data from: DIPSEER: A Dataset for In-Person Student Emotion and Engagement...

    • scidb.cn
    • observatorio-cientifico.ua.es
    Updated Sep 4, 2024
    Cite
    Luis Márquez-Carpintero; Sergio Suescun-Ferrandiz; Carolina Lorenzo Álvarez; Jorge Fernandez-Herrero; Diego Viejo; Rosabel Roig-Vila; Miguel Cazorla (2024). DIPSEER: A Dataset for In-Person Student Emotion and Engagement Recognition in the Wild [Dataset]. http://doi.org/10.57760/sciencedb.11541
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 4, 2024
    Dataset provided by
    Science Data Bank
    Authors
    Luis Márquez-Carpintero; Sergio Suescun-Ferrandiz; Carolina Lorenzo Álvarez; Jorge Fernandez-Herrero; Diego Viejo; Rosabel Roig-Vila; Miguel Cazorla
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data Description

    The DIPSER dataset is designed to assess student attention and emotion in in-person classroom settings, consisting of RGB camera data, smartwatch sensor data, and labeled attention and emotion metrics. It includes multiple camera angles per student to capture posture and facial expressions, complemented by smartwatch data for inertial and biometric metrics. Attention and emotion labels are derived from self-reports and expert evaluations. The dataset includes diverse demographic groups, with data collected in real-world classroom environments, facilitating the training of machine learning models for predicting attention and correlating it with emotional states.

    Data Collection and Generation Procedures

    The dataset was collected in a natural classroom environment at the University of Alicante, Spain. The recording setup consisted of six general cameras positioned to capture the overall classroom context and individual cameras placed at each student’s desk. Additionally, smartwatches were used to collect biometric data, such as heart rate, accelerometer, and gyroscope readings.

    Experimental Sessions

    Nine distinct educational activities were designed to ensure a comprehensive range of engagement scenarios:

    • News Reading – Students read projected or device-displayed news.
    • Brainstorming Session – Idea generation for problem-solving.
    • Lecture – Passive listening to an instructor-led session.
    • Information Organization – Synthesizing information from different sources.
    • Lecture Test – Assessment of lecture content via mobile devices.
    • Individual Presentations – Students present their projects.
    • Knowledge Test – Conducted using Kahoot.
    • Robotics Experimentation – Hands-on session with robotics.
    • MTINY Activity Design – Development of educational activities with computational thinking.

    Technical Specifications

    • RGB Cameras: Individual cameras recorded at 640×480 pixels, while context cameras captured at 1280×720 pixels.
    • Frame Rate: 9-10 FPS depending on the setup.
    • Smartwatch Sensors: Collected heart rate, accelerometer, gyroscope, rotation vector, and light sensor data at a frequency of 1–100 Hz.

    Data Organization and Formats

    The dataset follows a structured directory format: /groupX/experimentY/subjectZ.zip

    Each subject-specific folder contains:

    • images/ (individual facial images)
    • watch_sensors/ (sensor readings in JSON format)
    • labels/ (engagement & emotion annotations)
    • metadata/ (subject demographics & session details)

    Annotations and Labeling

    Each data entry includes engagement levels (1-5) and emotional states (9 categories) based on both self-reported labels and evaluations by four independent experts. A custom annotation tool was developed to ensure consistency across evaluations.

    Missing Data and Data Quality

    • Synchronization: A centralized server ensured time alignment across devices. Brightness changes were used to verify synchronization.
    • Completeness: No major missing data, except for occasional random frame drops due to embedded device performance.
    • Data Consistency: Uniform collection methodology across sessions, ensuring high reliability.

    Data Processing Methods

    To enhance usability, the dataset includes preprocessed bounding boxes for face, body, and hands, along with gaze estimation and head pose annotations. These were generated using YOLO, MediaPipe, and DeepFace.

    File Formats and Accessibility

    • Images: Stored in standard JPEG format.
    • Sensor Data: Provided as structured JSON files.
    • Labels: Available as CSV files with timestamps.

    The dataset is publicly available under the CC-BY license and can be accessed along with the necessary processing scripts via the DIPSER GitHub repository.

    Potential Errors and Limitations

    • Due to camera angles, some student movements may be out of frame in collaborative sessions.
    • Lighting conditions vary slightly across experiments.
    • Sensor latency variations are minimal but exist due to embedded device constraints.

    Citation

    If you find this project helpful for your research, please cite our work using the following bibtex entry:

    @misc{marquezcarpintero2025dipserdatasetinpersonstudent1,
      title={DIPSER: A Dataset for In-Person Student Engagement Recognition in the Wild},
      author={Luis Marquez-Carpintero and Sergio Suescun-Ferrandiz and Carolina Lorenzo Álvarez and Jorge Fernandez-Herrero and Diego Viejo and Rosabel Roig-Vila and Miguel Cazorla},
      year={2025},
      eprint={2502.20209},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.20209},
    }

    Usage and Reproducibility

    Researchers can utilize standard tools like OpenCV, TensorFlow, and PyTorch for analysis. The dataset supports research in machine learning, affective computing, and education analytics, offering a unique resource for engagement and attention studies in real-world classroom environments.

  18. Bangladesh rape case dataset

    • dataone.org
    Updated Sep 24, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rubaiat, Sajratul Yakin (2024). Bangladesh rape case dataset [Dataset]. http://doi.org/10.7910/DVN/OE7NFR
    Explore at:
    Dataset updated
    Sep 24, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Rubaiat, Sajratul Yakin
    Description

    The "Bangladesh Rape Cases Data" dataset contains detailed information on rape cases reported in various districts of Bangladesh. This dataset is valuable for analyzing trends, patterns, and regional distributions of reported rape cases over a decade. It can be utilized by researchers, policymakers, and social scientists to study and address the issue of rape in Bangladesh.

    Total Sample Size

    This dataset comprises a total of 2,813 rows, each representing an individual case reported in various districts across Bangladesh.

    Data Description

    • headline (Type: String): The headline of the news article reporting the rape case. It provides a brief summary of the incident.
    • district-tag (Type: String): The district where the incident occurred. This helps in identifying the geographical distribution of the cases.
    • division-tag (Type: String): The division of Bangladesh to which the district belongs. This is useful for broader regional analysis.
    • subdistrict-tag (Type: String): The specific subdistrict or locality within the district where the incident occurred. This column may contain missing values if the subdistrict is not specified.
    • id (Type: String, UUID format): A unique identifier for each news article, ensuring that each entry can be distinctly referenced.
    • url (Type: String): The web link to the original news article, allowing users to access the full report for more detailed information.
    • last-published-at (Type: DateTime): The date and time when the news article was last published, helping to understand the timeline of the reported cases.
    • offset (Type: Integer): An offset value for the article, potentially indicating its position in a larger dataset or the order of processing.
    • content (Type: String): The main content of the news article, providing detailed information about the incident.

    Temporal Coverage

    • Minimum Date: February 22, 2013
    • Maximum Date: April 10, 2023

    The dataset spans over a decade, allowing for a comprehensive temporal analysis of the reported cases.

    Potential Uses

    • Trend Analysis: Analyze how the frequency of reported cases changes over time.
    • Geographical Analysis: Identify regions with higher or lower reporting rates.
    • Content Analysis: Examine the language and details provided in the headlines and content to understand the nature of reporting.
    • Correlation Studies: Investigate possible correlations between reported cases and other socio-economic factors.

    Data Quality and Considerations

    • Missing Values: Some columns, such as subdistrict-tag, may contain missing values where specific information was not provided.
    • Data Source: The data is sourced from news articles, so it may be influenced by reporting biases and the availability of news coverage.

  19. Japan Interest Rate

    • tradingeconomics.com
    • ru.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Updated Aug 8, 2025
    Cite
    TRADING ECONOMICS (2025). Japan Interest Rate [Dataset]. https://tradingeconomics.com/japan/interest-rate
    Available download formats: excel, xml, json, csv
    Dataset updated
    Aug 8, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Oct 2, 1972 - Jul 31, 2025
    Area covered
    Japan
    Description

    The benchmark interest rate in Japan was last recorded at 0.50 percent. This dataset provides - Japan Interest Rate - actual values, historical data, forecast, chart, statistics, economic calendar and news.

  20. Euro Area Interest Rate

    • tradingeconomics.com
    • zh.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Updated Jul 24, 2025
    Cite
    TRADING ECONOMICS (2025). Euro Area Interest Rate [Dataset]. https://tradingeconomics.com/euro-area/interest-rate
    Available download formats: xml, json, csv, excel
    Dataset updated
    Jul 24, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 18, 1998 - Jul 24, 2025
    Area covered
    Euro Area
    Description

    The benchmark interest rate in the Euro Area was last recorded at 2.15 percent. This dataset provides - Euro Area Interest Rate - actual values, historical data, forecast, chart, statistics, economic calendar and news.

Cite
Georgios Fatouros; Georgios Fatouros; Kalliopi Kouroumali; Kalliopi Kouroumali (2023). Forex News Annotated Dataset for Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.7976208

Forex News Annotated Dataset for Sentiment Analysis

Available download formats: csv
Dataset updated
Nov 11, 2023
Dataset provided by
Zenodo, http://zenodo.org/
Authors
Georgios Fatouros; Georgios Fatouros; Kalliopi Kouroumali; Kalliopi Kouroumali
License

Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This dataset contains news headlines relevant to key forex pairs: AUDUSD, EURCHF, EURUSD, GBPUSD, and USDJPY. The data was extracted from reputable platforms Forex Live and FXstreet over a period of 86 days, from January to May 2023. The dataset comprises 2,291 unique news headlines. Each headline includes an associated forex pair, timestamp, source, author, URL, and the corresponding article text. Data was collected using web scraping techniques executed via a custom service on a virtual machine. This service periodically retrieves the latest news for a specified forex pair (ticker) from each platform, parsing all available information. The collected data is then processed to extract details such as the article's timestamp, author, and URL. The URL is further used to retrieve the full text of each article. This data acquisition process repeats approximately every 15 minutes.

To ensure the reliability of the dataset, we manually annotated each headline for sentiment. Instead of solely focusing on the textual content, we ascertained sentiment based on the potential short-term impact of the headline on its corresponding forex pair. This method recognizes the currency market's acute sensitivity to economic news, which significantly influences many trading strategies. As such, this dataset could serve as an invaluable resource for fine-tuning sentiment analysis models in the financial realm.

We used three categories for annotation: 'positive', 'negative', and 'neutral', which correspond to bullish, bearish, and hold sentiments, respectively, for the forex pair linked to each headline. The following table provides examples of annotated headlines along with brief explanations of the assigned sentiment.

Examples of Annotated Headlines

| Forex Pair | Headline | Sentiment | Explanation |
|---|---|---|---|
| GBPUSD | Diminishing bets for a move to 12400 | Neutral | Lack of strong sentiment in either direction |
| GBPUSD | No reasons to dislike Cable in the very near term as long as the Dollar momentum remains soft | Positive | Positive sentiment towards GBPUSD (Cable) in the near term |
| GBPUSD | When are the UK jobs and how could they affect GBPUSD | Neutral | Poses a question and does not express a clear sentiment |
| USDJPY | Appropriate to continue monetary easing to achieve 2% inflation target with wage growth | Positive | Monetary easing from Bank of Japan (BoJ) could lead to a weaker JPY in the short term due to increased money supply |
| USDJPY | Dollar rebounds despite US data. Yen gains amid lower yields | Neutral | Since both the USD and JPY are gaining, the effects on the USDJPY forex pair might offset each other |
| USDJPY | USDJPY to reach 124 by Q4 as the likelihood of a BoJ policy shift should accelerate Yen gains | Negative | USDJPY is expected to reach a lower value, with the USD losing value against the JPY |
| AUDUSD | RBA Governor Lowe’s Testimony High inflation is damaging and corrosive | Positive | Reserve Bank of Australia (RBA) expresses concerns about inflation. Typically, central banks combat high inflation with higher interest rates, which could strengthen AUD. |

Moreover, the dataset includes two columns with the sentiment class and score predicted by the FinBERT model. Specifically, FinBERT outputs a probability for each sentiment class (positive, negative, and neutral), representing the model's confidence in associating the input headline with that class. These probabilities determine the predicted class and a sentiment score for each headline. The sentiment score is computed by subtracting the negative-class probability from the positive-class probability.
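
The score calculation described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `finbert_class_and_score` and the example probabilities are hypothetical. Taking the predicted class as the highest-probability class is an assumption, since the description does not state the exact rule.

```python
from typing import Dict, Tuple


def finbert_class_and_score(probs: Dict[str, float]) -> Tuple[str, float]:
    """Derive a predicted class and a sentiment score from class probabilities.

    `probs` maps 'positive', 'negative', and 'neutral' to probabilities.
    """
    # Assumption: the predicted class is the one with the highest probability.
    predicted_class = max(probs, key=probs.get)
    # As described in the text: positive probability minus negative probability.
    score = probs["positive"] - probs["negative"]
    return predicted_class, score


# Made-up probabilities for illustration only (not taken from the dataset).
example = {"positive": 0.75, "negative": 0.125, "neutral": 0.125}
label, score = finbert_class_and_score(example)
print(label, score)  # prints: positive 0.625
```

Under this definition the score lies between -1 and 1, and a headline classed as neutral can still carry a small positive or negative score.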
