Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Gross Domestic Product (GDP) in the United States was worth 29184.89 billion US dollars in 2024, according to official data from the World Bank. The GDP value of the United States represents 27.49 percent of the world economy. This dataset provides - United States GDP - actual values, historical data, forecast, chart, statistics, economic calendar and news.
https://www.industryselect.com/license
The U.S. manufacturing sector plays a central role in the economy, accounting for 20% of U.S. capital investment, 60% of the nation's exports, and 70% of business R&D. Overall, the sector's market size, measured in terms of revenue, is worth roughly $6 trillion, making it a major industry to do business with. So which U.S. states are the biggest for manufacturing? This article explores the nation's top manufacturing states, measured by number of employees, based on MNI's database of 400,000 U.S. manufacturing companies.
Quality of life is a measure of the comfort, health, and happiness experienced by a person or a group of people. It is determined by both material factors, such as income and housing, and broader considerations like health, education, and freedom. Each year, U.S. News & World Report releases its “Best States to Live in” report, which ranks states on the quality of life each state provides its residents. To determine the rankings, U.S. News & World Report considers a wide range of factors, including healthcare, education, economy, infrastructure, opportunity, fiscal stability, crime and corrections, and the natural environment. More information on these categories and what is measured in each can be found below:
Healthcare includes access, quality, and affordability of healthcare, as well as health measurements, such as obesity rates and rates of smoking.
Education measures how well public schools perform in terms of testing and graduation rates, as well as tuition costs associated with higher education and college debt load.
Economy looks at GDP growth, migration to the state, and new business.
Infrastructure includes transportation availability, road quality, communications, and internet access.
Opportunity includes poverty rates, cost of living, housing costs, and gender and racial equality.
Fiscal Stability considers the health of the government's finances, including how well the state balances its budget.
Crime and Corrections ranks a state’s public safety and measures prison systems and their populations.
Natural Environment looks at the quality of air and water and exposure to pollution.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Gross Domestic Product (GDP) in the United States expanded 2 percent in the second quarter of 2025 over the same quarter of the previous year. This dataset provides the latest reported value for - United States GDP Annual Growth Rate - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Gross Domestic Product (GDP) in the United States expanded 3 percent in the second quarter of 2025 over the previous quarter. This dataset provides the latest reported value for - United States GDP Growth Rate - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States US: Income Share Held by Highest 10% data was reported at 30.600 % in 2016. This records an increase from the previous number of 30.100 % for 2013. United States US: Income Share Held by Highest 10% data is updated yearly, averaging 30.100 % from Dec 1979 (Median) to 2016, with 11 observations. The data reached an all-time high of 30.600 % in 2016 and a record low of 25.300 % in 1979. United States US: Income Share Held by Highest 10% data remains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s United States – Table US.World Bank.WDI: Poverty. Percentage share of income or consumption is the share that accrues to subgroups of population indicated by deciles or quintiles.; ; World Bank, Development Research Group. Data are based on primary household survey data obtained from government statistical agencies and World Bank country departments. Data for high-income economies are from the Luxembourg Income Study database. For more information and methodology, please see PovcalNet (http://iresearch.worldbank.org/PovcalNet/index.htm).; ; The World Bank’s internationally comparable poverty monitoring database now draws on income or detailed consumption data from more than one thousand six hundred household surveys across 164 countries in six regions and 25 other high income countries (industrialized economies). While income distribution data are published for all countries with data available, poverty data are published for low- and middle-income countries and countries eligible to receive loans from the World Bank (such as Chile) and recently graduated countries (such as Estonia) only. See PovcalNet (http://iresearch.worldbank.org/PovcalNet/WhatIsNew.aspx) for definitions of geographical regions and industrialized countries.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States US: GDP: PPP data was reported at 19,390,604.000 Intl $ mn in 2017. This records an increase from the previous number of 18,624,475.000 Intl $ mn for 2016. United States US: GDP: PPP data is updated yearly, averaging 11,892,799.000 Intl $ mn from Dec 1990 (Median) to 2017, with 28 observations. The data reached an all-time high of 19,390,604.000 Intl $ mn in 2017 and a record low of 5,979,589.000 Intl $ mn in 1990. United States US: GDP: PPP data remains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s United States – Table US.World Bank.WDI: Gross Domestic Product: Purchasing Power Parity. PPP GDP is gross domestic product converted to international dollars using purchasing power parity rates. An international dollar has the same purchasing power over GDP as the U.S. dollar has in the United States. GDP is the sum of gross value added by all resident producers in the economy plus any product taxes and minus any subsidies not included in the value of the products. It is calculated without making deductions for depreciation of fabricated assets or for depletion and degradation of natural resources. Data are in current international dollars. For most economies PPP figures are extrapolated from the 2011 International Comparison Program (ICP) benchmark estimates or imputed using a statistical model based on the 2011 ICP. For 47 high- and upper middle-income economies conversion factors are provided by Eurostat and the Organisation for Economic Co-operation and Development (OECD).; ; World Bank, International Comparison Program database.; Gap-filled total;
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The United States recorded a Government Debt to GDP of 124.30 percent of the country's Gross Domestic Product in 2024. This dataset provides - United States Government Debt To GDP - actual values, historical data, forecast, chart, statistics, economic calendar and news.
This U.S. Geological Survey (USGS) data release provides the descriptions of the only U.S. sites—including mineral regions, mineral occurrences, and mine features—that have reported production and (or) resources of tantalum (Ta). The sites in this data release have contained resource and (or) past production of more than 900 metric tons Ta metal, which was the approximate average annual consumption of Ta in the U.S. from 2016 through 2020. This dataset contains the Bokan Mountain deposit in Alaska and the Round Top deposit in Texas. Tantalum primarily occurs in the mineral tantalite, which may be found in carbonatites, alkaline granite-syenite complexes, and lithium-cesium-tantalum (LCT) pegmatites. The largest Ta deposits can be found in Australia, where the Greenbushes and Wodgina Mines have been producing Ta from pegmatites since the late 1880s. The Greenbushes is an LCT pegmatite deposit that contains more than 135 million metric tons of ore with an average grade of 0.022 percent Ta2O5. The Wodgina LCT pegmatite deposit contains more than 85 million metric tons of ore at a grade of 0.032 percent Ta2O5 (Schulz and others, 2017). In comparison, the largest Ta deposit in the U.S. is the Round Top deposit in Texas, which has reported resources of more than 480 million metric tons with an average grade of 67.2 grams per metric ton Ta2O5 (Hulse and others, 2019). There are no current U.S. producers of Ta. Tantalum is necessary for strategic, consumer, and commercial applications. Tantalum is highly conductive to heat and electricity and known for its resistance to acidic corrosion, thereby making this metal an ideal component for electronic capacitors, telecommunications, data storage, and implantable medical devices. In 2020, the U.S. was 100 percent net import reliant on Ta from countries such as China, Germany, Australia, and others. Tantalum is imported to the U.S. as ore and concentrate, metal and powder, as well as waste and scrap (U.S. 
Geological Survey, 2021). The entries and descriptions in the database were derived from published papers, reports, data, and internet documents representing a variety of sources, including geologic and exploration studies described in State, Federal, and industry reports. Resources extracted from older sources might not be compliant with current rules and guidelines in minerals industry standards such as National Instrument 43-101 (NI 43-101). The presence of a Ta mineral deposit in this database is not meant to imply that the deposit is currently economic. Rather, these deposits were included to capture the characteristics of the largest Ta deposits in the United States. Inclusion of material in the database is for descriptive purposes only and does not imply endorsement by the U.S. Government. The authors welcome additional published information in order to continually update and refine this dataset. Hulse, D.E., Malhotra, D., Matthews, T., and Emanuel, C., 2019, NI 43-101 preliminary economic assessment Round Top project, Sierra Blanca, Texas, prepared for USA Rare Earth LLC and Texas Mineral Resources Corp. [Filing Date July 1, 2019]: Gustavson Associates, LLC, 218 p., accessed October 17, 2019, at http://usarareearth.com/. Schulz, K.J., Piatak, N.M., and Papp, J.F., 2017, Niobium and tantalum, chap. M of Schulz, K.J., DeYoung, J.H., Jr., Seal, R.R., II, and Bradley, D.C., eds., Critical mineral resources of the United States—Economic and environmental geology and prospects for future supply: U.S. Geological Survey Professional Paper 1802, p. M1–M34, https://doi.org/10.3133/pp1802M. U.S. Geological Survey, 2021, Mineral commodity summaries 2021: U.S. Geological Survey, 200 p., https://doi.org/10.3133/mcs2021.
This dataset package is focused on U.S. construction materials and three construction companies: Cemex, Martin Marietta, and Vulcan.
In this package, SpaceKnow tracks manufacturing and processing facilities for construction material products all over the US. By tracking these facilities, we are able to give you near-real-time data on spending on these materials, which helps to predict residential and commercial real estate construction and spending in the US.
The dataset includes 40 indices focused on asphalt, cement, concrete, and building materials in general. You can look forward to receiving country-level and regional data (activity in the North, East, West, and South of the country) and the aforementioned company data.
SpaceKnow uses satellite (SAR) data to capture activity at building material manufacturing and processing facilities in the US.
Data is updated daily, has an average lag of 4-6 days, and history back to 2017.
The insights provide you with level and change data for refineries, storage, manufacturing, logistics, and employee parking-based locations.
SpaceKnow offers 3 delivery options: CSV, API, and Insights Dashboard.
Available Indices
Companies:
Cemex (CX): Construction Materials (covers all manufacturing facilities of the company in the US), Concrete, Cement (refinery and storage) indices, and aggregates
Martin Marietta (MLM): Construction Materials (covers all manufacturing facilities of the company in the US), Concrete, Cement (refinery and storage) indices, and aggregates
Vulcan (VMC): Construction Materials (covers all manufacturing facilities of the company in the US), Concrete, Cement (refinery and storage) indices, and aggregates
USA Indices:
Aggregates USA
Asphalt USA
Cement USA
Cement Refinery USA
Cement Storage USA
Concrete USA
Construction Materials USA
Construction Mining USA
Construction Parking Lots USA
Construction Materials Transfer Hub US
Cement - Midwest, Northeast, South, West
Cement Refinery - Midwest, Northeast, South, West
Cement Storage - Midwest, Northeast, South, West
Why get SpaceKnow's U.S. Construction Materials Package?
Monitor Construction Market Trends: Near-real-time insights into the construction industry allow clients to understand and anticipate market trends better.
Track Company Performance: Monitor operational activities, such as the volume of sales.
Assess Risk: Use satellite activity data to assess the risks associated with investing in the construction industry.
Index Methodology Summary
Continuous Feed Index (CFI) is a daily aggregation of the area of metallic objects in square meters. There are two types of CFI indices. The CFI-R index gives the data in levels: it shows how many square meters are covered by metallic objects (for example, employee cars at a facility). The CFI-S index gives the change in the data: it shows how many square meters have changed within the locations between two consecutive satellite images.
How to interpret the data
SpaceKnow indices can be compared with the related economic indicators or KPIs. If the economic indicator is in monthly terms, perform a 30-day rolling sum and pick the last day of the month to compare with the economic indicator. Each data point will then reflect approximately the sum for the month. If the economic indicator is in quarterly terms, perform a 90-day rolling sum and pick the last day of the 90-day window to compare with the economic indicator. Each data point will then reflect approximately the sum for the quarter.
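The monthly comparison described above can be sketched in a few lines of Python. This is a minimal illustration only, not SpaceKnow's own tooling: the function names and the constant daily series are made up for the example, and the code assumes the input is sorted by date with one value per day.

```python
from collections import deque
from datetime import date, timedelta


def rolling_sum(values, window):
    """Trailing rolling sum; None until a full window is available."""
    out, buf, total = [], deque(), 0.0
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            total -= buf.popleft()
        out.append(total if len(buf) == window else None)
    return out


def month_end_values(dates, values, window=30):
    """Map (year, month) to the rolling sum on the last available day of that month.

    Assumes dates are sorted ascending, so later days overwrite earlier ones.
    """
    result = {}
    for d, s in zip(dates, rolling_sum(values, window)):
        if s is not None:
            result[(d.year, d.month)] = s
    return result


# Hypothetical daily index: a constant 1.0 per day from 2021-01-01 to 2021-03-31.
start = date(2021, 1, 1)
dates = [start + timedelta(days=i) for i in range(90)]
values = [1.0] * 90
monthly = month_end_values(dates, values)
```

For a quarterly indicator, the same function could be called with `window=90` and the results filtered to quarter-end months. Note that if the final month in the series is incomplete, this sketch picks the last available day rather than the true month end.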
Where the data comes from
SpaceKnow brings you the data edge by applying machine learning and AI algorithms to synthetic aperture radar and optical satellite imagery. The company’s infrastructure searches and downloads new imagery every day, and the computations of the data take place within less than 24 hours.
In contrast to traditional economic data, which are released in monthly and quarterly terms, SpaceKnow data is high-frequency and available daily. It is possible to observe the latest movements in the construction industry with just a 4-6 day lag, on average.
The construction materials data help you to estimate the performance of the construction sector and the business activity of the selected companies.
The foundation of delivering high-quality data is based on the success of defining each location to observe and extract the data. All locations are thoroughly researched and validated by an in-house team of annotators and data analysts.
See below how our Construction Materials index performs against the US Non-residential construction spending benchmark
Each individual location is precisely defined to avoid noise in the data, which may arise from traffic or changing vegetation due to seasonal reasons.
SpaceKnow uses radar imagery and its own unique algorithms, so the indices do not lose their significance in bad weather conditions such as rain or heavy clouds.
West Virginia and Kansas had the lowest cost of living across all U.S. states, with composite costs being half of those found in Hawaii. This was according to a composite index that compares prices for various goods and services on a state-by-state basis. In West Virginia, the cost of living index amounted to ****, well below the national benchmark of 100. Virginia, which had an index value of *****, was only slightly above that benchmark. Expensive places to live included Hawaii, Massachusetts, and California.
Housing costs in the U.S.
Housing is usually the highest expense in a household’s budget. In 2023, the average house sold for approximately ******* U.S. dollars, but house prices in the Northeast and West regions were significantly higher. Conversely, the South had some of the least expensive housing. In West Virginia, Mississippi, and Louisiana, the median price of the typical single-family home was less than ******* U.S. dollars. That makes living expenses in these states significantly lower than in states such as Hawaii and California, where housing is much pricier.
What other expenses affect the cost of living?
Utility costs such as electricity, natural gas, water, and internet also influence the cost of living. In Alaska, Hawaii, and Connecticut, the average monthly utility cost exceeded *** U.S. dollars. That was because of the significantly higher prices for electricity and natural gas in these states.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States US: Income Share Held by Highest 20% data was reported at 46.900 % in 2016. This records an increase from the previous number of 46.400 % for 2013. United States US: Income Share Held by Highest 20% data is updated yearly, averaging 46.000 % from Dec 1979 (Median) to 2016, with 11 observations. The data reached an all-time high of 46.900 % in 2016 and a record low of 41.200 % in 1979. United States US: Income Share Held by Highest 20% data remains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s United States – Table US.World Bank.WDI: Poverty. Percentage share of income or consumption is the share that accrues to subgroups of population indicated by deciles or quintiles. Percentage shares by quintile may not sum to 100 because of rounding.; ; World Bank, Development Research Group. Data are based on primary household survey data obtained from government statistical agencies and World Bank country departments. Data for high-income economies are from the Luxembourg Income Study database. For more information and methodology, please see PovcalNet (http://iresearch.worldbank.org/PovcalNet/index.htm).; ; The World Bank’s internationally comparable poverty monitoring database now draws on income or detailed consumption data from more than one thousand six hundred household surveys across 164 countries in six regions and 25 other high income countries (industrialized economies). While income distribution data are published for all countries with data available, poverty data are published for low- and middle-income countries and countries eligible to receive loans from the World Bank (such as Chile) and recently graduated countries (such as Estonia) only. See PovcalNet (http://iresearch.worldbank.org/PovcalNet/WhatIsNew.aspx) for definitions of geographical regions and industrialized countries.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Real gross domestic product per capita (A939RX0Q048SBEA) from Q1 1947 to Q2 2025 about per capita, real, GDP, and USA.
Comprehensive dataset of 138 Economic development agencies in Illinois, United States as of July, 2025. Includes verified contact information (email, phone), geocoded addresses, customer ratings, reviews, business categories, and operational details. Perfect for market research, lead generation, competitive analysis, and business intelligence. Download a complimentary sample to evaluate data quality and completeness.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides insights into the quality of life across different states in the United States for the year 2024. Quality of life, encompassing aspects like comfort, health, and happiness, is evaluated through various metrics including affordability, economy, education, and safety. Dive into this dataset to understand how different states fare in terms of overall quality of life and its individual components.
These descriptions provide an overview of what each column represents and the specific aspects of quality of life they assess for each U.S. state.
Since the second half of the 20th century, there has been an increase in scientific interest, research effort, and information gathered on the geologic sedimentary character of the continental margins of the United States. Data and information from thousands of sources have increased our scientific understanding of the geologic origins of the margin surface but rarely have those data been combined into a unified database. Initially, usSEABED was created by the U.S. Geological Survey (USGS), in cooperation with the Institute of Arctic and Alpine Research at the University of Colorado Boulder, for assessments of marine-based aggregates and for studies of sea-floor habitats by the U.S. Geological Survey (USGS). Since then, the USGS has continued to build up the database as a nationwide resource for many uses and applications. Previously published data derived from the usSEABED database have been released as three USGS data series publications containing data covering the U.S. Atlantic margin, the Gulf of Mexico and Caribbean regions, and the Pacific coast (Reid and others, 2005; Buczkowski and others, 2006; and Reid and others, 2006). This expanded USGS data release unifies the data from these three publications and includes an additional 54 data sources added to usSEABED since the original data series, provides revised output files, and expands the data coverage to include usSEABED data from all areas within the U.S. Exclusive Economic Zone (EEZ) as of the time of publication (including Alaska, Hawaii, and U.S. overseas territories). The usSEABED database was created using the most recent stable version of the dbSEABED software available to the USGS at the time of release (specifically, dbSEABED software [NMEv, version date 4/23/2010] using the dbSEABED thesaurus [db9 dict.rtf, version date 8/21/2009], the component set up file for U.S. waters [SET ABUN 2016.txt, version date 5/29/2016], and the facies set up file for U.S. 
waters [SET FACI.txt, version date 3/16/2012]). The USGS Open-File Report "Sediments and the sea floor of the continental shelves and coastal waters of the United States: About the usSEABED integrated sea-floor-characterization database, built with the dbSEABED processing system" (Buczkowski and others, 2020) accompanies this data release and provides information on the usSEABED database as well as the dbSEABED data processing system. Users are encouraged to read this companion report to learn more about how usSEABED is built, how the data should be interpreted, and how they are best used.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
1 Introduction
There are several works that apply Natural Language Processing to newspaper reports. Rameshbhai et al. [1] mined opinions from headlines using Stanford NLP and SVM, comparing several algorithms on a small and a large dataset. Rubin et al., in their paper [2], created a mechanism to differentiate fake news from real news by building a set of characteristics of news according to their types. The purpose was to contribute to the low-resource data available for training machine learning algorithms. Doumit et al. [3] implemented LDA, a topic modeling approach, to study bias present in online news media.
However, there is not much NLP research devoted to studying COVID-19. Most applications include classification of chest X-rays and CT scans to detect the presence of pneumonia in the lungs [4], a consequence of the virus. Other research areas include studying the genome sequence of the virus [5][6][7] and replicating its structure to fight the virus and find a vaccine. This research is crucial in battling the pandemic. The few NLP-based research publications include sentiment classification of online tweets by Samuel et al. [8] to understand the fear persisting in people due to the virus. Similar work has been done using an LSTM network to classify sentiments from online discussion forums by Jelodar et al. [9]. To the best of our knowledge, the NNK dataset is the first study of a comparatively large dataset of newspaper reports on COVID-19 that contributed to awareness of the virus.
2 Data-set Introduction
2.1 Data Collection
We accumulated 1000 online newspaper reports on COVID-19 from the United States of America (USA). The newspapers include The Washington Post (USA) and StarTribune (USA). We named this dataset “Covid-News-USA-NNK”. We also accumulated 50 online newspaper reports on the issue from Bangladesh and named that dataset “Covid-News-BD-NNK”. The newspapers include The Daily Star (BD) and Prothom Alo (BD). All of these newspapers are among the top providers and most widely read in their respective countries. The collection was done manually by 10 human data collectors of age group 23- with university degrees. This approach was preferred over automation to ensure that the news reports were highly relevant to the subject. The newspaper websites had dynamic content with advertisements in no particular order, so there was a high chance that online scrapers would collect inaccurate news reports. One of the challenges in collecting the data was the subscription requirement: each newspaper required $1 per subscription. Some criteria for collecting the news reports, provided as guidelines to the human data collectors, were as follows:
The headline must have one or more words directly or indirectly related to COVID-19.
The content of each news must have 5 or more keywords directly or indirectly related to COVID-19.
The genre of the news can be anything as long as it is relevant to the topic. Political, social, and economic genres are to be prioritized.
Avoid taking duplicate reports.
Maintain a time frame for the above mentioned newspapers.
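The collection itself was manual, but the first, second, and fourth guidelines above could also be expressed as a simple automated check. The sketch below is only an illustration under stated assumptions: the keyword set is hypothetical (the source does not list its keywords), and "5 or more keywords" is interpreted here as five or more keyword occurrences.

```python
import re

# Hypothetical keyword list; the guidelines do not specify the actual keywords.
KEYWORDS = {"covid", "coronavirus", "pandemic", "quarantine",
            "lockdown", "virus", "outbreak"}


def keyword_hits(text):
    """Count keyword occurrences in the text (case-insensitive)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return sum(1 for w in words if w in KEYWORDS)


def meets_criteria(headline, body, seen_headlines):
    """Apply the headline, body, and duplicate guidelines to one report."""
    if headline in seen_headlines:
        return False  # avoid duplicate reports
    return keyword_hits(headline) >= 1 and keyword_hits(body) >= 5


headline = "Coronavirus outbreak slows markets"
body = ("The virus spread quickly. The pandemic led to a lockdown "
        "and quarantine as the outbreak grew.")
accepted = meets_criteria(headline, body, seen_headlines=set())
duplicate = meets_criteria(headline, body, seen_headlines={headline})
```

A check like this could only pre-filter candidates; judging indirect relevance, as the guidelines require, still needs a human reader.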
To collect these data, we used a Google Form for the USA and BD. We had two human editors go through each entry to check for spam or troll entries.
2.2 Data Pre-processing and Statistics
Some pre-processing steps performed on the newspaper report dataset are as follows:
Remove hyperlinks.
Remove non-English alphanumeric characters.
Remove stop words.
Lemmatize text.
While more pre-processing could have been applied, we tried to keep the data as unchanged as possible, since changing sentence structures could result in the loss of valuable information. While this was done with the help of a script, we also assigned the same human collectors to cross-check for any presence of the above-mentioned criteria.
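A minimal sketch of the four pre-processing steps, using only the Python standard library, is shown below. It is not the authors' script: the stop-word list is a tiny illustrative sample, "non-English alphanumeric characters" is interpreted as anything other than ASCII letters, digits, and whitespace, and `toy_lemmatize` is a crude stand-in for a real lemmatizer, which a production pipeline would take from an NLP library.

```python
import re

# Tiny illustrative stop-word list (an assumption, not the authors' list).
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "are", "on", "at"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ASCII_ALNUM_RE = re.compile(r"[^A-Za-z0-9\s]")


def toy_lemmatize(word):
    """Very rough stand-in for a lemmatizer: strips a trailing plural 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def preprocess(text):
    text = URL_RE.sub(" ", text)              # 1. remove hyperlinks
    text = NON_ASCII_ALNUM_RE.sub(" ", text)  # 2. drop non-English alphanumerics
    tokens = [t for t in text.split()
              if t.lower() not in STOP_WORDS]  # 3. remove stop words
    return [toy_lemmatize(t) for t in tokens]  # 4. lemmatize (toy version)


sample = "Masks are required in the stores: see https://example.com/news for details."
tokens = preprocess(sample)
```

Hyperlinks are removed first on purpose: stripping punctuation beforehand would break URLs into fragments that the URL pattern no longer matches.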
The primary data statistics of the two datasets are shown in Tables 1 and 2.
Table 1: Covid-News-USA-NNK data statistics
No. of words per headline: 7 to 20
No. of words per body content: 150 to 2100
Table 2: Covid-News-BD-NNK data statistics
No. of words per headline: 10 to 20
No. of words per body content: 100 to 1500
2.3 Dataset Repository
We used GitHub as our primary data repository under the account name NKK^1. Here, we created two repositories, USA-NKK^2 and BD-NNK^3. The dataset is available in both CSV and JSON formats. We regularly update the CSV files and regenerate the JSON using a Python script. We also provide a Python script file for essential operations. We welcome all outside collaboration to enrich the dataset.
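Regenerating JSON from a CSV file can be done with the Python standard library alone. The sketch below is an assumption about how such a script might look, not the repository's actual script; the column names `headline` and `body` and the sample rows are hypothetical.

```python
import csv
import io
import json


def csv_to_json_records(csv_text):
    """Parse CSV text with a header row into a list of dictionaries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical CSV content with made-up column names and rows.
csv_text = (
    "headline,body\n"
    "First headline,First body text\n"
    "Second headline,Second body text\n"
)
records = csv_to_json_records(csv_text)
json_text = json.dumps(records, indent=2, ensure_ascii=False)
```

With real files, the same function could be fed the contents of a CSV file opened with `newline=""`, and `json_text` written back out next to it.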
3 Literature Review
Natural Language Processing (NLP) deals with text (also known as categorical) data in computer science, utilizing numerous diverse methods such as one-hot encoding and word embedding that transform text into numerical representations, which can then be fed to multiple machine learning and deep learning algorithms.
Some well-known applications of NLP include fraud detection on online media sites [10], authorship attribution in fallback authentication systems [11], intelligent conversational agents or chatbots [12], and machine translation as used by Google Translate [13]. While these are all downstream tasks, several exciting developments have been made in algorithms designed solely for Natural Language Processing tasks. The two most prominent ones are BERT [14], which is built on a bidirectional Transformer encoder and performs strongly on classification and word-prediction tasks, and the GPT-3 models released by OpenAI [15], which can generate almost human-like text. However, these are typically used as pre-trained models, since training them carries a huge computational cost.
Information extraction is a generalized concept of retrieving information from a dataset. Information extraction from an image could be retrieving vital feature spaces or targeted portions of an image; information extraction from speech could be retrieving information about names, places, etc. [16]. Information extraction from text could be identifying named entities and locations or other essential data.
Topic modeling is a sub-task of NLP and also a process of information extraction. It clusters words and phrases of the same context together into groups. Topic modeling is an unsupervised learning method that gives us a brief idea about a set of texts. One commonly used topic modeling method is Latent Dirichlet Allocation, or LDA [17].
Keyword extraction is a process of information extraction and a sub-task of NLP that extracts essential words and phrases from a text. TextRank [18] is an efficient keyword extraction technique that uses a graph to calculate the weight of each word and selects the words with the highest weights.
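The graph-based weighting can be illustrated with a simplified TextRank-style sketch: words become nodes, words that co-occur within a small window are linked, and a PageRank-style iteration scores each node. This is a minimal illustration under assumptions, not the implementation used in this work; it omits the part-of-speech filtering and phrase merging of the full method, and the token list is made up.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words by a PageRank-style score on a co-occurrence graph."""
    # Link each word to the distinct words that follow it within the window.
    neighbors = defaultdict(set)
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != w:
                neighbors[w].add(tokens[j])
                neighbors[tokens[j]].add(w)

    # Iterate: each node shares its score equally among its neighbors.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }

    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


# Hypothetical, already pre-processed tokens.
tokens = ["virus", "economy", "virus", "masks", "virus", "jobs", "virus", "election"]
keywords = textrank_keywords(tokens, window=1)
```

In this toy input, "virus" is linked to every other word, so it receives the highest score; the remaining words are tied and are returned in alphabetical order.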
Word clouds are a great visualization technique to understand the overall ’talk of the topic’. The clustered words give us a quick understanding of the content.
4 Our experiments and Result analysis
We used the wordcloud library^4 to create the word clouds. Figures 1 and 3 present the word clouds of the Covid-News-USA-NNK dataset by month from February to May. From Figures 1, 2, and 3, we can note the following:
In February, both newspapers discussed China and the source of the outbreak.
StarTribune emphasized Minnesota as the state of greatest concern. In April, this concern appeared to increase.
Both newspapers discussed the virus's impact on the economy, i.e., banks, elections, administrations, and markets.
Washington Post discussed global issues more than StarTribune.
In February, StarTribune mentioned the first precautionary measure, wearing masks, and the uncontrollable spread of the virus throughout the nation.
While both newspapers mentioned the outbreak in China in February, the spread in the United States is given more weight from March through May, reflecting the critical impact caused by the virus.
We used a script to extract all numbers related to certain keywords such as ‘Deaths’, ‘Infected’, ‘Died’, ‘Infections’, ‘Quarantined’, ‘Lock-down’, ‘Diagnosed’, etc. from the news reports and created a series of case counts for both newspapers. Figure 4 shows the statistics of this series. From this extraction, we can observe that April was the peak month for COVID cases, which rose gradually from February. Both newspapers clearly show that the rise in cases from February to March was slower than the rise from March to April. This is an important indicator of possible recklessness in preparations to battle the virus. However, the steep fall from April to May also shows the positive response against the attack.
We used Vader Sentiment Analysis to extract the sentiment of the headlines and the body. On average, the sentiment scores ranged from -0.5 to -0.9. The Vader sentiment scale ranges from -1 (highly negative) to 1 (highly positive). There were some cases where the sentiment scores of the headline and body contradicted each other, i.e., the sentiment of the headline was negative but the sentiment of the body was slightly positive. Overall, sentiment analysis can help us sort the most concerning (most negative) news from the positive ones, from which we can learn more about the indicators related to COVID-19 and the serious impact caused by it. Moreover, sentiment analysis can also provide information about how a state or country is reacting to the pandemic.
We used the PageRank algorithm to extract keywords from the headlines as well as the body content. PageRank efficiently highlights important, relevant keywords in the text. Some frequently occurring important keywords extracted from both datasets are: ‘China’, ‘Government’, ‘Masks’, ‘Economy’, ‘Crisis’, ‘Theft’, ‘Stock market’, ‘Jobs’, ‘Election’, ‘Missteps’, ‘Health’, ‘Response’. Keyword extraction acts as a filter allowing quick searches for indicators in case of locating situations of the economy,
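The text does not say how PageRank was applied to the articles. One common approach, assumed here, is to run PageRank over a word co-occurrence graph (in the style of TextRank). The sketch below illustrates that idea only. The function name `pagerank_keywords`, the stop-word list, the window size, and the damping factor are all hypothetical choices.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stop-word list (an assumption).
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "on", "for", "as", "by"}

def pagerank_keywords(text, window=2, damping=0.85, iterations=50, top_k=5):
    """Rank words by PageRank over an undirected co-occurrence graph."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

    # Link each word to the distinct words that follow it within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    if not neighbors:
        return []

    # Power iteration; every node has at least one neighbor, so no dangling nodes.
    n = len(neighbors)
    scores = {word: 1.0 / n for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping) / n
            + damping * sum(scores[v] / len(neighbors[v]) for v in neighbors[word])
            for word in neighbors
        }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]

# Made-up headline, for illustration only.
headline = "Government response to the crisis hits the economy and jobs in the economy"
for word, score in pagerank_keywords(headline):
    print(f"{word}: {score:.3f}")
```

Words that co-occur with many other distinct words receive higher scores, which is why terms such as those listed above tend to surface as keywords.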
https://creativecommons.org/publicdomain/zero/1.0/
When I was searching for COVID-19 datasets online, I soon realized that there were no comprehensive county-level datasets of the United States that included social, economic, and demographic factors in addition to the general case information already available on several sites. To quench my thirst for clean and relevant data, I proceeded to gather information from several sources to compile the dataset I was looking for.
I started by looking for a reliable dataset that has general information such as confirmed cases, deaths, etc. I found the Johns Hopkins COVID-19 dataset to be the best one for this purpose, as it is well organized and updated daily. Then, I set out looking for economic factors and population data for each county in the United States. I found a collection of such files compiled by the Economic Research Service branch of the USDA on their website. Finally, I had to find a dataset with racial and demographic information for each county, which I found on the US Census Bureau's website on a page dedicated to county population data by several characteristics. Now that I had all the data I was looking for, I proceeded to find which counties were common to all datasets. After several hours of cleaning each dataset and extracting relevant information, I combined all the information into one CSV file with 2959 counties of clean information - exactly what I was looking for.
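The step of keeping only the counties common to all sources and combining their columns can be sketched as follows. This is an illustrative sketch and not the author's actual cleaning script: the county keys, column names, and values are placeholders, and the function name `merge_on_common_counties` is hypothetical.

```python
def merge_on_common_counties(*datasets):
    """Combine datasets keyed by county, keeping only counties present in all.

    Each dataset maps a county key to a dict of column values.
    """
    common = set(datasets[0])
    for data in datasets[1:]:
        common &= set(data)

    merged = {}
    for county in sorted(common):
        row = {}
        for data in datasets:
            row.update(data[county])  # later sources overwrite duplicate columns
        merged[county] = row
    return merged

# Placeholder keys and values, for illustration only (not real figures).
cases = {"county_A": {"confirmed": 10}, "county_B": {"confirmed": 20}}
economy = {"county_A": {"median_income": 1}, "county_C": {"median_income": 2}}
demographics = {"county_A": {"population": 100}, "county_B": {"population": 200}}

merged = merge_on_common_counties(cases, economy, demographics)
print(merged)
```

In practice a stable county identifier (for example a county code shared by all sources) would serve as the key, since county names alone can repeat across states. The merged rows could then be written out with `csv.DictWriter`.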
I hope that the Kaggle community will use this dataset to answer important questions regarding COVID-19 in the United States and the role that external economic, social, and demographic factors play in the shaping of the pandemic. I know that there are several patterns to be discovered and I sincerely hope that this helps our community understand just a little more about the pandemic than we do right now.
Explore the World Competitiveness Ranking dataset for 2016, including key indicators such as GDP per capita, fixed telephone tariffs, and pension funding. Discover insights on social cohesion, scientific research, and digital transformation in various countries.
Social cohesion, The image abroad of your country encourages business development, Scientific articles published by origin of author, International Telecommunication Union, World Telecommunication/ICT Indicators database, Data reproduced with the kind permission of ITU, National sources, Fixed telephone tariffs, GDP (PPP) per capita, Overall, Exports of goods - growth, Pension funding is adequately addressed for the future, Companies are very good at using big data and analytics to support decision-making, Gross fixed capital formation - real growth, Economic Performance, Scientific research legislation, Percentage of GDP, Health infrastructure meets the needs of society, Estimates based on preliminary data for the most recent year., Singapore: including re-exports., Value, Laws relating to scientific research do encourage innovation, % of GDP, Gross Domestic Product (GDP), Health Infrastructure, Digital transformation in companies is generally well understood, Industrial disputes, EE, Female / male ratio, State ownership of enterprises, Total expenditure on R&D (%), Score, Colombia, Estimates for the most recent year., Percentage change, based on US$ values, Number of listed domestic companies, Tax evasion is not a threat to your economy, Scientific articles, Tax evasion, % change, Use of big data and analytics, National sources, Disposable Income, Equal opportunity, Listed domestic companies, Government budget surplus/deficit (%), Pension funding, US$ per capita at purchasing power parity, Estimates; US$ per capita at purchasing power parity, Image abroad or branding, Equal opportunity legislation in your economy encourages economic development, Number, Article counts are from a selection of journals, books, and conference proceedings in S&E from Scopus. Articles are classified by their year of publication and are assigned to a region/country/economy on the basis of the institutional address(es) listed in the article. 
Articles are credited on a fractional-count basis. The sum of the countries/economies may not add to the world total because of rounding. Some publications have incomplete address information for coauthored publications in the Scopus database. The unassigned category count is the sum of fractional counts for publications that cannot be assigned to a country or economy. Hong Kong: research output items by the higher education institutions funded by the University Grants Committee only., State ownership of enterprises is not a threat to business activities, Protectionism does not impair the conduct of your business, Digital transformation in companies, Total final energy consumption per capita, Social cohesion is high, Rank, MTOE per capita, Percentage change, based on constant prices, US$ billions, National sources, World Trade Organization Statistics database, Rank, Score, Value, World Rankings
Argentina, Australia, Austria, Belgium, Brazil, Bulgaria, Canada, Chile, China, Colombia, Croatia, Cyprus, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, India, Indonesia, Ireland, Israel, Italy, Japan, Jordan, Kazakhstan, Latvia, Lithuania, Luxembourg, Malaysia, Mexico, Mongolia, Netherlands, New Zealand, Norway, Oman, Peru, Philippines, Poland, Portugal, Qatar, Romania, Russia, Saudi Arabia, Singapore, Slovenia, South Africa, Spain, Sweden, Switzerland, Thailand, Turkey, Ukraine, United Kingdom, Venezuela
Follow data.kapsarc.org for timely data to advance energy economics research.
Comprehensive dataset of 77 Economic consultants in California, United States as of July, 2025. Includes verified contact information (email, phone), geocoded addresses, customer ratings, reviews, business categories, and operational details. Perfect for market research, lead generation, competitive analysis, and business intelligence. Download a complimentary sample to evaluate data quality and completeness.