The total amount of data created, captured, copied, and consumed globally is forecast to increase rapidly, reaching *** zettabytes in 2024. Over the next five years up to 2028, global data creation is projected to grow to more than *** zettabytes. In 2020, the amount of data created and replicated reached a new high. The growth was higher than previously expected, caused by the increased demand due to the COVID-19 pandemic, as more people worked and learned from home and used home entertainment options more often.

Storage capacity also growing

Only a small percentage of this newly created data is kept though, as just * percent of the data produced and consumed in 2020 was saved and retained into 2021. In line with the strong growth of the data volume, the installed base of storage capacity is forecast to increase, growing at a compound annual growth rate of **** percent over the forecast period from 2020 to 2025. In 2020, the installed base of storage capacity reached *** zettabytes.
On March 10, 2023, the Johns Hopkins Coronavirus Resource Center ceased collecting and reporting global COVID-19 data. For updated cases, deaths, and vaccine data, please visit the World Health Organization (WHO). For more information, visit the Johns Hopkins Coronavirus Resource Center.

COVID-19 Trends Methodology

Our goal is to analyze and present daily updates in the form of recent trends within countries, states, or counties during the COVID-19 global pandemic. The data we analyze is taken directly from the Johns Hopkins University Coronavirus COVID-19 Global Cases Dashboard, though we expect to be one day behind the dashboard’s live feeds to allow for quality assurance of the data.

DOI: https://doi.org/10.6084/m9.figshare.12552986

3/7/2022 - Adjusted the rate of active cases calculation in the U.S. to reflect the rates of serious and severe cases due to the nearly completely dominant Omicron variant.
6/24/2020 - Expanded Case Rates discussion to include fix on 6/23 for calculating active cases.
6/22/2020 - Added Executive Summary and Subsequent Outbreaks sections.
6/10/2020 - Revisions based on updated CDC reporting. This affects the estimate of active cases by revising the average duration of cases with hospital stays downward from 30 days to 25 days. The result shifted 76 U.S. counties from the Epidemic to the Spreading trend, with no change for national-level trends.
6/2/2020 - Methodology update: This sets the length of the tail of new cases to 6 to a maximum of 14 days, rather than 21 days as determined by the last 1/3 of cases. This was done to align the trends and their criteria with U.S. CDC guidance. The impact is that areas transition into the Controlled trend sooner because they no longer bear the burden of new cases from 15-21 days earlier.
6/1/2020 - Correction.
5/7/2020 - Added discussion of our assertion of an abundance of caution in assigning trends in rural counties.
Revisions added on 4/30/2020 are highlighted.
Revisions added on 4/23/2020 are highlighted.

Executive Summary

COVID-19 Trends is a methodology for characterizing the current trend for places during the COVID-19 global pandemic. Each day we assign one of five trends (Emergent, Spreading, Epidemic, Controlled, or End Stage) to geographic areas based on the number of new cases, the number of active cases, the total population, and an algorithm (described below) that contextualizes the most recent fourteen days with the overall COVID-19 case history. Currently we analyze the countries of the world and the U.S. counties. The purpose is to give policymakers, citizens, and analysts a fact-based, data-driven sense of the direction each place is currently going. When a place has its initial cases, it is assigned Emergent, and if that place controls the rate of new cases, it can move directly to Controlled, and even to End Stage in a short time. However, if the reporting or the measures to curtail spread are not adequate and significant numbers of new cases continue, it is assigned Spreading, and where the spread is clearly uncontrolled, Epidemic.

We analyze the data reported by Johns Hopkins University to produce the trends, and we report the rates of cases, spikes of new cases, the number of days since the last reported case, and the number of deaths. We also adjust the assignments based on population so that rural areas are not assigned trends based solely on case rates, which can be quite high relative to local populations.

Two key factors are not consistently known or available and should be taken into consideration with the assigned trend. First is the amount of resources (e.g., hospital beds, physicians) currently available in each area. Second is the number of recoveries, which are often not tested or reported. 
On the latter, we provide a probable number of active cases based on CDC guidance for the typical duration of mild to severe cases.

Reasons for undertaking this work in March of 2020:

The popular online maps and dashboards show counts of confirmed cases, deaths, and recoveries by country or administrative sub-region. Comparing the counts of one country to another can only provide a basis for comparison during the initial stages of the outbreak, when counts were low and the number of local outbreaks in each country was low. By late March 2020, countries with small populations were being left out of the mainstream news because it was not easy to recognize that they had high per capita rates of cases (Switzerland, Luxembourg, Iceland, etc.). Additionally, comparing countries that have had confirmed COVID-19 cases for many days to countries where the outbreak occurred recently is also a poor basis for comparison.

The graphs of confirmed cases and daily increases in cases were fit into a standard size rectangle, though the Y-axis for one country had a maximum value of 50, and for another country 100,000, which potentially misled people interpreting the slope of the curve. Such misleading circumstances affected comparisons of large-population countries to small-population countries, or of countries with low numbers of cases to China, which had a large count of cases in the early part of the outbreak. Interpreting and comparing these graphs is work each reader must do based on their own experience and ability. Thus, we felt it would be a service to attempt to automate the thought process experts would use when visually analyzing these graphs, particularly the most recent tail of the graph, and provide readers with a resulting synthesis to characterize the state of the pandemic in that country, state, or county.

The lack of reliable data for confirmed recoveries and therefore active cases. 
Merely subtracting deaths from total cases to arrive at this figure progressively loses accuracy after two weeks. The reason is that 81% of cases recover after experiencing mild symptoms in 10 to 14 days. Severe cases are 14% and last 15-30 days (based on an average of 11 days with symptoms when admitted to hospital, plus a 12-day median stay, plus one week to include a full range of severely affected people who recover). Critical cases are 5% and last 31-56 days. Sources: U.S. CDC. April 3, 2020 Interim Clinical Guidance for Management of Patients with Confirmed Coronavirus Disease (COVID-19). Accessed online. Initial older guidance was also obtained online. Additionally, many people who recover may not be tested, and many who are may not be tracked due to privacy laws. Thus, the formula used to compute an estimate of active cases is:

Active Cases = 100% of new cases in past 14 days + 19% from past 15-25 days + 5% from past 26-49 days - total deaths.

On 3/17/2022, the U.S. calculation was adjusted to:

Active Cases = 100% of new cases in past 14 days + 6% from past 15-25 days + 3% from past 26-49 days - total deaths.

Sources:
https://www.cdc.gov/mmwr/volumes/71/wr/mm7104e4.htm
https://covid.cdc.gov/covid-data-tracker/#variant-proportions

If a new variant arrives and appears to cause higher rates of serious cases, we will roll back this adjustment.

We’ve never been inside a pandemic with the ability to learn of new cases as they are confirmed anywhere in the world. After reviewing epidemiological and pandemic scientific literature, three needs arose. We need to specify which portions of the pandemic lifecycle this map covers. The World Health Organization (WHO) specifies six phases. The source data for this map begins just after the beginning of Phase 5 (human-to-human spread) and encompasses Phase 6 (pandemic phase). Phase 6 is only characterized in terms of pre- and post-peak. 
However, these two phases are after-the-fact analyses and cannot be ascertained during the event. Instead, we describe (below) a series of five trends for Phase 6 of the COVID-19 pandemic.

Choosing terms to describe the five trends was informed by the scientific literature, particularly the use of epidemic, which signifies uncontrolled spread. The five trends are: Emergent, Spreading, Epidemic, Controlled, and End Stage. Not every locale will experience all five, but all will experience at least three: Emergent, Controlled, and End Stage.

This layer presents the current trends for the COVID-19 pandemic by country (or appropriate level). There are five trends:

Emergent: Early stages of outbreak.
Spreading: Early stages and, depending on an administrative area’s capacity, this may represent a manageable rate of spread.
Epidemic: Uncontrolled spread.
Controlled: Very low levels of new cases.
End Stage: No new cases.

These trends can be applied at several levels of administration:

Local: e.g., City, District, or County (a.k.a. Admin level 2)
State: e.g., State or Province (a.k.a. Admin level 1)
National: Country (a.k.a. Admin level 0)

We recommend that at least 100,000 persons be represented by a unit; granted, this may not be possible, in which case the case rate per 100,000 becomes more important.

Key Concepts and Basis for Methodology:

10 Total Cases minimum threshold: Empirically, there must be enough cases to constitute an outbreak. Ideally, this would be 5.0 per 100,000, but not every area has a population of 100,000 or more. Ten or fewer cases are also relatively less difficult to track and trace to sources.

21 Days of Cases minimum threshold: Empirically based on COVID-19 and would need to be adjusted for any other event. 21 days is also the minimum threshold for analyzing the “tail” of the new cases curve, providing seven cases as the basis for a likely trend (note that 21 days in the tail is preferred). 
This is the minimum needed to encompass the onset and duration of a normal case (5-7 days plus 10-14 days). Specifically, the median incubation time is 5.1 days, and 97.5% of cases incubate within 11.2 days. This threshold is also driven by pressure to understand trends and could easily be adjusted to 28 days.
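The active-cases estimate and the per-100,000 case rate described in this methodology can be sketched in a few lines of Python. This is only an illustration of the stated formula, not the authors' code: the function names, the tuple layout of the weights, and the oldest-to-newest input ordering are assumptions made for the example.

```python
from typing import Sequence, Tuple

# (first day ago, last day ago, share of those cases assumed still active).
# The percentages come from the methodology text; the layout is an assumption.
Weights = Tuple[Tuple[int, int, float], ...]
ORIGINAL_WEIGHTS: Weights = ((1, 14, 1.00), (15, 25, 0.19), (26, 49, 0.05))
US_2022_WEIGHTS: Weights = ((1, 14, 1.00), (15, 25, 0.06), (26, 49, 0.03))


def estimate_active_cases(
    daily_new_cases: Sequence[float],
    total_deaths: float,
    weights: Weights = ORIGINAL_WEIGHTS,
) -> float:
    """Apply the stated formula to daily new cases ordered oldest to newest."""
    newest_first = list(reversed(daily_new_cases))
    estimate = 0.0
    for first_day, last_day, share in weights:
        # Days are 1-based counting back from the most recent day.
        window = newest_first[first_day - 1:last_day]
        estimate += share * sum(window)
    # The text subtracts total deaths; no clamping at zero is applied here.
    return estimate - total_deaths


def case_rate_per_100k(cases: float, population: float) -> float:
    """Cases per 100,000 residents."""
    return cases * 100_000 / population


if __name__ == "__main__":
    history = [10] * 49  # hypothetical: 10 new cases every day for 49 days
    print(estimate_active_cases(history, total_deaths=5))
    print(estimate_active_cases(history, total_deaths=5, weights=US_2022_WEIGHTS))
    print(case_rate_per_100k(50, 25_000))
```

Keeping the weights as data rather than hard-coded arithmetic makes it easy to switch between the original and the 2022 U.S. parameters, or to roll the adjustment back as the text anticipates.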
An August 2020 survey of fraud examiners worldwide revealed increases in different types of fraud risks after the start of the coronavirus pandemic. In May 2020, 29 percent of respondents reported a significant increase in identity theft risk. Additionally, 43 percent of respondents expected a significant increase in identity theft risk over the next twelve months.
Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.
Updates: April 9, 2020; April 20, 2020; April 29, 2020; September 1, 2020; February 12, 2021 (new_deaths column); February 16, 2021.
The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.
The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.
The AP is updating this dataset hourly at 45 minutes past the hour.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic:
Filter cases by state here
Rank states by their status as current hotspots. Calculates the 7-day rolling average of new cases per capita in each state: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=481e82a4-1b2f-41c2-9ea1-d91aa4b3b1ac
Find recent hotspots within your state by running a query to calculate the 7-day rolling average of new cases per capita in each county: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=b566f1db-3231-40fe-8099-311909b7b687&showTemplatePreview=true
Join county-level case data to an earlier dataset released by AP on local hospital capacity here. To find out more about the hospital capacity dataset, see the full details.
Pull the 100 counties with the highest per-capita confirmed cases here
Rank all the counties by the highest per-capita rate of new cases in the past 7 days here. Be aware that because this ranks per-capita caseloads, very small counties may rise to the very top, so take into account raw caseload figures as well.
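For readers working outside the linked queries, the hotspot measure described above (a 7-day rolling average of new cases per capita) can be approximated as follows. This is a minimal sketch and not the AP's query: the function name, the per-100,000 scaling, and the list-based input are assumptions for illustration.

```python
from typing import List, Sequence


def rolling_new_cases_per_100k(
    cumulative_cases: Sequence[float],
    population: float,
    window: int = 7,
) -> List[float]:
    """Rolling mean of daily new cases per 100,000 people.

    cumulative_cases is ordered oldest to newest. One value is returned for
    each day on which a full window of daily differences is available.
    """
    # Daily new cases are the day-over-day differences of the cumulative series.
    daily_new = [b - a for a, b in zip(cumulative_cases, cumulative_cases[1:])]
    rates = []
    for end in range(window, len(daily_new) + 1):
        mean_new = sum(daily_new[end - window:end]) / window
        rates.append(mean_new * 100_000 / population)
    return rates


if __name__ == "__main__":
    # Hypothetical county: 7 new cases a day for a week, then 14 on the last day.
    cumulative = [0, 7, 14, 21, 28, 35, 42, 49, 63]
    print(rolling_new_cases_per_100k(cumulative, population=100_000))
```

As the note on per-capita rankings warns, very small populations inflate this rate, so it is worth reporting the raw counts alongside it.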
The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.
@(https://datawrapper.dwcdn.net/nRyaf/15/)
<iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>
Johns Hopkins timeseries data
- Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
- Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here
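Because the timeseries stores cumulative snapshots, a downward revision by a source shows up as a negative daily difference. The sketch below, with a hypothetical function name and input, shows one way to derive daily counts and flag those days for review; how to treat them afterwards is left to the analyst.

```python
from typing import List, Sequence, Tuple


def daily_counts_with_revisions(
    cumulative: Sequence[int],
) -> Tuple[List[int], List[int]]:
    """Return (daily differences, positions where the cumulative count dropped).

    Positions index into the cumulative series, so position 2 means the third
    snapshot was lower than the second.
    """
    daily = []
    revised_positions = []
    for i in range(1, len(cumulative)):
        diff = cumulative[i] - cumulative[i - 1]
        if diff < 0:
            # A falling cumulative count signals a historical correction.
            revised_positions.append(i)
        daily.append(diff)
    return daily, revised_positions


if __name__ == "__main__":
    daily, revised = daily_counts_with_revisions([100, 120, 115, 130])
    print(daily)
    print(revised)
```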
This data should be credited to the Johns Hopkins University COVID-19 tracking project.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
COVID-19 data for World from 2020-01-22 to 2023-03-09, including tot_confirmed, tot_deaths, tot_recovered
Given the survey results, it appears that global data-powered enterprises gain greater operational efficiencies, such as an increase in sales of traditional and new products and services, compared to those enterprises which are not data-powered. For example, in 2020, data-powered enterprises saw an ** to ** percent increase in sales of traditional and new products and services, while enterprises that were not data-powered only saw a 12 percent increase for both.
To facilitate the use of data collected through the high-frequency phone surveys on COVID-19, the Living Standards Measurement Study (LSMS) team has created harmonized datafiles using two household surveys: 1) the country’s latest face-to-face survey, which has become the sample frame for the phone survey, and 2) the country’s high-frequency phone survey on COVID-19.
The LSMS team has extracted and harmonized variables from these surveys, based on the harmonized definitions and ensuring the same variable names. These variables include demography as well as housing, household consumption expenditure, food security, and agriculture. Inevitably, many of the original variables are collected using questions that are asked differently. The harmonized datafiles include the best available variables with harmonized definitions.
Two harmonized datafiles are prepared for each survey. The two datafiles are:
1. HH: This datafile contains household-level variables. The information includes basic household characteristics, housing, water and sanitation, asset ownership, consumption expenditure, consumption quintile, food security, and livestock ownership. It also contains information on agricultural activities such as crop cultivation, use of organic and inorganic fertilizer, hired labor, use of tractors, and crop sales.
2. IND: This datafile contains individual-level variables. It includes basic characteristics of individuals such as age, sex, marital status, disability status, literacy, education and work.
National coverage
The survey covered all de jure households excluding prisons, hospitals, military barracks, and school dormitories.
Sample survey data [ssd]
See “Nigeria - General Household Survey, Panel 2018-2019, Wave 4” and “Nigeria - COVID-19 National Longitudinal Phone Survey 2020” available in the Microdata Library for details.
Computer Assisted Personal Interview [capi]
Nigeria General Household Survey, Panel (GHS-Panel) 2018-2019 and Nigeria COVID-19 National Longitudinal Phone Survey (COVID-19 NLPS) 2020 data were harmonized following the harmonization guidelines (see “Harmonized Datafiles and Variables for High-Frequency Phone Surveys on COVID-19” for more details).
The high-frequency phone survey on COVID-19 has multiple rounds of data collection. When variables are extracted from multiple rounds of the survey, the originating round is noted with “_rX” in the variable name, where X represents the number of the round. For example, “_r3” indicates that the variable was extracted from Round 3 of the high-frequency phone survey. Round 0 refers to the country’s latest face-to-face survey, which has become the sample frame for the high-frequency phone surveys on COVID-19. Variables without “_rX” were extracted from Round 0.
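The round-naming convention lends itself to a small helper when sorting variables by origin. The sketch below assumes the “_rX” marker appears at the end of the variable name, which the description does not state explicitly; the example variable names are hypothetical.

```python
import re

# Assumption: the round marker is a suffix such as "_r3" at the end of the name.
_ROUND_SUFFIX = re.compile(r"_r(\d+)$")


def survey_round(variable_name: str) -> int:
    """Return the originating round; names without a marker map to Round 0."""
    match = _ROUND_SUFFIX.search(variable_name)
    return int(match.group(1)) if match else 0


if __name__ == "__main__":
    for name in ("hhsize", "food_insecure_r3", "employed_r12"):
        print(name, survey_round(name))
```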
In July 2024, global industrial production, excluding the United States, increased by 1.5 percent compared to the same time in the previous year, based on three-month moving averages. This compares to an increase of 0.2 percent in advanced economies (excluding the United States) for the same time period. Global industrial production collapsed after the outbreak of COVID-19 but increased steadily in the months after, peaking at 23 percent in June 2021. The industrial growth rate tracks output in the industrial sector.
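One plausible reading of "compared to the same time in the previous year, based on three-month moving averages" is the percent change between a three-month average and the average of the same three months a year earlier. The sketch below follows that reading; it is an assumption about the calculation, not the publisher's documented method, and the function name and sample index values are hypothetical.

```python
from typing import List, Sequence


def yoy_growth_of_3m_average(monthly_index: Sequence[float]) -> List[float]:
    """Year-over-year percent change of the three-month moving average.

    monthly_index is ordered oldest to newest. The first result corresponds to
    the 15th month, the earliest with a full three-month average one year back.
    """

    def three_month_average(end: int) -> float:
        # Average of the month at position `end` and the two months before it.
        return sum(monthly_index[end - 2:end + 1]) / 3

    growth = []
    for i in range(14, len(monthly_index)):
        current = three_month_average(i)
        year_ago = three_month_average(i - 12)
        growth.append((current / year_ago - 1) * 100)
    return growth


if __name__ == "__main__":
    index = [100.0] * 12 + [110.0] * 3  # hypothetical production index
    print(yoy_growth_of_3m_average(index))
```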
Objective: Daily COVID-19 data reported by the World Health Organization (WHO) may provide the basis for political ad hoc decisions including travel restrictions. Data reported by countries, however, is heterogeneous, and metrics to evaluate its quality are scarce. In this work, we analyzed COVID-19 case counts provided by WHO and developed tools to evaluate country-specific reporting behaviors.

Methods: In this retrospective cross-sectional study, COVID-19 data reported daily to WHO from 3rd January 2020 until 14th June 2021 were analyzed. We proposed the concepts of binary reporting rate and relative reporting behavior and performed descriptive analyses for all countries with these metrics. We developed a score to evaluate the consistency of incidence and binary reporting rates. Further, we performed spectral clustering of the binary reporting rate and relative reporting behavior to identify salient patterns in these metrics.

Results: Our final analysis included 222 countries and regions...

Data collection: COVID-19 data was downloaded from WHO. Using a public repository, we added the countries' full names to the WHO data set, merging the two data sets on each country's two-letter abbreviation. The provided COVID-19 data covers January 2020 until June 2021. We uploaded the final data set used for the analyses of this paper.

Data processing: We processed data using a Jupyter Notebook with a Python kernel and publicly available external libraries. This upload contains the required Jupyter Notebook (reporting_behavior.ipynb) with all analyses and some additional work, a README, and the conda environment yml (env.yml).

Any text editor, or a spreadsheet program such as Microsoft Excel and its free alternatives, can open the uploaded CSV file. Any web browser and some code editors (like the freely available Visual Studio Code) can show the uploaded Jupyter Notebook if the required Python environment is set up correctly.
The Global Monthly and Seasonal Urban and Land Backscatter Time Series, 1993-2020, is a multi-sensor, multi-decadal, data set of global microwave backscatter, for 1993 to 2020. It assembles data from C-band sensors onboard the European Remote Sensing Satellites (ERS-1 and ERS-2) covering 1993-2000, Advanced Scatterometer (ASCAT) onboard EUMETSAT satellites for 2007-2020, and the Ku-band sensor onboard the QuikSCAT satellite for 1999-2009, onto a common spatial grid (0.05 degree latitude /longitude resolution) and time step (both monthly and seasonal). Data are provided for all land (except high latitudes and islands), and for urban grid cells, based on a specific masking that removes grid cells with > 50% open water or < 20% built land. The all-land data allows users to choose and evaluate other urban masks. There is an offset between C-band and Ku-band backscatter from both vegetated and urban surfaces that is not spatially constant. There is a strong linear correlation (overall R-squared value = 0.69) between 2015 ASCAT urban backscatter and a continental-scale gridded product of building volume, across 8,450 urban grid cells (0.05 degree resolution) from large cities in Europe, China, and the United States.
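Users who want to reproduce or vary the urban mask on the all-land data could start from the rule stated above. The sketch below encodes only that rule; the function name, the fraction-based inputs, and the treatment of cells exactly at the 50% and 20% limits (kept) are assumptions for illustration.

```python
def is_urban_cell(
    open_water_fraction: float,
    built_land_fraction: float,
    max_water: float = 0.50,
    min_built: float = 0.20,
) -> bool:
    """Keep a grid cell unless it has > 50% open water or < 20% built land."""
    if open_water_fraction > max_water:
        return False
    if built_land_fraction < min_built:
        return False
    return True


if __name__ == "__main__":
    # Hypothetical cells as (open water fraction, built land fraction).
    cells = [(0.10, 0.50), (0.60, 0.50), (0.10, 0.10), (0.50, 0.20)]
    print([is_urban_cell(water, built) for water, built in cells])
```

Exposing the two thresholds as parameters makes it straightforward to evaluate alternative urban masks, as the description suggests.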
The Associated Press is sharing data from the COVID Impact Survey, which provides statistics about physical health, mental health, economic security and social dynamics related to the coronavirus pandemic in the United States.
Conducted by NORC at the University of Chicago for the Data Foundation, the probability-based survey provides estimates for the United States as a whole, as well as in 10 states (California, Colorado, Florida, Louisiana, Minnesota, Missouri, Montana, New York, Oregon and Texas) and eight metropolitan areas (Atlanta, Baltimore, Birmingham, Chicago, Cleveland, Columbus, Phoenix and Pittsburgh).
The survey is designed to allow for an ongoing gauge of public perception, health and economic status to see what is shifting during the pandemic. When multiple sets of data are available, it will allow for the tracking of how issues ranging from COVID-19 symptoms to economic status change over time.
The survey is focused on three core areas of research:
Instead, use our queries linked below or statistical software such as R or SPSS to weight the data.
If you'd like to create a table to see how people nationally or in your state or city feel about a topic in the survey, use the survey questionnaire and codebook to match a question (the variable label) to a variable name. For instance, "How often have you felt lonely in the past 7 days?" is variable "soc5c".
Nationally: Go to this query and enter soc5c as the variable. Hit the blue Run Query button in the upper right hand corner.
Local or State: To find figures for that response in a specific state, go to this query and type in a state name and soc5c as the variable, and then hit the blue Run Query button in the upper right hand corner.
The resulting sentence you could write out of these queries is: "People in some states are less likely to report loneliness than others. For example, 66% of Louisianans report feeling lonely on none of the last seven days, compared with 52% of Californians. Nationally, 60% of people said they hadn't felt lonely."
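For readers who weight the data in their own software rather than through the linked queries, the weighted share behind a sentence like the one above can be sketched as follows. This is a minimal illustration, not the AP or NORC query: the column name `weight`, the response labels, and the sample rows are hypothetical, and the actual weight variable should be taken from the codebook.

```python
from collections import defaultdict


def weighted_shares(rows, variable, weight_key="weight"):
    """Return {response: weighted share} for one survey variable.

    `rows` is a list of dicts, one per respondent. `weight_key` names the
    survey-weight column (hypothetical name; check the codebook).
    """
    totals = defaultdict(float)
    for row in rows:
        # Each respondent contributes their weight, not a count of 1.
        totals[row[variable]] += row[weight_key]
    grand_total = sum(totals.values())
    return {answer: total / grand_total for answer, total in totals.items()}


# Made-up respondents for illustration only (not survey data).
rows = [
    {"soc5c": "Not at all", "weight": 1.5},
    {"soc5c": "Not at all", "weight": 0.5},
    {"soc5c": "1-2 days", "weight": 1.0},
    {"soc5c": "3-4 days", "weight": 1.0},
]

shares = weighted_shares(rows, "soc5c")
for answer, share in shares.items():
    print(f"{answer}: {share:.0%}")
```

An unweighted count would treat every respondent equally, which is why the instructions point to the queries or to statistical software for weighting.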
The margin of error for the national and regional surveys is found in the attached methods statement. You will need the margin of error to determine if the comparisons are statistically significant. If the difference is:
The survey data will be provided under embargo in both comma-delimited and statistical formats.
Each set of survey data will be numbered and have the date the embargo lifts in front of it in the format of: 01_April_30_covid_impact_survey. The survey has been organized by the Data Foundation, a non-profit non-partisan think tank, and is sponsored by the Federal Reserve Bank of Minneapolis and the Packard Foundation. It is conducted by NORC at the University of Chicago, a non-partisan research organization. (NORC is not an abbreviation; it is part of the organization's formal name.)
Data for the national estimates are collected using the AmeriSpeak Panel, NORC’s probability-based panel designed to be representative of the U.S. household population. Interviews are conducted with adults age 18 and over representing the 50 states and the District of Columbia. Panel members are randomly drawn from AmeriSpeak with a target of achieving 2,000 interviews in each survey. Invited panel members may complete the survey online or by telephone with an NORC telephone interviewer.
Once all the study data have been made final, an iterative raking process is used to adjust for any survey nonresponse as well as any noncoverage or under and oversampling resulting from the study specific sample design. Raking variables include age, gender, census division, race/ethnicity, education, and county groupings based on county level counts of the number of COVID-19 deaths. Demographic weighting variables were obtained from the 2020 Current Population Survey. The count of COVID-19 deaths by county was obtained from USA Facts. The weighted data reflect the U.S. population of adults age 18 and over.
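The iterative raking described above can be illustrated with a small sketch. It shows the general technique (iterative proportional fitting over one variable at a time) on two made-up variables; it is not NORC's implementation, which uses more raking variables and real population targets. The records, category labels, and target shares below are hypothetical.

```python
def rake(records, targets, max_iter=200, tol=1e-10):
    """Adjust weights until weighted shares match the target shares.

    records: list of dicts holding one category per raking variable.
    targets: {variable: {category: target share}}, shares summing to 1.
    Returns one weight per record.
    """
    weights = [1.0] * len(records)
    for _ in range(max_iter):
        largest_adjustment = 0.0
        for variable, target_shares in targets.items():
            total = sum(weights)
            # Current weighted total in each category of this variable.
            current = {}
            for record, weight in zip(records, weights):
                category = record[variable]
                current[category] = current.get(category, 0.0) + weight
            # Scale every record so this variable's margins hit the targets.
            for i, record in enumerate(records):
                category = record[variable]
                factor = target_shares[category] * total / current[category]
                weights[i] *= factor
                largest_adjustment = max(largest_adjustment, abs(factor - 1.0))
        if largest_adjustment < tol:
            break  # no variable needed a meaningful adjustment
    return weights


def weighted_share(records, weights, variable, category):
    """Weighted share of records falling in one category."""
    matched = sum(w for r, w in zip(records, weights) if r[variable] == category)
    return matched / sum(weights)


# Hypothetical respondents and population targets, for illustration only.
records = [
    {"age": "18-44", "gender": "F"},
    {"age": "18-44", "gender": "M"},
    {"age": "45+", "gender": "F"},
    {"age": "45+", "gender": "F"},
    {"age": "45+", "gender": "M"},
]
targets = {
    "age": {"18-44": 0.5, "45+": 0.5},
    "gender": {"F": 0.5, "M": 0.5},
}

weights = rake(records, targets)
```

Adjusting one variable disturbs the margins of the others, which is why the procedure cycles through the variables repeatedly until the adjustments become negligible.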
Data for the regional estimates are collected using a multi-mode address-based (ABS) approach that allows residents of each area to complete the interview via web or with an NORC telephone interviewer. All sampled households are mailed a postcard inviting them to complete the survey either online using a unique PIN or via telephone by calling a toll-free number. Interviews are conducted with adults age 18 and over with a target of achieving 400 interviews in each region in each survey. Additional details on the survey methodology and the survey questionnaire are attached below or can be found at https://www.covid-impact.org.
Results should be credited to the COVID Impact Survey, conducted by NORC at the University of Chicago for the Data Foundation.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
COVID-19 data for World from 2020-01-01 to 2023-11-06, including cur_excess_mortality, cur_excess_mortality_cumulative_per_million, cur_hosp_patients, cur_icu_patients, cur_idx_positive_rate, cur_reproduction_rate, cur_weekly_hosp_admissions, total_boosters, total_cases, total_cases_per_million, total_deaths, total_deaths_per_million, total_gdp_per_capita, total_people_fully_vaccinated, total_people_fully_vaccinated_per_hundred, total_people_vaccinated, total_people_vaccinated_per_hundred, total_population, total_tests, total_tests_per_thousand, total_vaccinations, total_vaccinations_per_hundred
Files:
".pkl" Cache file backup
Dataframe exported "PyCoa-DF.csv"
Original database backup
The Times Higher Education World University Rankings 2020 includes almost 1,400 universities across 92 countries, standing as the largest and most diverse university rankings ever to date. The table is based on 13 carefully calibrated performance indicators that measure an institution’s performance across teaching, research, knowledge transfer and international outlook.
Background: It is not known how the number of deaths due to COVID-19 compares to the number of deaths due to "unsafe water, sanitation, and handwashing" during the COVID-19 global health emergency. Methods: A dataset of deaths due to COVID-19 was downloaded from the World Health Organization. A dataset summarizing deaths due to unsafe water, sanitation, and handwashing was obtained from the Institute for Health Metrics and Evaluation (IHME). Results indicate that COVID-19 deaths in Africa and South East Asia regions exceeded those due to unsafe water, sanitation, and hygiene.
Two raw datasets were obtained and processed. To construct the dataset "Estimates of mortality due to inadequate water, sanitation, and hygiene (WASH) during the COVID-19 Global Health Emergency", raw data were downloaded from the Institute for Health Metrics and Evaluation (IHME). The raw dataset was reduced by eliminating variables. The original IHME dataset was for the year 2019. IHME does not yet have data for 2020 or beyond. The final data contains calculations that project the estimated number of WASH-related deaths, by region, into 2020-2023. That was done by multiplying the 2019 estimated deaths for each region by the ratio of the duration of the pandemic period to the number of days in 2019, assuming a constant rate. To construct the dataset "Estimates of COVID-19 mortality, by region January 3 2020-May 5, 2023, with assumptions about undercounting", raw data were downloaded from the public WHO Coronavirus (COVID-19) Dashboard. The raw dataset contains COVID-19 mortality data by coun...
# Data from: Priority setting for global WASH challenges in the age of wastewater-based epidemiological surveillance
A brief summary of dataset contents
Dataset #1: Estimates of mortality due to inadequate water, sanitation, and hygiene (WASH) during the COVID-19 Global Health Emergency
VARIABLES
Region = The name for country groupings used by WHO
Age category = All observations have either the value 1 (<5 years) or 5 (all ages)
Deaths 2019 due to unsafe WASH point estimate = The point estimate for the number of deaths due to unsafe WASH in 2019, by WHO region, by age category
Deaths 2019 due to unsafe WASH upper estimate = The upper bound estimate for the number of deaths due to unsafe WASH in 2019, by WHO region, by age category
Deaths 2019 due to unsafe WASH lower estimate = The lower bound estimate for the number of deaths due to unsafe WASH in 2019, by WHO region, by age category
Estimated number Jan 3 2020-May 5 2023 = The estimated number of deaths ...
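The constant-rate projection described for Dataset #1 (2019 deaths multiplied by the length of the pandemic period divided by the number of days in 2019) can be sketched as follows. This is an illustration of the stated arithmetic only: the function name is hypothetical, the input value is a placeholder rather than an IHME estimate, and whether the dataset counts the end date as part of the period is not stated in the source, so the sketch uses the plain difference between the two dates.

```python
from datetime import date

DAYS_IN_2019 = 365  # 2019 was not a leap year


def project_wash_deaths(deaths_2019, start=date(2020, 1, 3), end=date(2023, 5, 5)):
    """Scale one region's 2019 death estimate to the pandemic period.

    Assumes deaths occur at a constant daily rate equal to the 2019 rate.
    The period length is the difference between `end` and `start`; counting
    the end date as well would add one day (an assumption either way).
    """
    period_days = (end - start).days
    return deaths_2019 * period_days / DAYS_IN_2019


# Placeholder value, not taken from the dataset.
example_deaths_2019 = 1000.0
projected = project_wash_deaths(example_deaths_2019)
print(f"Projected deaths over the period: {projected:.0f}")
```

The same function can be applied to the point, upper, and lower 2019 estimates to obtain the corresponding projected values.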
This layer is a time series of the annual ESA CCI (Climate Change Initiative) land cover maps of the world. ESA has produced land cover maps for the years 1992-2020. These are available at the European Space Agency Climate Change Initiative website.
Time Extent: 1992-2020
Cell Size: 300 meter
Source Type: Thematic
Pixel Type: 8 Bit Unsigned
Data Projection: GCS WGS84
Mosaic Projection: Web Mercator Auxiliary Sphere
Extent: Global
Source: ESA Climate Change Initiative
Update Cycle: Annual until 2020, no updates thereafter
What can you do with this layer?
This layer may be added to ArcGIS Online maps and applications and shown in a time series to watch a "time lapse" view of land cover change since 1992 for any part of the world. The same behavior exists when the layer is added to ArcGIS Pro. In addition to displaying all layers in a series, this layer may be queried so that only one year is displayed in a map. This layer can be used in analysis. For example, the layer may be added to ArcGIS Pro with a query set to display just one year. Then, an area count of land cover types may be produced for a feature dataset using the zonal statistics tool. Statistics may be compared with the statistics from other years to show a trend. To sum up area by land cover using this service, or any other analysis, be sure to use an equal area projection, such as Albers or Equal Earth.
Different Classifications Available to Map
Five processing templates are included in this layer.
The processing templates may be used to display a smaller set of land cover classes.
Cartographic Renderer (Default Template): Displays all ESA CCI land cover classes.*
Forested Lands Template: The forested lands template shows only forested lands (classes 50-90).
Urban Lands Template: The urban lands template shows only urban areas (class 190).
Converted Lands Template: The converted lands template shows only urban lands and lands converted to agriculture (classes 10-40 and 190).
Simplified Renderer: Displays the map in ten simple classes which match the ten simplified classes used in 2050 Land Cover projections from Clark University.
Any of these variables can be displayed or analyzed by selecting their processing template. In ArcGIS Online, select the Image Display Options on the layer. Then pull down the list of variables from the Renderer options. Click Apply and Close. In ArcGIS Pro, go into the Layer Properties. Select Processing Templates from the left hand menu. From the Processing Template pull down menu, select the variable to display.
Using Time
By default, the map will display as a time series animation, one year per frame. A time slider will appear when you add this layer to your map. To see the most current data, move the time slider until you see the most current year. In addition to displaying the past quarter century of land cover maps as an animation, this time series can also display just one year of data by use of a definition query. For a step by step example using ArcGIS Pro on how to display just one year of this layer, as well as to compare one year to another, see the blog called Calculating Impervious Surface Change.
Hierarchical Classification
Land cover types are defined using the land cover classification (LCCS) developed by the United Nations, FAO. It is designed to be as compatible as possible with other products, namely GLCC2000, GlobCover 2005 and 2009. This is a hierarchical classification system.
For example, class 60 means "closed to open" canopy broadleaved deciduous tree cover. But in some places a more specific type of broadleaved deciduous tree cover may be available. In that case, a more specific code 61 or 62 may be used which specifies "closed" (61) or "open" (62) cover.
Land Cover Processing
To provide consistency over time, these maps are produced from baseline land cover maps, and are revised for changes each year depending on the best available satellite data from each period in time. These revisions were made from AVHRR 1km time series from 1992 to 1999, SPOT-VGT time series between 1999 and 2013, and PROBA-V data for years 2013, 2014 and 2015. When MERIS FR or PROBA-V time series are available, changes detected at 1 km are re-mapped at 300 m. The last step consists in back- and up-dating the 10-year baseline LC map to produce the 24 annual LC maps from 1992 to 2015.
Source data
The datasets behind this layer were extracted from NetCDF files and TIFF files produced by ESA. Years 1992-2015 were acquired from ESA CCI LC version 2.0.7 in TIFF format, and years 2016-2018 were acquired from version 2.1.1 in NetCDF format. These are downloadable from ESA with an account, after agreeing to their terms of use. https://maps.elie.ucl.ac.be/CCI/viewer/download.php
Citation
ESA. Land Cover CCI Product User Guide Version 2. Tech. Rep. (2017). Available at: maps.elie.ucl.ac.be/CCI/viewer/download/ESACCI-LC-Ph2-PUGv2_2.0.pdf
More technical documentation on the source datasets is available here: https://cds.climate.copernicus.eu/cdsapp#!/dataset/satellite-land-cover?tab=doc
*Index of all classes in this layer:
10 Cropland, rainfed
11 Herbaceous cover
12 Tree or shrub cover
20 Cropland, irrigated or post-flooding
30 Mosaic cropland (>50%) / natural vegetation (tree, shrub, herbaceous cover) (<50%)
40 Mosaic natural vegetation (tree, shrub, herbaceous cover) (>50%) / cropland (<50%)
50 Tree cover, broadleaved, evergreen, closed to open (>15%)
60 Tree cover, broadleaved, deciduous, closed to open (>15%)
61 Tree cover, broadleaved, deciduous, closed (>40%)
62 Tree cover, broadleaved, deciduous, open (15-40%)
70 Tree cover, needleleaved, evergreen, closed to open (>15%)
71 Tree cover, needleleaved, evergreen, closed (>40%)
72 Tree cover, needleleaved, evergreen, open (15-40%)
80 Tree cover, needleleaved, deciduous, closed to open (>15%)
81 Tree cover, needleleaved, deciduous, closed (>40%)
82 Tree cover, needleleaved, deciduous, open (15-40%)
90 Tree cover, mixed leaf type (broadleaved and needleleaved)
100 Mosaic tree and shrub (>50%) / herbaceous cover (<50%)
110 Mosaic herbaceous cover (>50%) / tree and shrub (<50%)
120 Shrubland
121 Shrubland evergreen
122 Shrubland deciduous
130 Grassland
140 Lichens and mosses
150 Sparse vegetation (tree, shrub, herbaceous cover) (<15%)
151 Sparse tree (<15%)
152 Sparse shrub (<15%)
153 Sparse herbaceous cover (<15%)
160 Tree cover, flooded, fresh or brackish water
170 Tree cover, flooded, saline water
180 Shrub or herbaceous cover, flooded, fresh/saline/brackish water
190 Urban areas
200 Bare areas
201 Consolidated bare areas
202 Unconsolidated bare areas
210 Water bodies
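The hierarchical codes and the template class ranges listed for this layer can be worked with outside ArcGIS using simple integer arithmetic. The sketch below is an illustration only and is not part of the layer or of any ESA or Esri tooling: the function names are hypothetical, and it assumes that every sub-class code shares its leading digits with its parent (61 and 62 under 60, 201 and 202 under 200), as the class index suggests.

```python
def parent_class(code):
    """Return the general class for a land cover code, e.g. 61 -> 60."""
    # Sub-classes differ from their parent only in the last digit.
    return (code // 10) * 10


def template_groups(code):
    """List the thematic templates whose class range contains this code.

    Ranges follow the template descriptions: forested 50-90, urban 190,
    converted lands 10-40 plus 190.
    """
    general = parent_class(code)
    groups = []
    if 50 <= general <= 90:
        groups.append("forested")
    if general == 190:
        groups.append("urban")
    if 10 <= general <= 40 or general == 190:
        groups.append("converted")
    return groups


for code in (12, 61, 190, 210):
    print(code, parent_class(code), template_groups(code))
```

Collapsing codes to their parent class before counting pixels gives totals that stay comparable between places where only the general class is mapped and places where the more specific codes are used.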
The Country Opinion Survey in Seychelles assists the World Bank Group (WBG) in gaining a better understanding of how stakeholders in Seychelles perceive the WBG. It provides the WBG with systematic feedback from national and local governments, multilateral/bilateral agencies, media, academia, the private sector, and civil society in Seychelles on 1) their views regarding the general environment in Seychelles; 2) their overall attitudes toward the WBG in Seychelles; 3) overall impressions of the WBG’s effectiveness and results, knowledge work and activities, and communication and information sharing in Seychelles; and 4) their perceptions of the WBG’s future role in Seychelles.
Stakeholders of the World Bank Group in the Seychelles
Opinion leaders from national and local governments, multilateral/bilateral agencies, media, academia, the private sector, and civil society.
Sample survey data [ssd]
From March to July 2020, 306 stakeholders of the WBG in Seychelles were invited to provide their opinions on the WBG’s work in the country by participating in a Country Opinion Survey. Participants were drawn from the Office of the President, Prime Minister; office of a minister; office of a parliamentarian; ministries/ministerial departments/implementation agencies; Project Management Units (PMUs) overseeing implementation of WBG projects; consultants/ contractors working on WBG-supported projects/programs; local governments; independent government institutions; the judicial system; state-owned enterprises; bilateral and multilateral agencies; private sector organizations; the financial sector/private banks; private foundations; NGOs and community based organizations; trade unions; faith-based groups; youth groups; academia/research institutes/think tanks; the media; and other organizations.
Internet [int]
The questionnaire is structured and is available in English.
The response rate for the Seychelles WBCS 2020 was 35%.
This product provides the global annual urban extents (1992-2020) using the harmonized nighttime light observations, including: (1) Global time-series urban sequence map from 1992 to 2020 (i.e., "sQ_urbanMap_global_stackTS.tif"); (2) Global annual urban extent maps from 1992 to 2020 (e.g., "annual_urbanMap_global_1992.tif", "annual_urbanMap_global_1993.tif"). Questions about this data can be directed to Prof. Yuyu Zhou (zhouyuyu@gmail.com).
The World Values Survey (WVS) is an international research program devoted to the scientific and academic study of social, political, economic, religious and cultural values of people in the world. The project’s goal is to assess what impact the stability or change of values over time has on the social, political and economic development of countries and societies. The project grew out of the European Values Study and was started in 1981 by its Founder and first President (1981-2013) Professor Ronald Inglehart from the University of Michigan (USA) and his team, and since then has been operating in more than 120 world societies. The main research instrument of the project is a representative comparative social survey which is conducted globally every 5 years. Its extensive geographical and thematic scope and the free availability of survey data and project findings to the broad public have turned the WVS into one of the most authoritative and widely-used cross-national surveys in the social sciences. At the moment, WVS is the largest non-commercial cross-national empirical time-series investigation of human beliefs and values ever executed.
The project’s overall aim is to analyze people’s values, beliefs and norms in a comparative cross-national and over-time perspective. To reach this aim, the project covers a broad scope of topics from the fields of Sociology, Political Science, International Relations, Economics, Public Health, Demography, Anthropology, Social Psychology, etc. In addition, WVS is the only academic study which covers the whole scope of global variations, from very poor to very rich societies, in all of the world’s main cultural zones.
The WVS combines two institutional components. On the one hand, WVS is a scientific program and social research infrastructure that explores people’s values and beliefs. At the same time, WVS comprises an international network of social scientists and researchers from 120 countries and societies. All national teams and individual researchers involved in the implementation of the WVS constitute the community of Principal Investigators (PIs). All PIs are members of the WVS.
The WVS seeks to help scientists and policy makers understand changes in the beliefs, values and motivations of people throughout the world. Thousands of political scientists, sociologists, social psychologists, anthropologists and economists have used these data to analyze such topics as economic development, democratization, religion, gender equality, social capital, and subjective well-being. The WVS findings have proved to be valuable for policy makers seeking to build civil society and stable political institutions in developing countries. The WVS data is also frequently used by governments around the world, scholars, students, journalists and international organizations such as the World Bank, World Health Organization (WHO), United Nations Development Program (UNDP) and the United Nations Headquarters in New York (USA). The WVS data has been used in thousands of scholarly publications and the findings have been reported in leading media such as Time, Newsweek, The New York Times, The Economist, the World Development Report, the World Happiness Report and the UN Human Development Report.
The World Values Survey Association is governed by the Executive Committee, the Scientific Advisory Committee, and the General Assembly, under the terms of the Constitution.
Strategic goals for the 7th wave included:
Expansion of territorial coverage from 60 countries in WVS-6 to 80 in WVS-7; Deepening collaboration within the international development community; Deepening collaboration within NGOs, academic institutions and research foundations; Updating the WVS-7 questionnaire with new topics & items covering new social phenomena and emerging processes of value change; Expanding the 7th wave WVS with data useful for monitoring the SDGs; Expanding capacity and resources for survey fieldwork in developing countries. The 7th wave continued monitoring cultural values, attitudes and beliefs towards gender, family, and religion; attitudes and experience of poverty; education, health, and security; social tolerance and trust; attitudes towards multilateral institutions; cultural differences and similarities between regions and societies. In addition, the WVS-7 questionnaire has been elaborated with the inclusion of such new topics as the issues of justice, moral principles, corruption, accountability and risk, migration, national security and global governance.
For more information on the history of the WVSA, visit https://www.worldvaluessurvey.org/WVSContents.jsp ›Who we are › History of the WVSA.
Iran.
The WVS has just completed wave 7 data that comprises 64 surveys conducted in 2017-2022. With 64 countries and societies around the world and more than 80,000 respondents, this is the latest resource made available for the research community.
The WVS-7 survey was launched in January 2017 with Bolivia becoming the first country to conduct WVS-7. In the course of 2017 and 2018, WVS-7 was conducted in the USA, Mexico, Brazil, Argentina, Chile, Ecuador, Peru, Andorra, Greece, Serbia, Romania, Turkey, Russia, Germany, Thailand, Australia, Malaysia, Indonesia, China, Pakistan, Egypt, Jordan, Nigeria, Iraq and over a dozen other countries. Geographic coverage has also been expanded to several new countries included in the WVS for the first time, such as Bolivia, Greece, Macao SAR, Maldives, Myanmar, Nicaragua, and Tajikistan.
Household, Individual
The preferred sample type for the World Values Survey is a full probability sample of the population aged 18 years and older. A detailed description of the sampling methodology is provided in the country specific sample design documentation available for download from WVS.
A detailed description of the sampling methodology is provided in the Iran 2020 sample design documentation available for download from WVS and also from the Downloads section of the metadata.
Paper Assisted Personal Interview [papi]
The survey was fielded in the following language(s): Persian. The questionnaire is available for download from the WVS website.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Serbia: Percent of world tourist arrivals: The latest value from 2020 is 0.07 percent, a decline from 0.08 percent in 2019. In comparison, the world average is 0.81 percent, based on data from 123 countries. Historically, the average for Serbia from 2002 to 2020 is 0.05 percent. The minimum value, 0.03 percent, was reached in 2002 while the maximum of 0.08 percent was recorded in 2018.