https://datacatalog.worldbank.org/public-licenses?fragment=cc
Overview of policy measures taken in jurisdictions and by type of measure in support of the financial sector to address the impact of the COVID-19 pandemic. This dataset is updated regularly and remains work in progress. As such, it may contain errors and omissions.
Compiled by the Finance, Competitiveness & Innovation Global Practice. For inquiries, please reach out to Erik Feyen (efeijen@worldbank.org) and Davide Mare (dmare@worldbank.org).
Sources: National authorities; Yale, IIF, IMF, OECD, IADB.
In September 2024, the global PMI stood at 47.5 for new export orders and 48.8 for manufacturing. The manufacturing PMI was at its lowest point in August 2020. It decreased over the last months of 2022, after the effects of the Russia-Ukraine war and rising inflation hit the world economy, and has remained around 50 since.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The SPIN covid19 RMRIO dataset is a yearly time series of MRIO tables covering the years 2016 to 2026. The dataset covers 163 sectors in 155 countries.
This repository includes data for years from 2020 to 2026 (covid scenario).
Code, method material and data for years 2016-2019 are stored in the following repository: 10.5281/zenodo.5713811
Data for the counterfactual scenario are stored in the following repository: 10.5281/zenodo.5713839
Tables are generated using the SPIN method, based on the RMRIO tables for the year 2015, GDP, imports and exports data from the International Financial Statistics (IFS) and the World Economic Outlooks (WEO) of October 2019 and April 2021.
The covid scenario is in line with April 2021 WEO's data and includes the macroeconomic effects of Covid 19.
All tables are labelled in 2015 US$ and valued in basic prices.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To create the dataset, the countries among the top 10 by COVID-19 incidence worldwide as of October 22, 2020 (on the eve of the second wave of the pandemic) that are represented in the Global 500 ranking for 2020 were selected: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 rating for 2020 and 2019 were selected separately. Arithmetic averages and the change (increase) were calculated for indicators such as the profitability of enterprises, their ranking position (competitiveness), asset value and number of employees. The arithmetic mean values of these indicators across all countries of the sample were then found, characterizing the situation in international entrepreneurship as a whole in the context of the COVID-19 crisis in 2020, on the eve of the second wave of the pandemic. The data is collected in a general Microsoft Excel table. The dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. It is flexible and can be supplemented with data from other countries and with newer statistics on the COVID-19 pandemic. Because the data in the dataset are not fixed numbers but formulas, adding or changing values in the original table at the beginning of the dataset automatically recalculates most of the subsequent tables and updates the graphs. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating scientific research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data, but also charts that visualize the data. The dataset contains not only actual, but also forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020.
The forecasts are presented in the form of a normal distribution of predicted values and the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship, substituting various predicted morbidity and mortality rates in risk assessment tables and obtaining automatically calculated consequences (changes) on the characteristics of international entrepreneurship. It is also possible to substitute the actual values identified in the process and following the results of the second wave of the pandemic to check the reliability of pre-made forecasts and conduct a plan-fact analysis. The dataset contains not only the numerical values of the initial and predicted values of the set of studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks of a pandemic and COVID-19 crisis for international entrepreneurship.
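The scenario analysis described above can be illustrated with a short sketch. It assumes a forecast expressed as a normal distribution; the mean and standard deviation below are hypothetical placeholders, not values taken from the dataset, and the helper names are illustrative.

```python
from statistics import NormalDist

# Hypothetical forecast of new cases: mean and standard deviation are
# illustrative assumptions, not figures from the dataset.
forecast = NormalDist(mu=50_000, sigma=8_000)


def probability_between(dist, low, high):
    """Probability that the forecast value falls between low and high."""
    return dist.cdf(high) - dist.cdf(low)


def scenario_value(dist, percentile):
    """Forecast value at a given percentile (0 < percentile < 1)."""
    return dist.inv_cdf(percentile)


# Three scenarios that could be substituted into a risk assessment table.
optimistic = scenario_value(forecast, 0.10)
baseline = scenario_value(forecast, 0.50)
pessimistic = scenario_value(forecast, 0.90)

# Probability that the outcome lands within one standard deviation of the mean.
p_within_one_sigma = probability_between(forecast, 42_000, 58_000)
```

Replacing a scenario value with the figure actually observed after the second wave gives the plan-fact comparison mentioned in the description.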
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains two APIs with daily COVID-19 Trends and Impact Survey data. The University of Maryland API is for accessing global survey data and CMU API is for accessing US survey data.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This original data set was submitted as a requirement for publication in the journal Humanities and Social Sciences Communications.
The World Bank and UNHCR, in collaboration with the Kenya National Bureau of Statistics and the University of California, Berkeley, are conducting the Kenya COVID-19 Rapid Response Phone Survey (RRPS) to track the socioeconomic impacts of the COVID-19 pandemic, the recovery from it, and other shocks, in order to provide timely data to inform a targeted response. This dataset contains information from eight waves of the COVID-19 RRPS, which is part of a panel survey that targets refugee households and started in May 2020. The same households were interviewed every two months for five survey rounds in the first year of data collection, and every four months thereafter, with interviews conducted using Computer Assisted Telephone Interviewing (CATI) techniques. The sample aims to be representative of the refugee and stateless population in Kenya. It comprises five strata: Kakuma refugee camp, Kalobeyei settlement, Dadaab refugee camp, urban refugees, and Shona stateless. Waves 1-7 of this survey include information on household background, service access, employment, food security, income loss, transfers, health, and COVID-19 knowledge. Wave 8 focused on how households were exposed to shocks, in particular adverse weather shocks and the increase in the price of food and fuel, but also included parts of the previous modules on household background, service access, employment, food security, income loss, and subjective wellbeing. The data is uploaded in three files. The first is the hh file, which contains household-level information; the 'hhid' variable uniquely identifies each household. The second is the adult-level file, which contains data at the level of adult household members. Each adult in a household is uniquely identified by the 'adult_id'. The third is the child-level file, available only for waves 3-7, which contains information for every child in the household. Each child in a household is uniquely identified by the 'child_id'.
The duration of data collection and sample size for each completed wave was:
Wave 1: May 14 to July 7, 2020; 1,328 refugee households
Wave 2: July 16 to September 18, 2020; 1,699 refugee households
Wave 3: September 28 to December 2, 2020; 1,487 refugee households
Wave 4: January 15 to March 25, 2021; 1,376 refugee households
Wave 5: March 29 to June 13, 2021; 1,562 refugee households
Wave 6: July 14 to November 3, 2021; 1,407 refugee households
Wave 7: November 15, 2021, to March 31, 2022; 1,281 refugee households
Wave 8: May 31 to July 8, 2022; 1,355 refugee households
The same questionnaire is also administered to nationals in Kenya, with the data available in the WB microdata library: https://microdata.worldbank.org/index.php/catalog/3774
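As a rough illustration of how the three files relate, the sketch below attaches household-level information to adult-level rows through 'hhid'. The rows and the 'stratum' and 'employed' fields are hypothetical, and the example assumes a single wave so that 'hhid' is unique in the household table.

```python
# Hypothetical rows for one wave; the real files contain many more variables.
households = [
    {"hhid": 1, "stratum": "Kakuma"},
    {"hhid": 2, "stratum": "Dadaab"},
]
adults = [
    {"hhid": 1, "adult_id": 1, "employed": 1},
    {"hhid": 1, "adult_id": 2, "employed": 0},
    {"hhid": 2, "adult_id": 1, "employed": 0},
]


def attach_household_info(members, households):
    """Left-join member rows to household rows on 'hhid'."""
    by_hhid = {h["hhid"]: h for h in households}
    merged = []
    for member in members:
        household = by_hhid.get(member["hhid"], {})
        # Member fields take precedence if a name exists in both tables.
        merged.append({**household, **member})
    return merged


adult_rows = attach_household_info(adults, households)
```

The same join would apply to the child-level file, with 'child_id' in place of 'adult_id'. With several waves stacked in one file, the join key would also need the wave identifier.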
National coverage covering rural and urban areas
Individual and Household
All persons of concern for UNHCR
Sample survey data [ssd]
The sample aims to be representative of the refugee and stateless population in Kenya. It comprises five strata: Kakuma refugee camp, Kalobeyei settlement, Dadaab refugee camp, urban refugees, and Shona stateless. Sampling approaches differ across strata. For refugees in Kakuma and Kalobeyei, as well as for stateless people, recently conducted Socioeconomic Surveys (SES) were used as sampling frames. For the refugee population living in urban areas and the Dadaab camp, no such household survey data existed, and sampling frames were based on UNHCR's registration records (proGres), which include phone numbers. For Kakuma, Kalobeyei, Dadaab and urban refugees, a two-step sampling process was used. First, 1,000 individuals from each stratum were selected from the corresponding sampling frames. Each of these individuals received a text message to confirm that the registered phone was still active. In the second stage, the verified phone number lists were used to select the sample, implicitly stratifying by sex and age. Until wave 7, sampled households that were not reached in earlier waves were also contacted, along with households that had been interviewed before. In wave 8, only households that had previously participated in the survey were contacted for interview. The “wave” variable indicates the wave in which a household was interviewed. For the stateless population, all participants of the Shona socioeconomic survey (n=400) were included in the RRPS because of the limited sample size. The sampling frames for the refugee and Shona stateless communities are thus representative of households with active phone numbers registered with UNHCR.
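Implicit stratification is commonly implemented by sorting the frame on the stratification variables and then drawing a systematic sample. The sketch below shows that general technique; the source does not state that this exact procedure was used, and the frame, field names and sample size are hypothetical.

```python
import random


def systematic_sample(frame, n, sort_keys, seed=0):
    """Sort the frame by sort_keys (implicit stratification), then take
    every k-th unit starting from a random position within the first step."""
    if n <= 0 or n > len(frame):
        raise ValueError("n must be between 1 and the frame size")
    ordered = sorted(frame, key=lambda row: tuple(row[k] for k in sort_keys))
    step = len(ordered) / n
    start = random.Random(seed).random() * step
    last = len(ordered) - 1
    return [ordered[min(int(start + i * step), last)] for i in range(n)]


# Hypothetical list of verified phone numbers with sex and age attached.
frame = [
    {"phone_id": i, "sex": "F" if i % 2 else "M", "age": 18 + (i * 7) % 50}
    for i in range(200)
]
sample = systematic_sample(frame, n=20, sort_keys=("sex", "age"))
```

Because the frame is sorted before selection, the sample mirrors the sex and age composition of the frame without defining explicit strata.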
Computer Assisted Telephone Interview [cati]
The questionnaire included 12 sections:
Section 1: Introduction
Section 2: Household background
Section 3: Travel patterns and interactions
Section 4: Employment
Section 5: Food security
Section 6: Income Loss
Section 7: Transfers
Section 8: Subjective welfare (50% of sample)
Section 9: Health
Section 10: COVID Knowledge
Section 11: Household and Social Relations (50% of sample)
Section 12: Conclusion
Variable names were kept constant across survey waves. For questions that remained exactly the same across survey waves, data points for all waves can be found under one variable name. For questions where the phrasing changed (even in a minimal way) across waves, variable names were also changed to reflect the change in phrasing. Extended missing values are used for all variables to indicate why a value is missing. The following extended missing values are used in the dataset:
· .a for 'Don't know'
· .b for 'Refused to respond'
· .c for 'Outliers set to missing'
· .d for 'Inconsistency set to missing' (used for employment data as explained below)
· .e for 'Field Skipped' (where an error in the survey tool caused the question to be missed)
· .z for 'Not administered' (as the variable was not relevant to the observation)
More detailed data on children was collected between waves 3 and 7 than in waves 1, 2 and 8. In waves 1 and 2, data on children, e.g. on their learning activities, was collected for all children in a household with one question. Therefore, variables related to children are part of the 'hh' data for waves 1 and 2. Between waves 3 and 7, questions on children in the household were asked for specific children. Some questions covered all children, while others were only administered to one randomly selected child in the household. This approach allows the data to be disaggregated at the level of child household members, and the data can be found in the 'child' data set. The household level weights can be used for analysis of the children's data. In wave 8, detailed information on children was dropped, as the questionnaire focused on other topics. The education status of household members, except for the respondent, was imputed for rounds 1 and 2. For rounds 1 and 2, only the education status of the respondent was elicited, while for later rounds the education status of each household member was asked.
In order to evaluate outcomes by the household member's education status, information on education was imputed for waves 1 and 2, using the information provided for all household members in waves 3, 4, and 5. This resulted in additional information on the education status of household members in rounds 1 and 2, which was not available in earlier versions of this data. Some questions were not asked in every wave, so their values were imputed. For some questions, answers cannot change or are unlikely to change within the two months between survey waves, so households were not asked about them in all waves. The questions on assets owned before March 2020 were only asked when households were interviewed for the first time. The questions on the dwelling's wall and floor material, as well as on the household's connection to the power grid, were not asked of all households in waves 2 and 3, where only new households and those that had moved were covered by these questions. Questions on the main source of electricity in the household and the types of assets owned were not asked in wave 8. The missing values these variables have for waves in which they were not asked are imputed from the answers given in earlier waves. Improved quality assurance algorithms led to minor revisions to the wave 1 to 5 data. Based on additional data checks, the team has made minor refinements to the wave 1 to 5 data. The identification of the household members who were the respondent or the household head was refined in the rare cases where it was not possible to interview the same respondent as in previous waves for a given household, so that another adult was interviewed. For this reason, for about 2 percent of observations the household head status had been assigned to an incorrect household member, which was corrected. For <1 percent of households the respondent did not appear in the adult level dataset.
For about 1 percent of observations in wave 5 the respondent appeared twice in the adult level dataset. Data from questions on COVID-19 vaccinations from wave 7 was dropped from the dataset. Due to significantly higher self-reported vaccination rates compared to official administrative records, data on vaccinations was deemed unreliable, most likely due to social desirability bias. Consequently, questions on vaccination status and questions using the vaccination data as a validation criterion were dropped from the datasets.
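The extended missing value codes listed above can be handled explicitly during analysis. The sketch below is one possible approach, assuming the codes appear as strings such as ".a" in a text export; the 'income' column is hypothetical, and in a Stata file the codes are stored natively rather than as strings.

```python
from collections import Counter

# Labels copied from the list of extended missing values in the description.
MISSING_LABELS = {
    ".a": "Don't know",
    ".b": "Refused to respond",
    ".c": "Outliers set to missing",
    ".d": "Inconsistency set to missing",
    ".e": "Field Skipped",
    ".z": "Not administered",
}


def split_missing(values):
    """Separate observed values from missing codes and tally the reasons."""
    observed = [v for v in values if v not in MISSING_LABELS]
    reasons = Counter(MISSING_LABELS[v] for v in values if v in MISSING_LABELS)
    return observed, reasons


# Hypothetical column mixing observed values and missing codes.
income = [120, ".a", 85, ".z", ".a", 40]
observed, reasons = split_missing(income)
```

Keeping the reasons separate makes it possible to distinguish, for example, refusals from questions that were never administered before computing any statistic.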
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
1 Introduction
There are several works that apply Natural Language Processing to newspaper reports. In their work on mining opinions from headlines [ 1 ] using Stanford NLP and SVM, Rameshbhai et al. compared several algorithms on a small and a large dataset. Rubin et al., in their paper [ 2 ], created a mechanism to differentiate fake news from real news by building a set of characteristics of news according to their types. The purpose was to contribute to the low-resource data available for training machine learning algorithms. Doumit et al. in [ 3 ] implemented LDA, a topic modeling approach, to study bias present in online news media.
However, not much NLP research has been devoted to studying COVID-19. Most applications include classification of chest X-rays and CT scans to detect the presence of pneumonia in the lungs [ 4 ], a consequence of the virus. Other research areas include studying the genome sequence of the virus[ 5 ][ 6 ][ 7 ] and replicating its structure to fight the virus and find a vaccine. This research is crucial in battling the pandemic. The few NLP-based research publications include sentiment classification of online tweets by Samuel et al. [ 8 ] to understand the fear persisting in people due to the virus. Similar work has been done by Jelodar et al.[ 9 ], using an LSTM network to classify sentiments from online discussion forums. To the best of our knowledge, the NKK dataset is the first study on a comparatively larger dataset of newspaper reports on COVID-19, which contributed to awareness of the virus.
2 Data-set Introduction
2.1 Data Collection
We accumulated 1000 online newspaper reports on COVID-19 from the United States of America (USA). The newspapers include The Washington Post (USA) and StarTribune (USA). We named this collection “Covid-News-USA-NNK”. We also accumulated 50 online newspaper reports on the issue from Bangladesh and named the collection “Covid-News-BD-NNK”. The newspapers include The Daily Star (BD) and Prothom Alo (BD). All these newspapers are among the top providers and most read in their respective countries. The collection was done manually by 10 human data-collectors of age group 23- with university degrees. This approach was preferred over automation to ensure that the news was highly relevant to the subject. The newspapers' online sites had dynamic content with advertisements in no particular order, so there was a high chance that online scrapers would collect inaccurate news reports. One of the challenges in collecting the data was the subscription requirement: each newspaper charged $1 per subscription. Some criteria for collecting the news reports, provided as guidelines to the human data-collectors, were as follows:
The headline must have one or more words directly or indirectly related to COVID-19.
The content of each news must have 5 or more keywords directly or indirectly related to COVID-19.
The genre of the news can be anything as long as it is relevant to the topic. Political, social and economic genres are given higher priority.
Avoid taking duplicate reports.
Maintain a time frame for the above mentioned newspapers.
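The collection was done by hand, but the first two criteria and the duplicate check can also be expressed as a simple automated screen. The sketch below is only an illustration: the keyword list is a hypothetical stand-in for "words directly or indirectly related to COVID-19", and keywords are counted by occurrence rather than by distinct word.

```python
import re

# Hypothetical keyword list; the collectors' actual guidance is not specified.
COVID_KEYWORDS = {
    "covid", "coronavirus", "pandemic", "quarantine",
    "lockdown", "virus", "outbreak", "mask",
}


def count_keywords(text):
    """Count tokens in the text that match a keyword, ignoring case."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return sum(1 for token in tokens if token in COVID_KEYWORDS)


def meets_criteria(headline, body, seen_headlines):
    """Apply the headline, body and duplicate criteria to one report."""
    key = headline.strip().lower()
    if key in seen_headlines:
        return False  # duplicate report
    if count_keywords(headline) < 1 or count_keywords(body) < 5:
        return False
    seen_headlines.add(key)
    return True


seen = set()
headline = "Coronavirus outbreak slows local economy"
body = ("The pandemic forced a lockdown. The virus spread, and "
        "quarantine rules and mask mandates followed.")
accepted = meets_criteria(headline, body, seen)
```

A screen like this could support the human editors by flagging entries that miss the criteria, rather than replacing their judgment about relevance.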
To collect these data we used a Google Form for USA and BD. Two human editors went through each entry to check for spam or troll entries.
2.2 Data Pre-processing and Statistics
Some pre-processing steps performed on the newspaper report dataset are as follows:
Remove hyperlinks.
Remove non-English alphanumeric characters.
Remove stop words.
Lemmatize text.
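A minimal sketch of the first three steps above is shown here. The stop-word set is a small illustrative subset, the reading of "non-English alphanumeric characters" as anything other than English letters, digits and whitespace is an assumption, and lemmatization is left out because it needs an external library.

```python
import re

# Small illustrative stop-word set; a real pipeline would use a fuller list.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "on", "for"}


def preprocess(text):
    """Remove hyperlinks, non-English characters and stop words."""
    # Step 1: remove hyperlinks.
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)
    # Step 2: keep only English letters, digits and whitespace.
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    # Step 3: drop stop words (compared case-insensitively).
    tokens = [t for t in text.split() if t.lower() not in STOP_WORDS]
    return " ".join(tokens)


cleaned = preprocess(
    "The spread of COVID-19 in Dhaka: see https://example.com/news now!"
)
```

Sentence order and remaining word forms are left untouched, in keeping with the aim of changing the data as little as possible.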
While more pre-processing could have been applied, we tried to keep the data as unchanged as possible, since changing sentence structures could result in the loss of valuable information. While this was done with the help of a script, we also assigned the same human collectors to cross-check for any remaining presence of the items listed above.
The primary data statistics of the two datasets are shown in Tables 1 and 2.
Table 1: Covid-News-USA-NNK data statistics
No. of words per headline: 7 to 20
No. of words per body content: 150 to 2100
Table 2: Covid-News-BD-NNK data statistics
No. of words per headline: 10 to 20
No. of words per body content: 100 to 1500
2.3 Dataset Repository
We used GitHub as our primary data repository under the account name NKK^1. Here, we created two repositories, USA-NKK^2 and BD-NNK^3. The dataset is available in both CSV and JSON format. We regularly update the CSV files and regenerate the JSON files using a Python script. We also provide a Python script file for essential operations. We welcome all outside collaboration to enrich the dataset.
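Regenerating JSON from CSV can be done with the standard library alone. The sketch below is not the repository's own script; the function name, file names and column layout are assumptions for illustration.

```python
import csv
import json


def csv_to_json(csv_path, json_path):
    """Read a CSV file with a header row and write its rows as a JSON list."""
    with open(csv_path, newline="", encoding="utf-8") as source:
        rows = list(csv.DictReader(source))
    with open(json_path, "w", encoding="utf-8") as target:
        # ensure_ascii=False keeps non-ASCII text readable in the output.
        json.dump(rows, target, ensure_ascii=False, indent=2)
    return len(rows)


# Example call with hypothetical file names:
# csv_to_json("covid_news_usa.csv", "covid_news_usa.json")
```

Each CSV row becomes one JSON object keyed by the column headers, so the JSON file stays in step with the CSV whenever the script is rerun.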
3 Literature Review
Natural Language Processing (NLP) deals with text (also known as categorical) data in computer science, utilizing numerous diverse methods such as one-hot encoding and word embedding that transform text into numerical representations, which can be fed to machine learning and deep learning algorithms.
Some well-known applications of NLP include fraud detection on online media sites[ 10 ], authorship attribution in fallback authentication systems[ 11 ], intelligent conversational agents or chatbots[ 12 ] and machine translation as used by Google Translate[ 13 ]. While these are all downstream tasks, several exciting developments have been made in algorithms designed solely for Natural Language Processing tasks. The two most prominent are BERT[ 14 ], which uses a bidirectional Transformer encoder and achieves strong results on classification tasks and on predicting masked words, and the GPT-3 models released by OpenAI[ 15 ], which can generate almost human-like text. However, these are typically used as pre-trained models, since training them carries a huge computation cost.
Information Extraction is a generalized concept of retrieving information from a dataset. Information extraction from an image could be retrieving vital feature spaces or targeted portions of an image; information extraction from speech could be retrieving information about names, places, etc.[ 16 ]. Information extraction in texts could be identifying named entities and locations or essential data.
Topic modeling is a sub-task of NLP and also a process of information extraction. It clusters words and phrases of the same context together into groups. Topic modeling is an unsupervised learning method that gives us a brief idea about a set of texts. One commonly used topic modeling technique is Latent Dirichlet Allocation, or LDA[17].
Keyword extraction is a process of information extraction and a sub-task of NLP that extracts essential words and phrases from a text. TextRank [ 18 ] is an efficient keyword extraction technique that uses a graph to calculate the weight of each word and selects the words with the highest weights.
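The graph-based weighting behind TextRank can be sketched in a few lines. This is a simplified version under stated assumptions: words that co-occur within a small window are linked in an undirected, unweighted graph, PageRank-style scores are iterated a fixed number of times, and the part-of-speech filtering used in the original method is omitted. The function name and parameters are illustrative.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=5):
    """Rank words by a PageRank-style score on a co-occurrence graph."""
    # Link each word to the words that follow it within the window.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Iterate the score update; each word shares its score among its links.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping) + damping * sum(
                scores[n] / len(neighbors[n]) for n in neighbors[word]
            )
            for word in neighbors
        }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


tokens = "virus spread economy virus market economy virus health".split()
top_two = textrank_keywords(tokens, top_k=2)
```

In this toy input, "virus" and "economy" are connected to every other word and therefore receive the highest scores.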
Word clouds are a great visualization technique to understand the overall ’talk of the topic’. The clustered words give us a quick understanding of the content.
4 Our experiments and Result analysis
We used the wordcloud library^4 to create the word clouds. Figures 1 to 3 present the word clouds of the Covid-News-USA-NNK dataset by month from February to May. From Figures 1, 2 and 3, we can make a few observations:
In February, both newspapers discussed China and the source of the outbreak.
StarTribune emphasized Minnesota as the state of most concern. In April, this concern appeared to grow.
Both newspapers discussed the virus's impact on the economy, i.e., banks, elections, administrations and markets.
The Washington Post discussed global issues more than StarTribune.
StarTribune in February mentioned the first precautionary measure, wearing masks, and the uncontrollable spread of the virus throughout the nation.
While both newspapers mentioned the outbreak in China in February, the spread in the United States is given more weight from March through May, reflecting the critical impact caused by the virus.
We used a script to extract all numbers related to certain keywords like ’Deaths’, ’Infected’, ’Died’, ’Infections’, ’Quarantined’, ’Lock-down’, ’Diagnosed’ etc. from the news reports and compiled a series of case numbers for both newspapers. Figure 4 shows the statistics of this series. From this extraction technique, we can observe that April was the peak month for covid cases, which rose gradually from February. Both newspapers clearly show that the rise in covid cases from February to March was slower than the rise from March to April. This is an important indicator of possible recklessness in preparations to battle the virus. However, the steep fall from April to May also shows the positive response against the attack.
We used Vader Sentiment Analysis to extract the sentiment of the headlines and the body. On average, the sentiment scores ranged from -0.5 to -0.9. The Vader Sentiment scale ranges from -1 (highly negative) to 1 (highly positive). There were some cases where the sentiment scores of the headline and body contradicted each other, i.e., the sentiment of the headline was negative but the sentiment of the body was slightly positive. Overall, sentiment analysis can help us separate the most concerning (most negative) news from the positive news, from which we can learn more about the indicators related to COVID-19 and the serious impact caused by it. Moreover, sentiment analysis can also provide us with information about how a state or country is reacting to the pandemic.
We used the PageRank algorithm to extract keywords from the headlines as well as the body content. PageRank efficiently highlights important relevant keywords in the text. Some frequently occurring important keywords extracted from both datasets are: ’China’, ’Government’, ’Masks’, ’Economy’, ’Crisis’, ’Theft’, ’Stock market’, ’Jobs’, ’Election’, ’Missteps’, ’Health’, ’Response’. Keyword extraction acts as a filter allowing quick searches for indicators in case of locating situations of the economy,
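The keyword-based number extraction described in this section could look roughly like the sketch below. The authors' script is not shown in the text, so the pattern, the keyword subset and the rule that a keyword must follow the number within a few words are all assumptions.

```python
import re

# Subset of the keywords named in the text, lower-cased for matching.
KEYWORDS = ("deaths", "infected", "died", "infections", "quarantined", "diagnosed")

# A number (with optional thousands separators) followed by a keyword,
# allowing up to three intervening words.
PATTERN = re.compile(
    r"(\d{1,3}(?:,\d{3})+|\d+)\s+(?:\w+\s+){0,3}?(" + "|".join(KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def extract_counts(text):
    """Return (keyword, number) pairs found in a news report."""
    return [
        (match.group(2).lower(), int(match.group(1).replace(",", "")))
        for match in PATTERN.finditer(text)
    ]


report = ("Officials reported 1,250 new infections and 37 deaths on Monday; "
          "400 people were quarantined.")
counts = extract_counts(report)
```

A rule this simple will also pick up unrelated numbers such as years, so extracted values would need review before being aggregated by month.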
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Timely, comprehensive and accurate datasets of CO2 emissions from fossil fuel combustion and cement production are fundamental prerequisites for understanding the global carbon cycle and designing evidence-based policies for reducing carbon emissions. Here we provide a daily CO2 emission data record starting from 1 January 2019. The dataset reveals daily CO2 dynamics through daily, weekly and seasonal variations, as well as the changes in daily CO2 emissions due to holiday events and the impact of the COVID-19 pandemic. Such a near-real-time CO2 dataset would be of great advantage for further monitoring human activities and for capturing the long-term impacts of COVID-19.
Earlier this year, Dr. Hoffman and Dr. Fafard published a book chapter on the efficacy and legality of border closures enacted by governments in response to changing COVID-19 conditions. The authors concluded that border closures are, at best, powerful symbolic acts taken by governments to show they are acting forcefully, even if the actions lack an epidemiological impact and breach international law. This COVID-19 travel restriction project was developed out of a necessity and desire to further examine the empirical implications of border closures. The current dataset contains bilateral travel restriction information on the status of 179 countries between 1 January 2020 and 8 June 2020. The data was extracted from the ‘international controls’ column of the Oxford COVID-19 Government Response Tracker (OxCGRT). The data in the ‘international controls’ column outlines a country’s change in border control status as a response to COVID-19 conditions. Accompanying source links were further verified through random selection and comparison with external news sources. Greater weight is given to official national government sources, then to provincial and municipal news-affiliated agencies. The database is presented in matrix form for each country-pair and date. Each cell is represented by datum X_dmn, which indicates the border closure status imposed on date d by country m on country n. The coding is as follows: no border closure (code = 0), targeted border closure (= 1), and total border closure (= 99). The dataset provides further details in the ‘notes’ column if the type of closure is a modified form of a targeted closure, either as a land or port closure, flight or visa suspension, or a re-opening of borders to select countries. Visa suspensions and closures of land borders were coded separately as de facto border closures and analyzed as targeted border closures in quantitative analyses.
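The country-pair coding described above maps naturally onto a lookup keyed by date, imposing country and affected country. The sketch below uses the codes from the description, but the country labels, entries and helper function are hypothetical.

```python
from datetime import date

# Codes as given in the description of the database.
NO_CLOSURE, TARGETED, TOTAL = 0, 1, 99

# X[(d, m, n)]: closure status on date d imposed by country m on country n.
# The entries and country labels below are made up for illustration.
X = {
    (date(2020, 3, 20), "AAA", "BBB"): TARGETED,
    (date(2020, 3, 20), "AAA", "CCC"): TOTAL,
    (date(2020, 3, 20), "BBB", "AAA"): NO_CLOSURE,
}


def closures_imposed_by(X, d, m):
    """Countries facing a targeted or total closure from m on date d."""
    return sorted(
        n for (day, imposer, n), code in X.items()
        if day == d and imposer == m and code != NO_CLOSURE
    )


affected = closures_imposed_by(X, date(2020, 3, 20), "AAA")
```

Because the relation is directional, the status of m towards n can differ from that of n towards m, as the third entry shows.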
The file titled ‘BTR Supplementary Information’ covers a multitude of supplemental details to the database. The various tabs cover the following: 1) Codebook: variable name, format, source links, and description; 2) Sources, Access dates: dates of access for the individual source links with additional notes; 3) Country groups: breakdown of EEA, EU, SADC, Schengen groups with source links; 4) Newly added sources: for missing countries with a population greater than 1 million (meeting the inclusion criteria), relevant news sources were added for analysis; 5) Corrections: external news sources correcting for errors in the coding of international controls retrieved from the OxCGRT dataset. At the time of our study inception, there was no existing dataset which recorded the bilateral decisions of travel restrictions between countries. We hope this dataset will be useful in the study of the impact of border closures in the COVID-19 pandemic and widen the capabilities of studying border closures on a global scale, due to its interconnected nature and impact, rather than being limited in analysis to a single country or region only. Statement of contributions: Data entry and verification was performed mainly by GL, with assistance from MJP and RN. MP and IW provided further data verification on the nine countries purposively selected for the exploratory analysis of political decision-making.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data shared on this platform relate to quality of life and its domains in Medan City, North Sumatra Province, Indonesia. Medan City is the third largest city in Indonesia, with a population of around 2.5 million. Medan City has not been spared from the COVID-19 pandemic, although it accounts for only 2-3% of total COVID-19 cases in Indonesia. The quality of life measured is that of the community after 2 months of Physical Distancing. The application of Physical Distancing has an impact on the declining quality of life of the people. Measuring the quality of life of the people during this pandemic is expected to provide all stakeholders with an overview of how a pandemic, and the policies undertaken in response to it, affect the quality of life of people in an area. In the future, this is expected to be a good reference regarding pandemics and the policies that should be implemented.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Coronavirus disease 2019 (COVID-19) time series that lists confirmed cases, reported deaths, and reported recoveries. Data is broken down by country (and sometimes by sub-region).
Coronavirus disease (COVID-19) is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and has had an effect worldwide. On March 11, 2020, the World Health Organization (WHO) declared it a pandemic, pointing at the time to more than 118,000 cases of coronavirus disease in more than 110 countries and territories around the world.
This dataset contains the latest news related to Covid-19 and it was fetched with the help of Newsdata.io news API.
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since late January, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
To monitor the socioeconomic impacts of the coronavirus disease 2019 (COVID-19) pandemic and inform policy responses and interventions, the COVID-19 High-Frequency Phone Survey (HFPS) of households was designed as part of a World Bank global initiative. For Cambodia, a total of 5 survey rounds are planned, with households being called back every 1 to 2 months. This allows for the impact of the pandemic to be tracked as it unfolds and provides data to the government and development partners in near real-time, supporting an evidence-based response to the crisis.
In June 2020, Cambodia launched a Cash Transfer Program to support poor and vulnerable households during COVID-19. To more closely monitor the impact of COVID-19 among poor and vulnerable households in Cambodia, as well as the impact of Cambodia's Cash Transfer Program for Poor and Vulnerable Households during COVID-19, a sample of 1,000 IDPoor households was drawn for the phone survey from the beneficiary list of the conditional cash transfer for pregnant women and children under 2.
The questionnaire covers a series of topics, such as knowledge of COVID-19 and social behavior, access to food, food insecurity, impact of COVID-19 on income sources and coping mechanisms, access to social assistance, and impact of COVID-19 on economic activity. A modular approach is used in the questionnaire design, which allows modules to be dropped and/or added in different waves/rounds of the survey. The questionnaire is designed to be administered in 20 to 25 minutes.
Data collection for the first round started in June 2020. The survey is implemented using Computer Assisted Telephone Interviewing.
National coverage and 5 geographical regions (Phnom Penh and other urban areas, Plains, Tonle Sap, Coastal, Plateau and Mountains).
The survey covered all de jure households (with a phone number) excluding prisons, hospitals, military barracks, and school dormitories.
Sample survey data [ssd]
The beneficiary list of the conditional cash transfer program for pregnant women and children under 2 was divided into 5 strata: Phnom Penh and other urban areas, Plains, Tonle Sap, Coastal, and Plateau and Mountains. The sample was randomly selected in proportion to the number of IDPoor households in each stratum. The phone survey successfully interviewed 984 households in June (Round 1). In August (Round 2), 784 households were re-interviewed and 271 replacement households were added. Of these, 841 were successfully reached again in October (Round 3), with 527 interviewed in all three rounds. In December, 1,277 households were successfully interviewed, of which 945 were re-contacted and 332 were added as replacements. In March 2021, 1,309 households were interviewed, of which 991 were re-interviewed and 318 were replacements. In February 2022, 812 households were successfully interviewed, and 801 households were interviewed in April 2022.
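Allocating a sample across strata in proportion to stratum size can be sketched as follows. This is an illustration only: the stratum sizes are hypothetical placeholders (the actual IDPoor household counts are not given here), and the largest-remainder rounding is an assumption rather than the survey team's documented procedure.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to their sizes.

    Uses largest-remainder rounding so the allocations sum exactly
    to total_sample.
    """
    population = sum(stratum_sizes.values())
    exact = {name: total_sample * size / population
             for name, size in stratum_sizes.items()}
    allocation = {name: int(share) for name, share in exact.items()}

    # Hand out the units lost to rounding down, largest remainder first.
    leftover = total_sample - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical household counts per stratum, for illustration only.
HYPOTHETICAL_SIZES = {
    "Phnom Penh and other urban areas": 12_000,
    "Plains": 30_000,
    "Tonle Sap": 28_000,
    "Coastal": 8_000,
    "Plateau and Mountains": 22_000,
}

allocation = proportional_allocation(HYPOTHETICAL_SIZES, 1_000)
for stratum, n in allocation.items():
    print(f"{stratum}: {n}")
```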
Computer Assisted Telephone Interview [cati]
The Cambodia COVID-19 High Frequency Phone Survey of households questionnaire consists of the following sections:
Round 1: Interview Information; Household Roster; Social Economic Status; Knowledge Regarding the Spread of COVID-19; Behavior and Social Distancing; Access to Basic Services; Employment; Income Loss; FIES; Shocks and Coping; Safety Nets
Round 2: Interview Information; Household Roster; Migration; Access to Basic Services; Employment; Income Loss; FIES; Safety Nets; Relief Transfer
Round 3: Interview Information; Household Roster; Social Economic Status; Knowledge Regarding the Spread of COVID-19; Access to Basic Services; Employment; Income Loss; FIES; Safety Nets; Relief Transfer
Round 4: Interview Information; Household Roster; Social Economic Status; Access to Basic Services; Employment; Income Loss; FIES; Shocks and Coping; Safety Nets; Relief Transfer; Payment Method
Round 5: Interview Information; Household Roster; Social Economic Status; Access to Basic Services; Employment; Income Loss; FIES; Safety Nets; Relief Transfer
Round 6: Interview Information; Household Roster; Social Economic Status; Access to Basic Services; Employment; Income Loss; FIES; Shocks and Coping; Safety Nets; Relief Transfer; Education; SWIFT
Round 7: Interview Information; Household Roster; Social Economic Status; Disability; Access to Basic Services; Employment; Income Loss; FIES; Shocks and Coping; Safety Nets; Relief Transfer; Education; SWIFT
At the end of data collection, the raw dataset was cleaned by the research team. This included formatting, and correcting results based on monitoring issues, enumerator feedback and survey changes.
Only households that consented and were successfully interviewed were kept in the dataset, and all personal information and internal survey variables were dropped from the clean dataset.
A replacement sampling approach was applied to reach the target sample.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains real-world COVID-19 human-to-human transmission network data. The data mimics how COVID-19 infections may have spread from one individual to another in Bucharest (Romania) between August 1 and October 31, 2020. The information refers to COVID-19 patients (referees) and their contacts (referrals), i.e., the people they interacted with before testing positive for COVID-19. The dataset is structured as an edge-list file (referee - referral ties). For each referee (referral), we provide the following attributes: sex (male/female), age, sector (public/private), a job in the medical sector (yes/no), ISCO-08 one-digit code, ISCO-08 two-digit code, ISCO-08 three-digit code, employability (active/non-active), age class (minor, adult, pensioner), confirmation month (when a patient tested positive for COVID-19), and confirmation day (when a patient tested positive for COVID-19). The data were analyzed using relational hyperevent modeling (https://github.com/juergenlerner/eventnet).
This dataset allows replication of the analysis reported in the manuscript entitled: Occupations and their impact on the spreading of COVID-19 in urban communities (Hâncean M-G, Lerner J, Perc M, Oană I, Bunaciu D-A, Stoica AA & Ghiță M-C).
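An edge list of referee-referral ties can be read into a simple directed structure to count how many contacts each patient named. The sketch below is illustrative only: the column names `referee` and `referral` and the identifiers are hypothetical, since the actual file layout is not described here, and it does not reproduce the relational hyperevent analysis.

```python
import csv
import io
from collections import Counter

# Hypothetical edge list; real column names and IDs may differ.
SAMPLE = """referee,referral
p1,p2
p1,p3
p2,p4
p3,p4
"""

# Each row is one directed tie from a patient to a named contact.
edges = [(row["referee"], row["referral"])
         for row in csv.DictReader(io.StringIO(SAMPLE))]

# Number of contacts named by each referee (out-degree).
out_degree = Counter(referee for referee, _ in edges)

# Number of times each person was named as a contact (in-degree).
in_degree = Counter(referral for _, referral in edges)

print(out_degree.most_common())
print(in_degree.most_common())
```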
https://creativecommons.org/publicdomain/zero/1.0/
As global communities respond to COVID-19, we've heard from public health officials that the same type of aggregated, anonymized insights we use in products such as Google Maps could be helpful as they make critical decisions to combat COVID-19.
These Community Mobility Reports aim to provide insights into what has changed in response to policies aimed at combating COVID-19. The reports chart movement trends over time by geography, across different categories of places such as retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential. (https://www.google.com/covid19/mobility/)
The data contains aggregated, anonymised figures per day for each country. For example, the data for India is in the files 2020_IN_Region_Mobility_Report.csv (2020) and 2021_IN_Region_Mobility_Report.csv (2021). The data is aggregated not only at country level but also at state and district level, as given in the sub_region_1 and sub_region_2 columns.
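Separating country, state and district rows usually comes down to checking which of the two sub-region columns are filled. The sketch below assumes that country-level rows leave both `sub_region_1` and `sub_region_2` empty and that state-level rows leave only `sub_region_2` empty. The `date` and `retail_and_recreation_percent_change_from_baseline` columns are assumed to be present in the report files, and the sample rows and place names are invented.

```python
import csv
import io

# Invented rows for illustration; values are not taken from the reports.
SAMPLE = """sub_region_1,sub_region_2,date,retail_and_recreation_percent_change_from_baseline
,,2020-04-01,-70
State X,,2020-04-01,-65
State X,District Y,2020-04-01,-60
State X,District Z,2020-04-01,-72
"""


def aggregation_level(row):
    """Classify a row as country, state or district level."""
    if row["sub_region_2"]:
        return "district"
    if row["sub_region_1"]:
        return "state"
    return "country"


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
by_level = {"country": [], "state": [], "district": []}
for row in rows:
    by_level[aggregation_level(row)].append(row)

for level, level_rows in by_level.items():
    print(level, len(level_rows))
```

To work with a downloaded file, the `io.StringIO(SAMPLE)` handle would be replaced by an open file such as 2020_IN_Region_Mobility_Report.csv.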
This data comes from the reports published by Google: https://www.google.com/covid19/mobility/
Some Questions to answer
India is having its second wave, and one of the major causes is considered to be the election rallies held in different parts of the country. How does mobility impact COVID-19 cases?
Comparing Mobility across different Countries
The map data and summary statistics are sourced from Johns Hopkins University (JHU) and Esri’s Living Atlas. The charts are sourced from a database created by Timmons Group GIS that leverages the temporal data provided by JHU on GitHub.
Why did we do this?
How did we do this?
The raw data from JHU does not support temporal charting at the state or county level, so we created a data pipeline that takes JHU’s source data files and transforms their raw data into our data model.
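One common transformation of this kind is reshaping a wide time series (one column per date) into long records that can be charted over time and rolled up by state. The sketch below is not the Timmons Group pipeline. It assumes a wide layout with `Admin2` and `Province_State` identifier columns followed by date columns in `M/D/YY` form, and the sample rows are invented.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Invented wide-format sample; layout is an assumption about the source files.
SAMPLE = """Admin2,Province_State,1/22/20,1/23/20,1/24/20
County A,State X,0,1,3
County B,State X,0,0,2
"""

ID_COLUMNS = ("Admin2", "Province_State")


def wide_to_long(rows, id_columns=ID_COLUMNS):
    """Yield one record per (county, date) from wide-format rows."""
    for row in rows:
        for column, value in row.items():
            if column in id_columns:
                continue
            date = datetime.strptime(column, "%m/%d/%y").date().isoformat()
            record = {name: row[name] for name in id_columns}
            record.update(date=date, cases=int(value))
            yield record


long_rows = list(wide_to_long(csv.DictReader(io.StringIO(SAMPLE))))

# Roll county records up to state totals per date.
state_totals = defaultdict(int)
for record in long_rows:
    state_totals[(record["Province_State"], record["date"])] += record["cases"]

for key in sorted(state_totals):
    print(key, state_totals[key])
```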
Key features:
Check out our other ArcGIS Dashboard powered by the new ArcGIS Experience Builder to explore the COVID-19 curves at the country level around the world - Explore the COVID-19 Curve
For additional information, please contact:
The World Bank has launched a fast-deploying, high-frequency phone-based survey of households to generate near real-time insights into the socio-economic impact of COVID-19 on households, which can then be used to support evidence-based policy responses to the crisis. At a time when conventional modes of data collection are not feasible, this phone-based rapid data collection method offers a way to gather granular information on the transmission mechanisms of the crisis on the population, to identify gaps in policy responses, and to generate insights to inform the scaling up or redirection of resources as the crisis unfolds.
National
Individual, Household-level
A mobile frame was generated via random digit dialing (RDD), based on the National Numbering Plans from the Malaysian Communications and Multimedia Commission (MCMC). All possible subscriber combinations were generated in DRUID (D Force Sampling's Reactive User Interface Database), an SQL database interface which houses the complete sampling frame. From this database, completely random telephone numbers were sampled. For Round 1, a sample of 33,894 phone numbers was drawn (without replacement within the survey wave) from a total of 102,780,000 possible mobile numbers from more than 18 mobile providers in the sampling frame, which was not stratified. Once the sample was drawn in the form of replicates (subsamples) of n = 10,000, the numbers were filtered by D-Force Sampling using an auto-dialer to determine each number's working status. All numbers that yielded a working call disposition in at least one of the two filtering attempts were then passed to the CATI center human interviewing team. Mobile devices were assumed to be personal, and therefore the person who answered the call was the selected respondent. Screening questions were used to ensure that the respondent was at least 18 years old and either contributed to, made decisions about, or had knowledge of household finances. Respondents who had participated in Round 1 were sampled for Round 2. Fresh respondents were introduced in Round 3 in addition to panel respondents from Round 2; fresh respondents in Round 3 were selected using the same procedure used for sampling respondents in Round 1.
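The Round 1 draw described above (sampling without replacement from the frame, then splitting into replicates of 10,000) can be sketched as follows. This is a simplification under stated assumptions: the frame is represented as a contiguous range of indices rather than actual numbers built from the numbering plan, and the seed and function name are hypothetical.

```python
import random

FRAME_SIZE = 102_780_000   # possible mobile numbers in the frame
SAMPLE_SIZE = 33_894       # numbers drawn for Round 1
REPLICATE_SIZE = 10_000    # size of each replicate (subsample)


def draw_replicates(frame_size, sample_size, replicate_size, seed=0):
    """Draw a simple random sample without replacement and split it
    into consecutive replicates of at most replicate_size numbers."""
    rng = random.Random(seed)
    # Indices stand in for phone numbers in this simplified frame.
    drawn = rng.sample(range(frame_size), sample_size)
    return [drawn[start:start + replicate_size]
            for start in range(0, sample_size, replicate_size)]


replicates = draw_replicates(FRAME_SIZE, SAMPLE_SIZE, REPLICATE_SIZE)
print([len(replicate) for replicate in replicates])
```

Because the sample is drawn in random order before being split, each replicate is itself a random subsample of the frame, which lets fieldwork release replicates one at a time.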
Computer Assisted Telephone Interview [cati]
The questionnaire is available in three languages, including English, Bahasa Melayu, and Mandarin Chinese. It can be downloaded from the Downloads section.
In Round 1, the survey successfully interviewed 2,210 individuals out of 33,894 sampled phone numbers. In Round 2, the survey successfully re-interviewed 1,047 individuals, recording a 47% response rate. In Round 3, the survey successfully re-interviewed 667 respondents who had previously been interviewed in Round 2, recording a 64% response rate. The panel respondents in Round 3 were supplemented with 446 fresh respondents.
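The panel response rates follow directly from the counts above, taking each round's completed re-interviews over the previous round's completed interviews. A short check, with variable names chosen here for illustration:

```python
ROUND1_SAMPLED = 33_894    # phone numbers sampled in Round 1
ROUND1_COMPLETED = 2_210   # individuals interviewed in Round 1
ROUND2_COMPLETED = 1_047   # re-interviewed in Round 2
ROUND3_PANEL = 667         # Round 2 respondents re-interviewed in Round 3
ROUND3_FRESH = 446         # fresh respondents added in Round 3


def response_rate(completed, attempted):
    """Share of attempted cases that ended in a completed interview."""
    return completed / attempted


rates = {
    "round1_of_sampled_numbers": response_rate(ROUND1_COMPLETED, ROUND1_SAMPLED),
    "round2_panel": response_rate(ROUND2_COMPLETED, ROUND1_COMPLETED),
    "round3_panel": response_rate(ROUND3_PANEL, ROUND2_COMPLETED),
}
round3_total = ROUND3_PANEL + ROUND3_FRESH

for label, rate in rates.items():
    print(f"{label}: {rate:.1%}")
print("Round 3 total respondents:", round3_total)
```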
In Round 1, assuming a simple random sample with p=0.5 and n=2,210 at the 95% CI level yields a margin of sampling error (MOE) of 2.09 percentage points. Incorporating the design effect into this estimate yields a margin of sampling error of 2.65 percentage points.
In Round 2, the complete weight for the entire sample was adjusted to the 2021 population estimates from DOSM’s annual intercensal population projections. Assuming a simple random sample with p=0.5 and n=1,047 at the 95% CI level yields a margin of sampling error (MOE) of 3.03 percentage points. Incorporating the design effect into this estimate yields a margin of sampling error of 3.54 percentage points.
Among both fresh and panel samples in Round 3, assuming a simple random sample, with p=0.5 and n=1,113 at the 95% CI level yields a margin of sampling error (MOE) of 2.94 percentage points. Incorporating the design effect into this estimate yields a margin of sampling error of 3.34 percentage points.
Among panel samples in Round 3, with p=0.5 and n=667 at the 95% CI level yields a margin of sampling error (MOE) of 3.80 percentage points. Incorporating the design effect into this estimate yields a margin of sampling error of 4.16 percentage points.
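The simple-random-sample margins quoted above follow the standard formula MOE = z * sqrt(p(1 - p) / n). The sketch below assumes z = 1.96 for the 95% level. The survey's exact z-value and rounding are not stated, so results may differ from the published figures in the last digit. The design-effect adjustment multiplies the MOE by the square root of the design effect; the design effect values themselves are not reported here, so `deff` is left as a parameter.

```python
import math

Z_95 = 1.96  # assumed critical value for a 95% confidence level


def margin_of_error(n, p=0.5, z=Z_95):
    """Margin of sampling error for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)


def design_adjusted_moe(n, deff, p=0.5, z=Z_95):
    """Margin of error inflated by a design effect (deff)."""
    return margin_of_error(n, p, z) * math.sqrt(deff)


def implied_design_effect(moe_srs, moe_design):
    """Design effect implied by a pair of unadjusted and adjusted margins."""
    return (moe_design / moe_srs) ** 2


SAMPLE_SIZES = {
    "Round 1": 2_210,
    "Round 2": 1_047,
    "Round 3 (fresh and panel)": 1_113,
    "Round 3 (panel only)": 667,
}

for label, n in SAMPLE_SIZES.items():
    print(f"{label}: n={n}, MOE={100 * margin_of_error(n):.2f} percentage points")
```

For instance, `implied_design_effect(2.94, 3.34)` gives the design effect implied by the two Round 3 margins for the combined sample.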
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
R code implementing data processing, supervised SOM, and class rule mining analysis.