Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
All cities with a population greater than 1,000 or seats of administrative divisions (ca. 80,000).
Sources and Contributions
Sources: GeoNames aggregates over a hundred different data sources.
Ambassadors: GeoNames Ambassadors help in many countries.
Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.
Enrichment: add country name
The world population surpassed eight billion people in 2022, having doubled from its figure less than 50 years previously. Looking forward, it is projected that the world population will reach nine billion in 2038 and 10 billion in 2060, peaking at around 10.3 billion in the 2080s before going into decline.
Regional variations
The global population has seen rapid growth since the early 1800s, due to advances in areas such as food production, healthcare, water safety, education, and infrastructure. However, these changes did not occur at a uniform time or pace across the world. Broadly speaking, the first regions to undergo their demographic transitions were Europe, North America, and Oceania, followed by Latin America and Asia (although Asia's development saw the greatest variation due to its size), while Africa was the last continent to undergo this transformation. Because of these differences, many so-called "advanced" countries are now experiencing population decline, particularly in Europe and East Asia, while the fastest population growth rates are found in Sub-Saharan Africa. In fact, the roughly two billion difference in population between now and the 2080s' peak will be found in Sub-Saharan Africa, which will rise from 1.2 billion to 3.2 billion in this time (although populations in other continents will also fluctuate).
Changing projections
The United Nations releases its World Population Prospects report every 1-2 years, and this is widely considered the foremost demographic dataset in the world. However, recent years have seen notable downward revisions in the projections of when the global population will peak, and at what number. Reports in the 2010s had suggested a peak of over 11 billion people, with population growth continuing into the 2100s; an earlier and lower peak is now projected. Reasons for this include a more rapid population decline in East Asia and Europe, particularly China, as well as a prolonged development arc in Sub-Saharan Africa.
Census data reveal that population density varies noticeably from area to area. Small-area census data do a better job of depicting where the crowded neighborhoods are. In this map, the yellow areas of highest density range from 30,000 to 150,000 persons per square kilometer. In those areas, if the people were spread out evenly across the area, there would be just 4 to 9 meters between them. Very high density areas exceed 7,000 persons per square kilometer. High density areas exceed 5,200 persons per square kilometer. The last categories break at 3,330 persons per square kilometer and 1,500 persons per square kilometer.
This dataset draws on multiple sources. All of the demographic data are from Michael Bauer Research, with the exception of the following countries:
Australia: Esri Australia and MapData Services
Canada: Esri Canada and Environics
France: Esri France
Germany: Esri Germany and Nexiga
India: Esri India and Indicus
Japan: Esri Japan
South Korea: Esri Korea and OPENmate
Spain: Esri España and AIS
United States: Esri Demographics
https://borealisdata.ca/api/datasets/:persistentId/versions/3.0/customlicense?persistentId=doi:10.5683/SP2/AOVUW7
This database contains tobacco consumption data from 1970-2015 collected through a systematic search coupled with consultation with country and subject-matter experts. Data quality appraisal was conducted by at least two research team members in duplicate, with greater weight given to official government sources. All data was standardized into units of cigarettes consumed and a detailed accounting of data quality and sourcing was prepared. Data was found for 82 of 214 countries for which searches for national cigarette consumption data were conducted, representing over 95% of global cigarette consumption and 85% of the world’s population. Cigarette consumption fell in most countries over the past three decades but trends in country specific consumption were highly variable. For example, China consumed 2.5 million metric tonnes (MMT) of cigarettes in 2013, more than Russia (0.36 MMT), the United States (0.28 MMT), Indonesia (0.28 MMT), Japan (0.20 MMT), and the next 35 highest consuming countries combined. The US and Japan achieved reductions of more than 0.1 MMT from a decade earlier, whereas Russian consumption plateaued, and Chinese and Indonesian consumption increased by 0.75 MMT and 0.1 MMT, respectively. These data generally concord with modelled country level data from the Institute for Health Metrics and Evaluation and have the additional advantage of not smoothing year-over-year discontinuities that are necessary for robust quasi-experimental impact evaluations. Before this study, publicly available data on cigarette consumption have been limited—either inappropriate for quasi-experimental impact evaluations (modelled data), held privately by companies (proprietary data), or widely dispersed across many national statistical agencies and research organisations (disaggregated data). 
This new dataset confirms that cigarette consumption has decreased in most countries over the past three decades, but that secular country specific consumption trends are highly variable. The findings underscore the need for more robust processes in data reporting, ideally built into international legal instruments or other mandated processes. To monitor the impact of the WHO Framework Convention on Tobacco Control and other tobacco control interventions, data on national tobacco production, trade, and sales should be routinely collected and openly reported. The first use of this database for a quasi-experimental impact evaluation of the WHO Framework Convention on Tobacco Control is: Hoffman SJ, Poirier MJP, Katwyk SRV, Baral P, Sritharan L. Impact of the WHO Framework Convention on Tobacco Control on global cigarette consumption: quasi-experimental evaluations using interrupted time series analysis and in-sample forecast event modelling. BMJ. 2019 Jun 19;365:l2287. doi: https://doi.org/10.1136/bmj.l2287 Another use of this database was to systematically code and classify longitudinal cigarette consumption trajectories in European countries since 1970 in: Poirier MJ, Lin G, Watson LK, Hoffman SJ. Classifying European cigarette consumption trajectories from 1970 to 2015. Tobacco Control. 2022 Jan. DOI: 10.1136/tobaccocontrol-2021-056627. 
Statement of Contributions:
Conceived the study: GEG, SJH
Identified multi-country datasets: GEG, MP
Extracted data from multi-country datasets: MP
Quality assessment of data: MP, GEG
Selection of data for final analysis: MP, GEG
Data cleaning and management: MP, GL
Internet searches: MP (English, French, Spanish, Portuguese), GEG (English, French), MYS (Chinese), SKA (Persian), SFK (Arabic); AG, EG, BL, MM, YM, NN, EN, HR, KV, CW, and JW (English), GL (English)
Identification of key informants: GEG, GP
Project Management: LS, JM, MP, SJH, GEG
Contacts with Statistical Agencies: MP, GEG, MYS, SKA, SFK, GP, BL, MM, YM, NN, HR, KV, JW, GL
Contacts with key informants: GEG, MP, GP, MYS, GP
Funding: GEG, SJH
SJH: Hoffman, SJ; JM: Mammone J; SRVK: Rogers Van Katwyk, S; LS: Sritharan, L; MT: Tran, M; SAK: Al-Khateeb, S; AG: Grjibovski, A.; EG: Gunn, E; SKA: Kamali-Anaraki, S; BL: Li, B; MM: Mahendren, M; YM: Mansoor, Y; NN: Natt, N; EN: Nwokoro, E; HR: Randhawa, H; MYS: Yunju Song, M; KV: Vercammen, K; CW: Wang, C; JW: Woo, J; MJPP: Poirier, MJP; GEG: Guindon, EG; GP: Paraje, G; GL: Gigi Lin
Key informants who provided data: Corne van Walbeek (South Africa, Jamaica); Frank Chaloupka (US); Ayda Yurekli (Turkey); Dardo Curti (Uruguay); Bungon Ritthiphakdee (Thailand); Jakub Lobaszewski (Poland); Guillermo Paraje (Chile, Argentina)
Key informants who provided useful insights: Carlos Manuel Guerrero López (Mexico); Muhammad Jami Husain (Bangladesh); Nigar Nargis (Bangladesh); Rijo M John (India); Evan Blecher (Nigeria, Indonesia, Philippines, South Africa); Yagya Karki (Nepal); Anne CK Quah (Malaysia); Nery Suarez Lugo (Cuba)
Agencies providing assistance: Iranian Tobacco Co.; Institut National de la Statistique (Tunisia); HM Revenue & Customs (UK); Eidgenössisches Finanzdepartement EFD/Département...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for PERSONAL SAVINGS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Accurate locations of people or places of interest are important for driving business and improving government services. Accurate location, in turn, depends on correctly geocoding addresses. Street addresses may sometimes be missing the country information, and geocoding such incomplete addresses often results in poor accuracy. Geocoding accuracy and performance increase when the country is specified. This model categorizes incomplete addresses by automatically assigning the country they belong to.
This deep learning model is trained on an address dataset provided by openaddresses.io and can be used to classify addresses from 18 different countries.
Using the model
Follow the guide to use the model. Before using this model, ensure that the supported deep learning libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.
Fine-tuning the model
This model can be fine-tuned using the Train Text Classification Model tool available in the GeoAI toolbox in ArcGIS Pro. Follow the guide to fine-tune this model.
Input
Text on which country classification will be performed. Text should include street number or apartment number, street name, city or state.
Output
Text (classified country)
Supported countries
This model supports addresses from the following countries:
AR – Argentina
AT – Austria
AU – Australia
BE – Belgium
CA – Canada
CH – Switzerland
DE – Germany
DK – Denmark
ES – Spain
FI – Finland
FR – France
IS – Iceland
IT – Italy
KR – South Korea
LU – Luxembourg
NZ – New Zealand
SI – Slovenia
US – USA
Model architecture
This model uses the xlm-roberta architecture implemented in Hugging Face Transformers.
Accuracy metrics
The table below summarizes the precision, recall and F1-score of the model on the validation dataset.
Training data
The model has been trained on openly licensed data from openaddresses.io.
Sample results
Here are a few results from the model.
https://creativecommons.org/publicdomain/zero/1.0/
The World Bank is an international financial institution that provides loans to countries of the world for capital projects. The World Bank's stated goal is the reduction of poverty. Source: https://en.wikipedia.org/wiki/World_Bank
This dataset combines key education statistics from a variety of sources to provide a look at global literacy, spending, and access.
For more information, see the World Bank website.
Fork this kernel to get started with this dataset.
https://bigquery.cloud.google.com/dataset/bigquery-public-data:world_bank_health_population
http://data.worldbank.org/data-catalog/ed-stats
https://cloud.google.com/bigquery/public-data/world-bank-education
Citation: The World Bank: Education Statistics
Dataset Source: World Bank. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Banner Photo by @till_indeman from Unsplash.
Of total government spending, what percentage is spent on education?
Gallup Worldwide Research continually surveys residents in more than 150 countries, representing more than 98% of the world's adult population, using randomly selected, nationally representative samples. Gallup typically surveys 1,000 individuals in each country, using a standard set of core questions that has been translated into the major languages of the respective country. In some regions, supplemental questions are asked in addition to core questions. Face-to-face interviews are approximately 1 hour, while telephone interviews are about 30 minutes. In many countries, the survey is conducted once per year, and fieldwork is generally completed in two to four weeks. The Country Dataset Details spreadsheet displays each country's sample size, month/year of the data collection, mode of interviewing, languages employed, design effect, margin of error, and details about sample coverage.
Gallup is entirely responsible for the management, design, and control of Gallup Worldwide Research. For the past 70 years, Gallup has been committed to the principle that accurately collecting and disseminating the opinions and aspirations of people around the globe is vital to understanding our world. Gallup's mission is to provide information in an objective, reliable, and scientifically grounded manner. Gallup is not associated with any political orientation, party, or advocacy group and does not accept partisan entities as clients. Any individual, institution, or governmental agency may access the Gallup Worldwide Research regardless of nationality. The identities of clients and all surveyed respondents will remain confidential.
Sample survey data [ssd]
SAMPLING AND DATA COLLECTION METHODOLOGY
With some exceptions, all samples are probability based and nationally representative of the resident population aged 15 and older. The coverage area is the entire country including rural areas, and the sampling frame represents the entire civilian, non-institutionalized, aged 15 and older population of the entire country. Exceptions include areas where the safety of interviewing staff is threatened, scarcely populated islands in some countries, and areas that interviewers can reach only by foot, animal, or small boat.
Telephone surveys are used in countries where telephone coverage represents at least 80% of the population or is the customary survey methodology (see the Country Dataset Details for detailed information for each country). In Central and Eastern Europe, as well as in the developing world, including much of Latin America, the former Soviet Union countries, nearly all of Asia, the Middle East, and Africa, an area frame design is used for face-to-face interviewing.
The typical Gallup Worldwide Research survey includes at least 1,000 surveys of individuals. In some countries, oversamples are collected in major cities or areas of special interest. Additionally, in some large countries, such as China and Russia, sample sizes of at least 2,000 are collected. Although rare, in some instances the sample size is between 500 and 1,000. See the Country Dataset Details for detailed information for each country.
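The Country Dataset Details spreadsheet reports a design effect and a margin of error for each country. As a rough illustration of how those quantities relate to the sample sizes mentioned above, the sketch below uses the textbook formula for the margin of error of a proportion, inflated by the design effect. This is an assumption for illustration only: the text does not state how Gallup computes its published margins of error, and the function name and default values here are hypothetical.

```python
import math


def margin_of_error(sample_size, design_effect=1.0, proportion=0.5, z=1.96):
    """Approximate half-width of a 95% confidence interval for a proportion.

    Assumes the standard formula z * sqrt(deff * p * (1 - p) / n); the source
    document does not confirm that this is the calculation Gallup uses.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    variance = design_effect * proportion * (1.0 - proportion) / sample_size
    return z * math.sqrt(variance)


# Compare the sample sizes mentioned in the text, using a hypothetical
# design effect of 1.5 for all three.
for n in (500, 1000, 2000):
    print(n, round(100 * margin_of_error(n, design_effect=1.5), 1), "percentage points")
```

Under this formula, quadrupling the sample size halves the margin of error, which is one way to read the larger samples collected in countries such as China and Russia.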
FACE-TO-FACE SURVEY DESIGN
FIRST STAGE
In countries where face-to-face surveys are conducted, the first stage of sampling is the identification of 100 to 135 ultimate clusters (Sampling Units), consisting of clusters of households. Sampling units are stratified by population size and/or geography, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Samples are drawn independently of any samples drawn for surveys conducted in previous years.
There are two methods for sample stratification:
METHOD 1: The sample is stratified into 100 to 125 ultimate clusters drawn proportional to the national population, using the following strata:
1) Areas with population of at least 1 million
2) Areas 500,000-999,999
3) Areas 100,000-499,999
4) Areas 50,000-99,999
5) Areas 10,000-49,999
6) Areas with less than 10,000
Additional strata could be included to reflect populations that exceed 1 million as well as areas with populations less than 10,000.
Worldwide Research Methodology and Codebook Copyright © 2008-2012 Gallup, Inc. All rights reserved.
METHOD 2:
A multi-stage design is used. The country is first stratified by large geographic units, and then by smaller units within geography. A minimum of 33 Primary Sampling Units (PSUs), which are first stage sampling units, are selected. The sample design results in 100 to 125 ultimate clusters.
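The first-stage ideas above (population-size strata and selection with probability proportional to population size) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Gallup's procedure: the text does not say which selection algorithm is used, so systematic PPS sampling is swapped in here as one common choice, and the function names and the toy area list are hypothetical.

```python
import random

# Lower bounds of the six population strata listed under METHOD 1.
STRATA_LOWER_BOUNDS = [
    (1_000_000, 1),
    (500_000, 2),
    (100_000, 3),
    (50_000, 4),
    (10_000, 5),
]


def population_stratum(population):
    """Return the METHOD 1 stratum number (1-6) for an area's population."""
    for lower_bound, stratum in STRATA_LOWER_BOUNDS:
        if population >= lower_bound:
            return stratum
    return 6  # areas with less than 10,000


def pps_systematic_sample(areas, n_clusters, rng):
    """Select n_clusters areas with probability proportional to population.

    areas is a list of (name, population) pairs. Systematic PPS: lay the areas
    end to end on a line of total population, then pick equally spaced points
    from a random start. An area larger than the step can be picked repeatedly.
    """
    total = sum(population for _, population in areas)
    if total <= 0 or n_clusters <= 0:
        raise ValueError("need a positive total population and cluster count")
    step = total / n_clusters
    start = rng.random() * step  # random start in [0, step)
    selected = []
    cumulative = 0
    for name, population in areas:
        cumulative += population
        # Take every selection point that falls inside this area's interval.
        while len(selected) < n_clusters and start + len(selected) * step < cumulative:
            selected.append(name)
    # Guard against floating-point rounding at the very end of the line.
    while len(selected) < n_clusters:
        selected.append(areas[-1][0])
    return selected


# Hypothetical toy frame: three areas, ten clusters.
toy_areas = [("A", 600), ("B", 300), ("C", 100)]
print(pps_systematic_sample(toy_areas, 10, random.Random(0)))
print(population_stratum(750_000))
```

In the toy frame the step is 100, so area A (60% of the population) always receives six of the ten clusters, whatever the random start. That proportionality is the property the first stage relies on.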
SECOND STAGE
Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. To increase the probability of contact and completion, attempts are made at different times of the day, and where possible, on different days. If an interviewer cannot obtain an interview at the initial sampled household, he or she uses a simple substitution method. Refer to Appendix C for a more in-depth description of random route procedures.
THIRD STAGE
Respondents are randomly selected within the selected households. Interviewers list all eligible household members and their ages or birthdays. The respondent is selected by means of the Kish grid (refer to Appendix C) in countries where face-to-face interviewing is used. The interviewer does not inform the person who answers the door of the selection criteria until after the respondent has been identified. In a few Middle East and Asian countries where cultural restrictions dictate gender matching, respondents are randomly selected using the Kish grid from among all eligible adults of the matching gender.
TELEPHONE SURVEY DESIGN
In countries where telephone interviewing is employed, random-digit-dial (RDD) or a nationally representative list of phone numbers is used. In select countries where cell phone penetration is high, a dual sampling frame is used. Random respondent selection is achieved by using either the latest birthday or Kish grid method. At least three attempts are made to reach a person in each household, spread over different days and times of day. Appointments for callbacks that fall within the survey data collection period are made.
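The latest-birthday method mentioned above can be sketched as follows. This is an illustration only, assuming that "latest birthday" means the eligible household member whose birthday fell most recently on or before the interview date; the text does not spell out the rule, and the record layout (`name`, `age`, `birth_month`, `birth_day`) and function names are hypothetical. The Kish grid relies on a selection table described in Appendix C, which is not reproduced here, so it is not sketched.

```python
from datetime import date


def most_recent_birthday(member, interview_date):
    """Date of the member's latest birthday on or before interview_date."""
    month, day = member["birth_month"], member["birth_day"]
    for year in (interview_date.year, interview_date.year - 1):
        try:
            candidate = date(year, month, day)
        except ValueError:
            # 29 February in a non-leap year: treat 28 February as the birthday.
            candidate = date(year, 2, 28)
        if candidate <= interview_date:
            return candidate
    raise ValueError("invalid birth month or day")


def select_latest_birthday(members, interview_date, min_age=15):
    """Pick the eligible member (aged min_age or older) with the latest birthday."""
    eligible = [m for m in members if m["age"] >= min_age]
    if not eligible:
        return None
    return max(eligible, key=lambda m: most_recent_birthday(m, interview_date))


# Hypothetical household; the 12-year-old is below the age-15 threshold.
household = [
    {"name": "A", "age": 40, "birth_month": 3, "birth_day": 10},
    {"name": "B", "age": 30, "birth_month": 11, "birth_day": 2},
    {"name": "C", "age": 12, "birth_month": 6, "birth_day": 1},
]
chosen = select_latest_birthday(household, date(2012, 6, 15))
print(chosen["name"])
```

Because birthdays are effectively unrelated to survey topics, this rule approximates a random draw within the household without requiring the interviewer to list every member.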
PANEL SURVEY DESIGN
Prior to 2009, United States data were collected using The Gallup Panel. The Gallup Panel is a probability-based, nationally representative panel whose members are all recruited via random-digit-dial methodology; it is used only in the United States. Participants who elect to join the panel commit to completing two to three surveys per month, with the typical survey lasting 10 to 15 minutes. The Gallup Worldwide Research panel survey is conducted over the telephone and takes approximately 30 minutes. No incentives are given to panel participants.
QUESTION DESIGN
Many of the Worldwide Research questions are items that Gallup has used for years. When developing additional questions, Gallup employed its worldwide network of research and political scientists to better understand key issues with regard to question development and construction and data gathering. Hundreds of items were developed, tested, piloted, and finalized. The best questions were retained for the core questionnaire and organized into indexes. Most items have a simple dichotomous ("yes or no") response set to minimize contamination of data because of cultural differences in response styles and to facilitate cross-cultural comparisons.
The Gallup Worldwide Research measures key indicators such as Law and Order, Food and Shelter, Job Creation, Migration, Financial Wellbeing, Personal Health, Civic Engagement, and Evaluative Wellbeing and demonstrates their correlations with world development indicators such as GDP and Brain Gain. These indicators assist leaders in understanding the broad context of national interests and establishing organization-specific correlations between leading indexes and lagging economic outcomes.
Gallup organizes its core group of indicators into the Gallup World Path. The Path is an organizational conceptualization of the seven indexes and is not to be construed as a causal model. The individual indexes have many properties of a strong theoretical framework. A more in-depth description of the questions and Gallup indexes is included in the indexes section of this document. In addition to World Path indexes, Gallup Worldwide Research questions also measure opinions about national institutions, corruption, youth development, community basics, diversity, optimism, communications, religiosity, and numerous other topics. For many regions of the world, additional questions that are specific to that region or country are included in surveys. Region-specific questions have been developed for predominantly Muslim nations, former Soviet Union countries, the Balkans, sub-Saharan Africa, Latin America, China and India, South Asia, and Israel and the Palestinian Territories.
The questionnaire is translated into the major conversational languages of each country. The translation process starts with an English, French, or Spanish version, depending on the region. One of two translation methods may be used.
METHOD 1: Two independent translations are completed. An independent third party, with some knowledge of survey research methods, adjudicates the differences. A professional translator translates the final version back into the source language.
METHOD 2: A translator
COVID-19 Trends Methodology
Our goal is to analyze and present daily updates in the form of recent trends within countries, states, or counties during the COVID-19 global pandemic. The data we are analyzing is taken directly from the Johns Hopkins University Coronavirus COVID-19 Global Cases Dashboard, though we expect to be one day behind the dashboard's live feeds to allow for quality assurance of the data.
Revisions added on 4/23/2020 are highlighted.
Revisions added on 4/30/2020 are highlighted.
Discussion of our assertion of an abundance of caution in assigning trends in rural counties added 5/7/2020.
Correction on 6/1/2020
Methodology update on 6/2/2020: This sets the length of the tail of new cases to a range of 6 to a maximum of 14 days, rather than 21 days as determined by the last 1/3 of cases. This was done to align trends and criteria for them with U.S. CDC guidance. The impact is that areas transition into the Controlled trend sooner, because they no longer bear the burden of new cases from 15-21 days earlier.
Reasons for undertaking this work:
The popular online maps and dashboards show counts of confirmed cases, deaths, and recoveries by country or administrative sub-region. Comparing the counts of one country to another can only provide a basis for comparison during the initial stages of the outbreak when counts were low and the number of local outbreaks in each country was low. By late March 2020, countries with small populations were being left out of the mainstream news because it was not easy to recognize they had high per capita rates of cases (Switzerland, Luxembourg, Iceland, etc.). 
Additionally, comparing countries that have had confirmed COVID-19 cases for high numbers of days to countries where the outbreak occurred recently is also a poor basis for comparison.
The graphs of confirmed cases and daily increases in cases were fit into a standard size rectangle, though the Y-axis for one country had a maximum value of 50, and for another country 100,000, which potentially misled people interpreting the slope of the curve. Such misleading circumstances affected comparing large population countries to small population countries, or countries with low numbers of cases to China, which had a large count of cases in the early part of the outbreak. These challenges for interpreting and comparing these graphs represent work each reader must do based on their experience and ability. Thus, we felt it would be a service to attempt to automate the thought process experts would use when visually analyzing these graphs, particularly the most recent tail of the graph, and provide readers with a resulting synthesis to characterize the state of the pandemic in that country, state, or county.
The lack of reliable data for confirmed recoveries and therefore active cases. Merely subtracting deaths from total cases to arrive at this figure progressively loses accuracy after two weeks. The reason is that 81% of cases recover after experiencing mild symptoms in 10 to 14 days. Severe cases are 14% and last 15-30 days (based on average days with symptoms of 11 when admitted to hospital plus 12 days median stay, plus one week to include a full range of severely affected people who recover). Critical cases are 5% and last 31-56 days. Sources: U.S. CDC. April 3, 2020 Interim Clinical Guidance for Management of Patients with Confirmed Coronavirus Disease (COVID-19). Accessed online. Initial older guidance was also obtained online. Additionally, many people who recover may not be tested, and many who are may not be tracked due to privacy laws. 
Thus, the formula used to compute an estimate of active cases is: Active Cases = 100% of new cases in past 14 days + 19% from past 15-30 days + 5% from past 31-56 days - total deaths.
We've never been inside a pandemic with the ability to learn of new cases as they are confirmed anywhere in the world. After reviewing epidemiological and pandemic scientific literature, three needs arose. We need to specify which portions of the pandemic lifecycle this map covers. The World Health Organization (WHO) specifies six phases. The source data for this map begins just after the beginning of Phase 5: human to human spread and encompasses Phase 6: pandemic phase. Phase six is only characterized in terms of pre- and post-peak. However, these two phases are after-the-fact analyses and cannot be ascertained during the event. Instead, we describe (below) a series of five trends for Phase 6 of the COVID-19 pandemic.
Choosing terms to describe the five trends was informed by the scientific literature, particularly the use of epidemic, which signifies uncontrolled spread. The five trends are: Emergent, Spreading, Epidemic, Controlled, and End Stage. Not every locale will experience all five, but all will experience at least three: emergent, controlled, and end stage.
This layer presents the current trends for the COVID-19 pandemic by country (or appropriate level). There are five trends:
Emergent: Early stages of outbreak.
Spreading: Early stages and depending on an administrative area's capacity, this may represent a manageable rate of spread.
Epidemic: Uncontrolled spread.
Controlled: Very low levels of new cases.
End Stage: No new cases.
These trends can be applied at several levels of administration:
Local: Ex., City, District or County – a.k.a. Admin level 2
State: Ex., State or Province – a.k.a. Admin level 1
National: Country – a.k.a. 
Admin level 0
Recommend that at least 100,000 persons be represented by a unit; granted this may not be possible, and then the case rate per 100,000 will become more important.
Key Concepts and Basis for Methodology:
10 Total Cases minimum threshold: Empirically, there must be enough cases to constitute an outbreak. Ideally, this would be 5.0 per 100,000, but not every area has a population of 100,000 or more. Ten, or fewer, cases are also relatively less difficult to track and trace to sources.
21 Days of Cases minimum threshold: Empirically based on COVID-19 and would need to be adjusted for any other event. 21 days is also the minimum threshold for analyzing the “tail” of the new cases curve, providing seven cases as the basis for a likely trend (note that 21 days in the tail is preferred). This is the minimum needed to encompass the onset and duration of a normal case (5-7 days plus 10-14 days). Specifically, a median of 5.1 days incubation time, and 11.2 days for 97.5% of cases to incubate. This is also driven by pressure to understand trends and could easily be adjusted to 28 days.
Source used as basis: Stephen A. Lauer, MS, PhD *; Kyra H. Grantz, BA *; Qifang Bi, MHS; Forrest K. Jones, MPH; Qulu Zheng, MHS; Hannah R. Meredith, PhD; Andrew S. Azman, PhD; Nicholas G. Reich, PhD; Justin Lessler, PhD. 2020. The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application. Annals of Internal Medicine DOI: 10.7326/M20-0504.
New Cases per Day (NCD) = Measures the daily spread of COVID-19. This is the basis for all rates.
Back-casting revisions: In the Johns Hopkins' data, the structure is to provide the cumulative number of cases per day, which presumes an ever-increasing sequence of numbers, e.g., 0,0,1,1,2,5,7,7,7, etc. However, revisions do occur and would look like, 0,0,1,1,2,5,7,7,6. 
To accommodate this, we revised the lists to eliminate decreases, which makes this list look like, 0,0,1,1,2,5,6,6,6.
Reporting Interval: In the early weeks, Johns Hopkins' data provided reporting every day regardless of change. In late April, this changed, allowing for days to be skipped if no new data was available. The day was still included, but the value of total cases was set to Null. The processing therefore was updated to include tracking of the spacing between intervals with valid values.
100 New Cases in a day as a spike threshold: Empirically, this is based on COVID-19's rate of spread, or R0 of ~2.5, which indicates each case will infect between two and three other people. There is a point at which each administrative area's capacity will not have the resources to trace and account for all contacts of each patient. Thus, this is an indicator of uncontrolled or epidemic trend. Spiking activity in combination with the rate of new cases is the basis for determining whether an area has a spreading or epidemic trend (see below).
Source used as basis: World Health Organization (WHO). 16-24 Feb 2020. Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19). Obtained online.
Mean of Recent Tail of NCD = Empirical, and a COVID-19-specific basis for establishing a recent trend. The recent mean of NCD is taken from the most recent fourteen days. A minimum of 21 days of cases is required for analysis but cannot be considered reliable. Thus, a preference of 42 days of cases ensures much higher reliability. This analysis is not explanatory and thus, merely represents a likely trend. The tail is analyzed for the following:
Most recent 2 days: In terms of likelihood, this does not mean much, but can indicate a reason for hope and a basis to share positive change that is not yet a trend. There are two worthwhile indicators:
Last 2 days count of new cases is less than any in either the past five or 14 days. 
Past 2 days has only one or fewer new cases – this is an extremely positive outcome if the rate of testing has continued at the same rate as the previous 5 days or 14 days.
Most recent 5 days: In terms of likelihood, this is more meaningful, as it does represent a short-term trend. There are five worthwhile indicators:
Past five days is greater than past 2 days and past 14 days indicates the potential of the past 2 days being an aberration.
Past five days is greater than past 14 days and less than past 2 days indicates slight positive trend, but likely still within peak trend time frame.
Past five days is less than the past 14 days. This means a downward trend. This would be an
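Three of the calculations described in the COVID-19 methodology above (eliminating decreases in the cumulative series, deriving New Cases per Day, and estimating active cases) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the input is assumed to be a plain list of daily cumulative counts with the most recent day last, and the Null values described under Reporting Interval are not handled.

```python
def eliminate_decreases(cumulative):
    """Back-cast revisions so the cumulative series never decreases.

    Working from the most recent day backwards, any earlier value that is
    larger than the value after it is lowered to match.
    """
    fixed = list(cumulative)
    for i in range(len(fixed) - 2, -1, -1):
        if fixed[i] > fixed[i + 1]:
            fixed[i] = fixed[i + 1]
    return fixed


def new_cases_per_day(cumulative):
    """New Cases per Day (NCD): day-over-day differences of the revised series."""
    fixed = eliminate_decreases(cumulative)
    if not fixed:
        return []
    return [fixed[0]] + [later - earlier for earlier, later in zip(fixed, fixed[1:])]


def estimate_active_cases(ncd, total_deaths):
    """Active cases = 100% of new cases in the past 14 days
    + 19% from days 15-30 + 5% from days 31-56 - total deaths."""
    recent_first = ncd[::-1]  # index 0 is the most recent day
    past_14 = sum(recent_first[:14])
    days_15_to_30 = sum(recent_first[14:30])
    days_31_to_56 = sum(recent_first[30:56])
    return 1.00 * past_14 + 0.19 * days_15_to_30 + 0.05 * days_31_to_56 - total_deaths


# The revision example from the text.
revised = eliminate_decreases([0, 0, 1, 1, 2, 5, 7, 7, 6])
print(revised)  # [0, 0, 1, 1, 2, 5, 6, 6, 6]
print(new_cases_per_day([0, 0, 1, 1, 2, 5, 7, 7, 6]))
```

The 19% and 5% weights correspond to the shares of severe (14%) plus critical (5%) cases still active in days 15-30, and of critical cases (5%) still active in days 31-56, as described in the text.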
Financial inclusion is critical in reducing poverty and achieving inclusive economic growth. When people can participate in the financial system, they are better able to start and expand businesses, invest in their children’s education, and absorb financial shocks. Yet prior to 2011, little was known about the extent of financial inclusion and the degree to which such groups as the poor, women, and rural residents were excluded from formal financial systems.
By collecting detailed indicators about how adults around the world manage their day-to-day finances, the Global Findex allows policy makers, researchers, businesses, and development practitioners to track how the use of financial services has changed over time. The database can also be used to identify gaps in access to the formal financial system and design policies to expand financial inclusion.
See Methodology document for country-specific geographic coverage details.
The target population is the civilian, non-institutionalized population 15 years and above.
Observation data/ratings [obs]
The indicators in the 2017 Global Findex database are drawn from survey data covering almost 150,000 people in 144 economies, representing more than 97 percent of the world’s population (see Table A.1 of the Global Findex Database 2017 Report for a list of the economies included). The survey was carried out over the 2017 calendar year by Gallup, Inc., as part of its Gallup World Poll, which since 2005 has annually conducted surveys of approximately 1,000 people in each of more than 160 economies and in over 150 languages, using randomly selected, nationally representative samples. The target population is the entire civilian, noninstitutionalized population age 15 and above.
Interview procedure
Surveys are conducted face to face in economies where telephone coverage represents less than 80 percent of the population or where this is the customary methodology. In most economies the fieldwork is completed in two to four weeks.
In economies where face-to-face surveys are conducted, the first stage of sampling is the identification of primary sampling units. These units are stratified by population size, geography, or both, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. To increase the probability of contact and completion, attempts are made at different times of the day and, where possible, on different days. If an interview cannot be obtained at the initial sampled household, a simple substitution method is used.
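Selection with probabilities proportional to population size can be illustrated with a short sketch. The systematic variant below is one common way to implement it; the source does not say which variant Gallup uses, and the function name, the unit names, and the populations are hypothetical.

```python
import random
from typing import List, Tuple


def pps_systematic_sample(units: List[Tuple[str, int]], n: int,
                          rng: random.Random) -> List[str]:
    """Pick n primary sampling units with probability proportional to size.

    units is a list of (name, population). A random start and a fixed step
    are laid over the cumulative population; each target lands in one unit.
    A unit larger than the step can be selected more than once.
    """
    total = sum(size for _, size in units)
    if n <= 0 or total <= 0:
        raise ValueError("need n > 0 and a positive total population")
    step = total / n
    start = rng.uniform(0, step)
    targets = [start + i * step for i in range(n)]

    selected: List[str] = []
    cumulative = 0
    idx = 0
    for name, size in units:
        cumulative += size
        while idx < n and targets[idx] < cumulative:
            selected.append(name)
            idx += 1
    # Guard against floating-point rounding on the very last target.
    while idx < n:
        selected.append(units[-1][0])
        idx += 1
    return selected


# Hypothetical primary sampling units with their populations.
psus = [("PSU-1", 1200), ("PSU-2", 300), ("PSU-3", 4500), ("PSU-4", 800)]
sample = pps_systematic_sample(psus, n=2, rng=random.Random(2017))
```

Where no population information is available, the text falls back to simple random sampling, which in this sketch would be `rng.sample` over the unit names.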
Respondents are randomly selected within the selected households. Each eligible household member is listed and the handheld survey device randomly selects the household member to be interviewed. For paper surveys, the Kish grid method is used to select the respondent. In economies where cultural restrictions dictate gender matching, respondents are randomly selected from among all eligible adults of the interviewer’s gender.
In economies where telephone interviewing is employed, random digit dialing or a nationally representative list of phone numbers is used. In most economies where cell phone penetration is high, a dual sampling frame is used. Random selection of respondents is achieved by using either the latest birthday or household enumeration method. At least three attempts are made to reach a person in each household, spread over different days and times of day.
Other [oth]
The questionnaire was designed by the World Bank, in conjunction with a Technical Advisory Board composed of leading academics, practitioners, and policy makers in the field of financial inclusion. The Bill and Melinda Gates Foundation and Gallup Inc. also provided valuable input. The questionnaire was piloted in multiple countries, using focus groups, cognitive interviews, and field testing. The questionnaire is available in more than 140 languages upon request.
Questions on cash on delivery, saving using an informal savings club or person outside the family, domestic remittances, and agricultural payments are only asked in developing economies and a few other selected countries. The question on mobile money accounts was only asked in economies that were part of the Mobile Money for the Unbanked (MMU) database of the GSMA at the time the interviews were being held.
Estimates of standard errors (which account for sampling error) vary by country and indicator. For country-specific margins of error, please refer to the Methodology section and corresponding table in Demirgüç-Kunt, Asli, Leora Klapper, Dorothe Singer, Saniya Ansar, and Jake Hess. 2018. The Global Findex Database 2017: Measuring Financial Inclusion and the Fintech Revolution. Washington, DC: World Bank.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for EMPLOYED PERSONS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Well-functioning financial systems serve a vital purpose, offering savings, credit, payment, and risk management products to people with a wide range of needs. Yet until now little had been known about the global reach of the financial sector - the extent of financial inclusion and the degree to which such groups as the poor, women, and youth are excluded from formal financial systems. Systematic indicators of the use of different financial services had been lacking for most economies.
The Global Financial Inclusion (Global Findex) database provides such indicators. This database contains the first round of Global Findex indicators, measuring how adults in more than 140 economies save, borrow, make payments, and manage risk. The data set can be used to track the effects of financial inclusion policies globally and develop a deeper and more nuanced understanding of how people around the world manage their day-to-day finances. By making it possible to identify segments of the population excluded from the formal financial sector, the data can help policy makers prioritize reforms and design new policies.
National Coverage.
Individual
The target population is the civilian, non-institutionalized population 15 years and above. The sample is nationally representative.
Sample survey data [ssd]
The Global Findex indicators are drawn from survey data collected by Gallup, Inc. over the 2011 calendar year, covering more than 150,000 adults in 148 economies and representing about 97 percent of the world's population. Since 2005, Gallup has surveyed adults annually around the world, using a uniform methodology and randomly selected, nationally representative samples. The second round of Global Findex indicators was collected in 2014 and is forthcoming in 2015. The set of indicators will be collected again in 2017.
Surveys were conducted face-to-face in economies where landline telephone penetration is less than 80 percent, or where face-to-face interviewing is customary. The first stage of sampling is the identification of primary sampling units, consisting of clusters of households. The primary sampling units are stratified by population size, geography, or both, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. If an interview cannot be obtained at the initial sampled household, a simple substitution method is used. Respondents are randomly selected within the selected households by means of the Kish grid.
Surveys were conducted by telephone in economies where landline telephone penetration is over 80 percent. The telephone surveys were conducted using random digit dialing or a nationally representative list of phone numbers. In selected countries where cell phone penetration is high, a dual sampling frame is used. Random respondent selection is achieved by using either the latest birthday or Kish grid method. At least three attempts are made to reach a person in each household, spread over different days and times of day.
The sample size in Indonesia was 1,000 individuals.
Face-to-face [f2f]
The questionnaire was designed by the World Bank, in conjunction with a Technical Advisory Board composed of leading academics, practitioners, and policy makers in the field of financial inclusion. The Bill and Melinda Gates Foundation and Gallup, Inc. also provided valuable input. The questionnaire was piloted in over 20 countries using focus groups, cognitive interviews, and field testing. The questionnaire is available in 142 languages upon request.
Questions on insurance, mobile payments, and loan purposes were asked only in developing economies. The indicators on awareness and use of microfinance institutions (MFIs) are not included in the public dataset. However, adults who report saving at an MFI are considered to have an account; this is reflected in the composite account indicator.
Estimates of standard errors (which account for sampling error) vary by country and indicator. For country- and indicator-specific standard errors, refer to the Annex and Country Table in Demirguc-Kunt, Asli and L. Klapper. 2012. "Measuring Financial Inclusion: The Global Findex." Policy Research Working Paper 6025, World Bank, Washington, D.C.
https://creativecommons.org/publicdomain/zero/1.0/
Worldwide Bureaucracy Indicators (WWBI) dataset from the World Bank.
The Worldwide Bureaucracy Indicators (WWBI) database is a unique cross-national dataset on public sector employment and wages that aims to fill an information gap, thereby helping researchers, development practitioners, and policymakers gain a better understanding of the personnel dimensions of state capability, the footprint of the public sector within the overall labor market, and the fiscal implications of the public sector wage bill. The dataset is derived from administrative data and household surveys, thereby complementing existing, expert perception-based approaches.
The World Bank introduced the dataset with a series of four blogs:
Can you replicate the figures in the blogs? Can you display any of the data more clearly than in the blogs?
wwbi_data.csv
variable | class | description |
---|---|---|
country_code | character | 3-letter ISO_3166-1 code |
indicator_code | character | code identifying the indicator of bureaucracy |
year | numeric | year of the data |
value | numeric | numeric value of the data |
wwbi_series.csv
variable | class | description |
---|---|---|
indicator_code | character | code identifying the indicator of bureaucracy |
indicator_name | character | name of the indicator |
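Because `wwbi_data.csv` stores only indicator codes, a typical first step is to attach the readable names from `wwbi_series.csv`. The sketch below does this with the standard library on small inline samples; the column names come from the tables above, while the rows, codes, and function names are hypothetical placeholders rather than real WWBI values.

```python
import csv
import io
from typing import Dict, List

# Hypothetical sample rows that mimic the documented columns.
DATA_CSV = """country_code,indicator_code,year,value
AAA,IND.ONE,2010,0.25
AAA,IND.TWO,2010,1.5
BBB,IND.ONE,2011,0.4
"""

SERIES_CSV = """indicator_code,indicator_name
IND.ONE,Hypothetical indicator one
IND.TWO,Hypothetical indicator two
"""


def load_rows(text: str) -> List[Dict[str, str]]:
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_indicator_names(data_rows, series_rows):
    """Left-join data rows to series rows on indicator_code."""
    names = {row["indicator_code"]: row["indicator_name"] for row in series_rows}
    joined = []
    for row in data_rows:
        joined.append({
            "country_code": row["country_code"],
            "indicator_code": row["indicator_code"],
            # None when a code has no matching entry in the series table.
            "indicator_name": names.get(row["indicator_code"]),
            "year": int(row["year"]),
            "value": float(row["value"]),
        })
    return joined


joined_rows = attach_indicator_names(load_rows(DATA_CSV), load_rows(SERIES_CSV))
```

With the real files, `load_rows` would read from `open("wwbi_data.csv", newline="")` instead of an inline string; the join itself is unchanged.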
wwbi_country.csv
variable | class | description |
---|---|---|
country_code | character | 3-letter ISO_3166-1 code |
short_name | character | short or common name for the country |
table_name | character | more alphabetically sortable name of the country |
long_name | character | full name of the country |
x2_alpha_code | character | 2-letter ISO_3166-1 code |
currency_unit | character | currency unit |
special_notes | character | special notes |
region | character | region |
income_group | character | low, lower middle, upper middle, or high income |
wb_2_code | character | alternate 2-letter code |
national_accounts_base_year | integer | national accounts base year |
national_accounts_reference_year | integer | national accounts reference year |
sna_price_valuation | character | UN system of national accounts price valuation |
lending_category | character | International Development Association (IDA), International Bank for Reconstruction and Development (IBRD), a blend, or neither |
other_groups | character | Heavily Indebted Poor Countries initiative (HIPC), or countries classified as the "Euro area" |
system_of_national_accounts | integer | which System of National Accounts methodology the country uses (1968, 1993, or 2008 version) |
balance_of_payments_manual_in_use | character | the version of the Balance of Payments Manual used by the country |
external_debt_reporting_status | character | estimate, preliminary, or actual |
system_of_trade | character | Under the general system imports include goods imported for domestic consumption and imports into bonded warehouses and free trade zones. Under the special system imports comprise goods imported for domestic consumption (including transformation and repair) and withdrawals for domestic consumption from bonded warehouses and free trade zones. Goods transported through a country en route to another are excluded. |
government_accounting_concept | character | government accounting concept |
imf_data_dissemination_standard | character | International Monetary Fund data-dissemination standard: Special Data Dissemination Standard (SDDS, 1996, created for countries that have or seek to have access to international markets), SDDS Plus (2012, the highest tier of data standards, intended for systemically important economies), enhanced GDDS (e-GDDS, 2015, encouraging participants to emphasize data publication) |
latest_household_survey | character | which household survey was most recently administered |
source_of_most_recent_income_and_expenditure_data | character | which survey serves as the basis for income and expenditure data |
vital_registration_complete | logical | whether the vital registration is complete |
latest_agricultural_census | integer | year of latest agricultural census |
latest_industrial_data | integer | year of latest industrial data |
latest_trade_data | in... |
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for EMPLOYMENT RATE reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
https://creativecommons.org/publicdomain/zero/1.0/
The Global Financial Inclusion Database provides 800 country-level indicators of financial inclusion summarized for all adults and disaggregated by key demographic characteristics: gender, age, education, income, and rural residence. Covering more than 140 economies, the indicators of financial inclusion measure how people save, borrow, make payments, and manage risk.
The reference citation for the data is: Demirguc-Kunt, Asli, Leora Klapper, Dorothe Singer, and Peter Van Oudheusden. 2015. “The Global Findex Database 2014: Measuring Financial Inclusion around the World.” Policy Research Working Paper 7255, World Bank, Washington, DC.
This is a dataset hosted by the World Bank. The organization has an open data platform and updates its information according to the amount of data that is brought in. Explore the World Bank using Kaggle and all of the data sources available through the World Bank organization page!
This dataset is maintained using the World Bank's APIs and Kaggle's API.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Contains data from the World Bank's data portal. There is also a consolidated country dataset on HDX.
Education is one of the most powerful instruments for reducing poverty and inequality and lays a foundation for sustained economic growth. The World Bank compiles data on education inputs, participation, efficiency, and outcomes. Data on education are compiled by the United Nations Educational, Scientific, and Cultural Organization (UNESCO) Institute for Statistics from official responses to surveys and from reports provided by education authorities in each country.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Contains data from the World Bank's data portal. There is also a consolidated country dataset on HDX.
Climate change is expected to hit developing countries the hardest. Its effects—higher temperatures, changes in precipitation patterns, rising sea levels, and more frequent weather-related disasters—pose risks for agriculture, food, and water supplies. At stake are recent gains in the fight against poverty, hunger and disease, and the lives and livelihoods of billions of people in developing countries. Addressing climate change requires unprecedented global cooperation across borders. The World Bank Group is helping support developing countries and contributing to a global solution, while tailoring our approach to the differing needs of developing country partners. Data here cover climate systems, exposure to climate impacts, resilience, greenhouse gas emissions, and energy use. Other indicators relevant to climate change are found under other data pages, particularly Environment, Agriculture & Rural Development, Energy & Mining, Health, Infrastructure, Poverty, and Urban Development.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
The "Major Cities" layer is derived from the "World Cities" dataset provided by the ArcGIS Data and Maps group as part of the global data layers made available for public use.
The "Major Cities" layer specifically contains national and provincial capitals that have the highest population within their respective countries. Cities were filtered based on STATUS (“National capital”, “National and provincial capital”, “Provincial capital”, “National capital and provincial capital enclave”, and “Other”). The majority of these cities within larger countries were filtered at the highest levels of POP_CLASS (“5,000,000 and greater” and “1,000,000 to 4,999,999”). However, China, for example, was filtered to cities of over 11 million people because it has many highly populated cities. Population approximations are sourced from US Census and UN Data.
Disclaimer: The designations employed and the presentation of material at this site do not imply the expression of any opinion whatsoever on the part of the Secretariat of the United Nations concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries.
Data publication: 2021-03-12
Citation:
Credits: ESRI, CIA World Factbook, GMI, NIMA, UN Data, UN Habitat, US Census Bureau
Contact points:
Resource Contact: ESRI - ArcGIS Data and Maps
Metadata Contact: Justeen De Ocampo
Resource constraints:
Creative Commons Attribution-NonCommercial-ShareAlike 3.0 IGO (CC BY-NC-SA 3.0 IGO)
Online resources:
World Cities layer from ArcGIS Data & Maps
ArcGIS Data and Maps group background and available datasets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MGD: Music Genre Dataset
Over recent years, the world has seen a dramatic change in the way people consume music, moving from physical records to streaming services. Since 2017, such services have become the main source of revenue within the global recorded music market. Therefore, this dataset is built using data from Spotify, which provides a weekly chart of the 200 most streamed songs for each country and territory in which it is present, as well as an aggregated global chart.
Considering that countries behave differently when it comes to musical tastes, we use chart data from global and regional markets from January 2017 to December 2019, considering eight of the top 10 music markets according to IFPI: United States (1st), Japan (2nd), United Kingdom (3rd), Germany (4th), France (5th), Canada (8th), Australia (9th), and Brazil (10th).
We also provide information about the hit songs and artists present in the charts, such as all collaborating artists within a song (since the charts only provide the main ones) and their respective genres, which is the core of this work. MGD also provides data about musical collaboration, as we build collaboration networks based on artist partnerships in hit songs. Therefore, this dataset contains:
Genre Networks: Success-based genre collaboration networks
Genre Mapping: Genre mapping from Spotify genres to super-genres
Artist Networks: Success-based artist collaboration networks
Artists: Some artist data
Hit Songs: Hit Song data and features
Charts: Enhanced data from Spotify Weekly Top 200 Charts
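The idea of building collaboration networks from artist partnerships in hit songs can be sketched in a few lines. This is an illustration only, assuming an undirected network whose edge weight counts the hit songs two artists share; the dataset's actual construction and weighting may differ, and the artist names and function name below are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Tuple


def build_collaboration_edges(hit_songs: Iterable[List[str]]) -> Counter:
    """Count, for every pair of artists, how many hit songs they share.

    Each song is given as the list of all artists credited on it.
    Pairs are stored in sorted order so (a, b) and (b, a) are one edge.
    """
    edges: Counter = Counter()
    for artists in hit_songs:
        for pair in combinations(sorted(set(artists)), 2):
            edges[pair] += 1
    return edges


# Hypothetical hit songs with their credited artists.
songs = [
    ["Artist A", "Artist B"],
    ["Artist B", "Artist A", "Artist C"],
    ["Artist C"],  # a solo song adds no edge
]
edges = build_collaboration_edges(songs)
strongest: Tuple[Tuple[str, str], int] = edges.most_common(1)[0]
```

The same function yields a genre network if each song is represented by the list of its artists' genres instead of the artists themselves.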
This dataset was originally built for a conference paper at ISMIR 2020. If you make use of the dataset, please also cite the following paper:
Gabriel P. Oliveira, Mariana O. Silva, Danilo B. Seufitelli, Anisio Lacerda, and Mirella M. Moro. Detecting Collaboration Profiles in Success-based Music Genre Networks. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR 2020), 2020.
@inproceedings{ismir/OliveiraSSLM20, title = {Detecting Collaboration Profiles in Success-based Music Genre Networks}, author = {Gabriel P. Oliveira and Mariana O. Silva and Danilo B. Seufitelli and Anisio Lacerda and Mirella M. Moro}, booktitle = {21st International Society for Music Information Retrieval Conference}, pages = {726--732}, year = {2020} }