Patterns of educational attainment vary greatly across countries, and across population groups within countries. In some countries, virtually all children complete basic education, whereas in others large groups fall short. The primary purpose of this database, and the associated research program, is to document and analyze these differences using a compilation of household-based data sets: Demographic and Health Surveys (DHS); Multiple Indicator Cluster Surveys (MICS); Living Standards Measurement Study Surveys (LSMS); as well as country-specific Integrated Household Surveys (IHS) such as Socio-Economic Surveys.
As shown at the website associated with this database, there are dramatic differences in attainment by wealth. When households are ranked according to their wealth status (or more precisely, a proxy based on the assets owned by members of the household), there are striking differences in the attainment patterns of children from the richest 20 percent compared to the poorest 20 percent.
In Mali in 2012, only 34 percent of 15 to 19 year olds in the poorest quintile had completed grade 1, whereas 80 percent of those in the richest quintile had done so. In many countries, for example Pakistan, Peru and Indonesia, almost all the children from the wealthiest households have completed at least one year of schooling. In some countries, like Mali and Pakistan, wealth gaps are evident from grade 1 on; in other countries, like Peru and Indonesia, wealth gaps emerge later in the school system.
The EdAttain website allows a visual exploration of gaps in attainment and enrollment within and across countries, based on the international database, which spans multiple years from over 120 countries and includes indicators disaggregated by wealth, gender and urban/rural location. The database underlying that site can be downloaded from here.
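The wealth-quintile comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the household records, the `asset_index` values and the function name `quintile_rates` are hypothetical, and the sketch ignores the sample weights that real survey estimates would apply.

```python
from statistics import mean

def quintile_rates(households):
    """Return the grade-1 completion rate (%) per wealth quintile, poorest first.

    households: list of (asset_index, completed_grade1) pairs; both fields are
    hypothetical stand-ins for the survey variables. Needs at least 5 records.
    """
    ranked = sorted(households, key=lambda h: h[0])  # rank by the asset proxy
    n = len(ranked)
    rates = []
    for q in range(5):
        # Slice out the q-th fifth of the ranked households.
        group = ranked[q * n // 5:(q + 1) * n // 5]
        rates.append(100 * mean(1 if done else 0 for _, done in group))
    return rates

# Hypothetical data: ten households, the three poorest have not completed grade 1.
sample = [(i, i >= 3) for i in range(10)]
rates = quintile_rates(sample)
gap = rates[-1] - rates[0]  # richest minus poorest, in percentage points
print(rates, gap)
```

Applied to the Mali figures quoted above, the same subtraction gives a gap of 80 - 34 = 46 percentage points.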
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical chart and dataset showing total population for the world by year from 1950 to 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The World Health Organization reported 766,440,796 coronavirus cases since the epidemic began. In addition, countries reported 6,932,591 coronavirus deaths. This dataset provides World Coronavirus Cases: actual values, historical data, forecast, chart, statistics, economic calendar and news.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Some say climate change is the biggest threat of our age while others say it’s a myth based on dodgy science. We are turning some of the data over to you so you can form your own view.
Even more than with other data sets that Kaggle has featured, there’s a huge amount of data cleaning and preparation that goes into putting together a long-time study of climate trends. Early data was collected by technicians using mercury thermometers, where any variation in the visit time impacted measurements. In the 1940s, the construction of airports caused many weather stations to be moved. In the 1980s, there was a move to electronic thermometers that are said to have a cooling bias.
Given this complexity, there are a range of organizations that collate climate trends data. The three most cited land and ocean temperature data sets are NOAA’s MLOST, NASA’s GISTEMP and the UK’s HadCrut.
We have repackaged the data from a newer compilation put together by Berkeley Earth, which is affiliated with Lawrence Berkeley National Laboratory. The Berkeley Earth Surface Temperature Study combines 1.6 billion temperature reports from 16 pre-existing archives. It is nicely packaged and allows for slicing into interesting subsets (for example by country). They publish the source data and the code for the transformations they applied. They also use methods that allow weather observations from shorter time series to be included, meaning fewer observations need to be thrown away.
In this dataset, we have included several files:
Global Land and Ocean-and-Land Temperatures (GlobalTemperatures.csv):
Other files include:
The raw data comes from the Berkeley Earth data page.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical chart and dataset showing World death rate by year from 1950 to 2025.
https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets
The World Development Indicators from the World Bank contain over a thousand annual indicators of economic development from hundreds of countries around the world.
Here's a list of the available indicators along with a list of the available countries.
For example, this data includes the life expectancy at birth from many countries around the world:
The dataset hosted here is a slightly transformed version of the raw files available here, reshaped to facilitate analytics.
By Correlates of War Project [source]
The World Religion Project (WRP) is an ambitious endeavor to conduct a comprehensive analysis of religious adherence throughout the world from 1945 to 2010. This cutting-edge project offers unparalleled insight into the religious behavior of people in different countries, regions, and continents during this time period. Its datasets provide important information about the numbers and percentages of adherents across a multitude of different religions, religion families, and non-religious affiliations.
The WRP consists of three distinct datasets: the national religion dataset, the regional religion dataset, and the global religion dataset. Each targets a different level of analysis, from individual states to the global system. The national dataset provides the number of adherents by state, as well as the percentage of the population practicing a given faith, in five-year increments, showing how these figures evolve from nation to nation over time. The regional dataset likewise reports data at five-year intervals by region, with one modification: Pacific Ocean states (Country Code Number 900 or above) have been reclassified into their own Oceania category. Finally, the global dataset aggregates all states to give a snapshot, at any five-year interval between 1945 and 2010, of the relationships between religions or religious families, within one location or transnationally.
This project was developed in three stages: first, forming a religion tree (a systematic classification); second, collecting data according to that classification; and last, cleaning the data so that discrepancies could be reconciled, with gaps left where values were unknown at collection. For details or further reading on applications of these datasets, please contact Zeev Maoz (University of California, Davis) and Errol A. Henderson (Pennsylvania State University).
The World Religions Project (WRP) dataset offers a comprehensive look at religious adherence around the world within a single dataset. With this dataset, you can track global religious trends over a period of 65 years and explore how they’ve changed during that time. By exploring the WRP data set, you’ll gain insight into cross-regional and cross-time patterns in religious affiliation around the world.
- Analyzing historical patterns of religious growth and decline across different regions
- Creating visualizations to compare religious adherence in various states, countries, or globally
- Studying the impact of governmental policies on religious participation over time
If you use this dataset in your research, please credit the original authors. Data Source
License: Dataset copyright by authors
- You are free to:
  - Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
  - Adapt: remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit: provide a link to the license, and indicate if changes were made.
  - ShareAlike: you must distribute your contributions under the same license as the original.
  - Keep intact: all notices that refer to this license, including copyright notices.
File: WRP regional data.csv
| Column name | Description |
|:---|:---|
| Year | Reference year for data collection. (Integer) |
| Region | World region according to Correlates Of War (COW) Regional Systemizations with one modification (Oceania category for COW country code ... |
This dataset contains counts of deaths for California as a whole based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.
The final data tables include both deaths that occurred in California regardless of the place of residence (by occurrence) and deaths to California residents (by residence), whereas the provisional data table only includes deaths that occurred in California regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.
The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.
The Global Trust dataset measures how much trust people around the world place in major institutions and social networks. It contains two data files: one with the raw survey data and one that converts the raw data into percentages of trust in particular institutions. The data come from surveys of 119,088 people in 113 countries. Survey respondents were asked questions such as “How much do you trust each of the following: other people in your neighborhood; your national government; scientists; journalists; doctors and nurses; people who work at non-governmental or non-profit organizations; healers? Do you trust them a lot, some, not much, or not at all?” The National Trust Codebook contains both the survey and the national rate codebook files, titled “Survey” and “Rate” respectively. Both files contain the same variables, such as neighbors, government, and journalists; the only difference is that “Survey” also has id as a variable to identify the 119,088 unique responses. The Survey file holds the raw data for all 119,088 responses, with both categorical and ordinal variables. It can be used to analyze how much people in different countries trust different people or institutions, and how those variables relate to each other. The Rate file gives, for each country, the percentage of people who trust each community or institution, which makes comparisons between countries clearer than with the raw data. Both files are unusual in being worldwide: respondents in different countries were asked the same questions with the same methodology, which makes comparison much easier. Another notable feature of this survey is the number of responses per nation: at least 1,000 responses were gathered from each nation featured, a sample size that allows better-than-typical representation for each country.
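The step from raw survey answers to per-country percentages can be illustrated as follows. This is a sketch, not the dataset authors' method: the `country` field name, the sample rows, and especially the choice to count "a lot" and "some" as trusting answers are assumptions made here for illustration.

```python
from collections import defaultdict

# Assumption: which answers count as "trust"; the source does not define this.
TRUSTING = {"a lot", "some"}

def trust_rates(responses, institution):
    """Return {country: % of respondents giving a trusting answer}.

    responses: iterable of dicts with a 'country' key (assumed name) and one
    key per institution, e.g. 'government' or 'journalists'.
    """
    totals = defaultdict(int)
    trusting = defaultdict(int)
    for row in responses:
        answer = row.get(institution)
        if answer is None:  # skip respondents with no answer recorded
            continue
        totals[row["country"]] += 1
        if answer in TRUSTING:
            trusting[row["country"]] += 1
    return {c: 100 * trusting[c] / totals[c] for c in totals}

# Hypothetical survey rows.
survey = [
    {"id": 1, "country": "A", "government": "a lot"},
    {"id": 2, "country": "A", "government": "not at all"},
    {"id": 3, "country": "B", "government": "some"},
]
print(trust_rates(survey, "government"))
```

A different cut-off (for example, counting only "a lot") would change every percentage, so the definition should be checked against the codebook before comparing with the Rate file.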
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘COVID vaccination vs. mortality’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/sinakaraji/covid-vaccination-vs-death on 12 November 2021.
--- Dataset description provided by original source is as follows ---
The COVID-19 outbreak has brought the whole planet to its knees. More than 4.5 million people had died as of the writing of this notebook, and the only acceptable way out of the disaster is to vaccinate all parts of society. Despite the fact that the benefits of vaccination have been proved to the world many times, anti-vaccine groups are springing up all over the world. This data set was generated to investigate the impact of coronavirus vaccinations on coronavirus mortality.
country | iso_code | date | total_vaccinations | people_vaccinated | people_fully_vaccinated | New_deaths | population | ratio |
---|---|---|---|---|---|---|---|---|
country name | ISO code for each country | date to which the data belongs | total number of COVID vaccine doses administered in that country | number of people who received at least one dose of COVID vaccine | number of people who are fully vaccinated | number of daily new deaths | 2021 country population | % of vaccinations in that country at that date = people_vaccinated/population * 100 |
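As a small worked example of the `ratio` column defined in the table above, the sketch below recomputes it from `people_vaccinated` and `population`. The function name and the sample row are hypothetical; only the formula comes from the column description.

```python
def vaccination_ratio(people_vaccinated, population):
    """Percent of the population with at least one dose:
    people_vaccinated / population * 100, as in the column description."""
    if population <= 0:
        raise ValueError("population must be positive")
    return people_vaccinated / population * 100

# Hypothetical row using the column names from the table above.
row = {"country": "Exampleland", "people_vaccinated": 500_000, "population": 2_000_000}
row["ratio"] = vaccination_ratio(row["people_vaccinated"], row["population"])
print(row["ratio"])  # 25.0
```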
This dataset is a combination of the following three datasets:
1. https://www.kaggle.com/gpreda/covid-world-vaccination-progress
2. https://covid19.who.int/WHO-COVID-19-global-data.csv
3. https://www.kaggle.com/rsrishav/world-population
You can find more detail about this dataset by reading this notebook:
https://www.kaggle.com/sinakaraji/simple-linear-regression-covid-vaccination
Afghanistan | Albania | Algeria | Andorra | Angola |
Anguilla | Antigua and Barbuda | Argentina | Armenia | Aruba |
Australia | Austria | Azerbaijan | Bahamas | Bahrain |
Bangladesh | Barbados | Belarus | Belgium | Belize |
Benin | Bermuda | Bhutan | Bolivia (Plurinational State of) | Brazil |
Bosnia and Herzegovina | Botswana | Brunei Darussalam | Bulgaria | Burkina Faso |
Cambodia | Cameroon | Canada | Cabo Verde | Cayman Islands |
Central African Republic | Chad | Chile | China | Colombia |
Comoros | Cook Islands | Costa Rica | Croatia | Cuba |
Curaçao | Cyprus | Denmark | Djibouti | Dominica |
Dominican Republic | Ecuador | Egypt | El Salvador | Equatorial Guinea |
Estonia | Ethiopia | Falkland Islands (Malvinas) | Fiji | Finland |
France | French Polynesia | Gabon | Gambia | Georgia |
Germany | Ghana | Gibraltar | Greece | Greenland |
Grenada | Guatemala | Guinea | Guinea-Bissau | Guyana |
Haiti | Honduras | Hungary | Iceland | India |
Indonesia | Iran (Islamic Republic of) | Iraq | Ireland | Isle of Man |
Israel | Italy | Jamaica | Japan | Jordan |
Kazakhstan | Kenya | Kiribati | Kuwait | Kyrgyzstan |
Lao People's Democratic Republic | Latvia | Lebanon | Lesotho | Liberia |
Libya | Liechtenstein | Lithuania | Luxembourg | Madagascar |
Malawi | Malaysia | Maldives | Mali | Malta |
Mauritania | Mauritius | Mexico | Republic of Moldova | Monaco |
Mongolia | Montenegro | Montserrat | Morocco | Mozambique |
Myanmar | Namibia | Nauru | Nepal | Netherlands |
New Caledonia | New Zealand | Nicaragua | Niger | Nigeria |
Niue | North Macedonia | Norway | Oman | Pakistan |
occupied Palestinian territory, including east Jerusalem |
Panama | Papua New Guinea | Paraguay | Peru | Philippines |
Poland | Portugal | Qatar | Romania | Russian Federation |
Rwanda | Saint Kitts and Nevis | Saint Lucia |
Saint Vincent and the Grenadines | Samoa | San Marino | Sao Tome and Principe | Saudi Arabia |
Senegal | Serbia | Seychelles | Sierra Leone | Singapore |
Slovakia | Slovenia | Solomon Islands | Somalia | South Africa |
Republic of Korea | South Sudan | Spain | Sri Lanka | Sudan |
Suriname | Sweden | Switzerland | Syrian Arab Republic | Tajikistan |
United Republic of Tanzania | Thailand | Togo | Tonga | Trinidad and Tobago |
Tunisia | Turkey | Turkmenistan | Turks and Caicos Islands | Tuvalu |
Uganda | Ukraine | United Arab Emirates | The United Kingdom | United States of America |
Uruguay | Uzbekistan | Vanuatu | Venezuela (Bolivarian Republic of) | Viet Nam |
Wallis and Futuna | Yemen | Zambia | Zimbabwe |
--- Original source retains full ownership of the source dataset ---
The total amount of data created, captured, copied, and consumed globally is forecast to increase rapidly, reaching *** zettabytes in 2024. Over the next five years up to 2028, global data creation is projected to grow to more than *** zettabytes. In 2020, the amount of data created and replicated reached a new high. The growth was higher than previously expected, caused by the increased demand due to the COVID-19 pandemic, as more people worked and learned from home and used home entertainment options more often.
Storage capacity also growing
Only a small percentage of this newly created data is kept though, as just * percent of the data produced and consumed in 2020 was saved and retained into 2021. In line with the strong growth of the data volume, the installed base of storage capacity is forecast to increase, growing at a compound annual growth rate of **** percent over the forecast period from 2020 to 2025. In 2020, the installed base of storage capacity reached *** zettabytes.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical chart and dataset showing World life expectancy by year from 1950 to 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset includes images of people from all around the world. I noticed that, surprisingly, there was no model on the internet that could simply detect gender (male and female). The ones that could did not perform well, and the datasets were too big and hard to download, or the models were having API issues, at least for me. This dataset is very carefully annotated, and you can use it with a YOLO object detection model to build a reliable detector for male and female.
Well-functioning financial systems serve a vital purpose, offering savings, credit, payment, and risk management products to people with a wide range of needs. Yet until now little had been known about the global reach of the financial sector - the extent of financial inclusion and the degree to which such groups as the poor, women, and youth are excluded from formal financial systems. Systematic indicators of the use of different financial services had been lacking for most economies.
The Global Financial Inclusion (Global Findex) database provides such indicators. This database contains the first round of Global Findex indicators, measuring how adults in more than 140 economies save, borrow, make payments, and manage risk. The data set can be used to track the effects of financial inclusion policies globally and develop a deeper and more nuanced understanding of how people around the world manage their day-to-day finances. By making it possible to identify segments of the population excluded from the formal financial sector, the data can help policy makers prioritize reforms and design new policies.
National Coverage.
Individual
The target population is the civilian, non-institutionalized population 15 years and above. The sample is nationally representative.
Sample survey data [ssd]
The Global Findex indicators are drawn from survey data collected by Gallup, Inc. over the 2011 calendar year, covering more than 150,000 adults in 148 economies and representing about 97 percent of the world's population. Since 2005, Gallup has surveyed adults annually around the world, using a uniform methodology and randomly selected, nationally representative samples. The second round of Global Findex indicators was collected in 2014 and is forthcoming in 2015. The set of indicators will be collected again in 2017.
Surveys were conducted face-to-face in economies where landline telephone penetration is less than 80 percent, or where face-to-face interviewing is customary. The first stage of sampling is the identification of primary sampling units, consisting of clusters of households. The primary sampling units are stratified by population size, geography, or both, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. If an interview cannot be obtained at the initial sampled household, a simple substitution method is used. Respondents are randomly selected within the selected households by means of the Kish grid.
Surveys were conducted by telephone in economies where landline telephone penetration is over 80 percent. The telephone surveys were conducted using random digit dialing or a nationally representative list of phone numbers. In selected countries where cell phone penetration is high, a dual sampling frame is used. Random respondent selection is achieved by using either the latest birthday or Kish grid method. At least three attempts are made to reach a person in each household, spread over different days and times of year.
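The mode rule and the first-stage selection described above can be sketched as follows. This is an illustrative simplification, not Gallup's procedure: the function names are hypothetical, stratification and multi-stage clustering are omitted, and the proportional-to-size draw is done with replacement for brevity.

```python
import random

def interview_mode(landline_penetration_pct, face_to_face_customary=False):
    """Pick the survey mode from the thresholds quoted in the text.
    The text does not say what happens at exactly 80 percent; this sketch
    treats it as telephone, which is an assumption."""
    if landline_penetration_pct < 80 or face_to_face_customary:
        return "face-to-face"
    return "telephone"

def select_psus(units, k, sizes=None, rng=None):
    """Select k primary sampling units.

    With population sizes available, draw with probability proportional to
    size (here with replacement, a simplification); otherwise fall back to
    simple random sampling without replacement.
    """
    rng = rng or random.Random()
    if sizes is None:
        return rng.sample(units, k)
    return rng.choices(units, weights=sizes, k=k)

clusters = ["c1", "c2", "c3", "c4", "c5"]            # hypothetical clusters
print(interview_mode(35), interview_mode(95))
print(select_psus(clusters, 3, sizes=[100, 400, 250, 50, 200]))
```

Because the proportional draw is with replacement, the same cluster can appear more than once; a production design would typically use a without-replacement scheme such as systematic selection.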
The sample size in Latvia was 1,006 individuals.
Face-to-face [f2f]
The questionnaire was designed by the World Bank, in conjunction with a Technical Advisory Board composed of leading academics, practitioners, and policy makers in the field of financial inclusion. The Bill and Melinda Gates Foundation and Gallup, Inc. also provided valuable input. The questionnaire was piloted in over 20 countries using focus groups, cognitive interviews, and field testing. The questionnaire is available in 142 languages upon request.
Questions on insurance, mobile payments, and loan purposes were asked only in developing economies. The indicators on awareness and use of microfinance institutions (MFIs) are not included in the public dataset. However, adults who report saving at an MFI are considered to have an account; this is reflected in the composite account indicator.
Estimates of standard errors (which account for sampling error) vary by country and indicator. For country- and indicator-specific standard errors, refer to the Annex and Country Table in Demirguc-Kunt, Asli and L. Klapper. 2012. "Measuring Financial Inclusion: The Global Findex." Policy Research Working Paper 6025, World Bank, Washington, D.C.
Well-functioning financial systems serve a vital purpose, offering savings, credit, payment, and risk management products to people with a wide range of needs. Yet until now little had been known about the global reach of the financial sector - the extent of financial inclusion and the degree to which such groups as the poor, women, and youth are excluded from formal financial systems. Systematic indicators of the use of different financial services had been lacking for most economies.
The Global Financial Inclusion (Global Findex) database provides such indicators. This database contains the first round of Global Findex indicators, measuring how adults in more than 140 economies save, borrow, make payments, and manage risk. The data set can be used to track the effects of financial inclusion policies globally and develop a deeper and more nuanced understanding of how people around the world manage their day-to-day finances. By making it possible to identify segments of the population excluded from the formal financial sector, the data can help policy makers prioritize reforms and design new policies.
National Coverage.
Individual
The target population is the civilian, non-institutionalized population 15 years and above.
Sample survey data [ssd]
The Global Findex indicators are drawn from survey data collected by Gallup, Inc. over the 2011 calendar year, covering more than 150,000 adults in 148 economies and representing about 97 percent of the world's population. Since 2005, Gallup has surveyed adults annually around the world, using a uniform methodology and randomly selected, nationally representative samples. The second round of Global Findex indicators was collected in 2014 and is forthcoming in 2015. The set of indicators will be collected again in 2017.
Surveys were conducted face-to-face in economies where landline telephone penetration is less than 80 percent, or where face-to-face interviewing is customary. The first stage of sampling is the identification of primary sampling units, consisting of clusters of households. The primary sampling units are stratified by population size, geography, or both, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. If an interview cannot be obtained at the initial sampled household, a simple substitution method is used. Respondents are randomly selected within the selected households by means of the Kish grid.
Surveys were conducted by telephone in economies where landline telephone penetration is over 80 percent. The telephone surveys were conducted using random digit dialing or a nationally representative list of phone numbers. In selected countries where cell phone penetration is high, a dual sampling frame is used. Random respondent selection is achieved by using either the latest birthday or Kish grid method. At least three attempts are made to reach a person in each household, spread over different days and times of year.
The sample size in Afghanistan was 1,000 individuals. Gender-matched sampling was used during the final stage of selection.
Face-to-face [f2f]
The questionnaire was designed by the World Bank, in conjunction with a Technical Advisory Board composed of leading academics, practitioners, and policy makers in the field of financial inclusion. The Bill and Melinda Gates Foundation and Gallup, Inc. also provided valuable input. The questionnaire was piloted in over 20 countries using focus groups, cognitive interviews, and field testing. The questionnaire is available in 142 languages upon request.
Questions on insurance, mobile payments, and loan purposes were asked only in developing economies. The indicators on awareness and use of microfinance institutions (MFIs) are not included in the public dataset. However, adults who report saving at an MFI are considered to have an account; this is reflected in the composite account indicator.
Estimates of standard errors (which account for sampling error) vary by country and indicator. For country- and indicator-specific standard errors, refer to the Annex and Country Table in Demirguc-Kunt, Asli and L. Klapper. 2012. "Measuring Financial Inclusion: The Global Findex." Policy Research Working Paper 6025, World Bank, Washington, D.C.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Language | Number of Samples |
Java | 153,119 |
Ruby | 233,710 |
Go | 137,998 |
JavaScript | 373,598 |
Python | 472,469 |
PHP | 294,394 |
[Edit 12/09/2020] The files below now contain only the last 30 days, because too many people ignore the request not to fetch the dataset too often (there is no point in fetching it every minute when the file changes only 4 or 5 times a day). If you want access to the entire history, contact me.
[Edit 31/03/2020] Since yesterday, I retrieve the current day's data from the CSSE, so same-day data are now available and updated several times a day (about every hour) as new figures come in from around the world. The previous day's data is always consolidated around 2 a.m. (no longer 1 a.m. since the time change). If you only want complete data, simply ignore the last day (today’s date).
Here I share the data I compile from the well-known coronavirus infection world map created and maintained by The Johns Hopkins University, which I use to display **CoronaVirus statistics worldwide and by country**. They publish each day’s data every night in a GitHub repository. My tools compile this new data as soon as it is available, and I share the result here. This data is used to display tables and graphs on the CoronaVirus (Covid19) website of Politologue.com: https://coronavirus.politologue.com/
This data will let you make your own graphs and analyses if you are interested in the subject. It is not required, but if my compilation helped you and saved you time, a link to https://coronavirus.politologue.com/ would be appreciated.
Information in files (csv and json):
- Number of cases
- Number of deaths
- Number of recoveries
- Death rate (percentage)
- Recovery rate (percentage)
- Infection rate (persons still infected, not deceased or recovered) (percentage)
- For data by country, a field “country”
If you integrate the json or csv client-side in a site or application, please keep a cache on your own servers to avoid putting an unexpected load on mine.
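The three percentage fields (death, healing or recovery, and infection rate) can presumably be derived from the three counts. The sketch below shows one plausible derivation; it assumes every rate uses the total number of cases as its denominator, which the description does not state, and the function and key names are hypothetical rather than the actual field names in the files.

```python
def covid_rates(cases, deaths, recovered):
    """Derive the three percentages from raw counts.

    Assumption: each rate is a share of total confirmed cases. 'Infection
    rate' follows the description: persons still infected, i.e. neither
    deceased nor recovered.
    """
    if cases <= 0:
        raise ValueError("cases must be positive")
    still_infected = cases - deaths - recovered
    return {
        "death_rate": 100 * deaths / cases,
        "recovery_rate": 100 * recovered / cases,
        "infection_rate": 100 * still_infected / cases,
    }

# Hypothetical counts for one country on one day.
print(covid_rates(cases=1000, deaths=50, recovered=700))
```

Under this assumption the three rates always sum to 100 percent, which gives a quick consistency check against the published files.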
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Abstract
Recognizing the value of open-source research databases in advancing the art and science of HVAC, in 2014 the ASHRAE Global Thermal Comfort Database II project was launched under the leadership of University of California at Berkeley’s Center for the Built Environment and The University of Sydney’s Indoor Environmental Quality (IEQ) Laboratory. The ASHRAE Global Thermal Comfort Database II (as it is known) is intended to support diverse inquiries about thermal comfort in field settings. The exercise began with a systematic collection and harmonization of raw data from the last two decades of thermal comfort field studies around the world. The final database is comprised of field studies from around the world, with contributors releasing their raw data to the project for wider dissemination to the thermal comfort research community. After the quality-assurance process, there was a total of 77,304 rows of data of paired subjective comfort votes and objective instrumental measurements of thermal comfort parameters. An additional 25,288 rows of data from the original ASHRAE RP-884 database are included. The most recent update (version 2.1) has 6,441 new rows of data, bringing the total number of entries to 109,033. The project was partially performed within the framework of the International Energy Agency Energy in Buildings and Communities programme (IEA-EBC) Annex 69 "Strategy and Practice of Adaptive Thermal Comfort in Low Energy Buildings".
Methods
In order to ensure that the quality of the database would permit end-users to conduct robust hypothesis testing, the team built the data collection methodology on specific requirements, as follows: Data needed to come from field experiments rather than climate chamber research, so that it represented research conducted in “real” buildings occupied by “real” people doing their normal day-to-day activities, rather than paid college students sitting in a controlled indoor environment of a climate chamber. Both instrumental (indoor climatic) and subjective (questionnaire) data were required, such that they were recorded in the same space at the same time. The database needed to be built up from the raw data files generated by the original researchers, instead of their processed or published findings. The raw data needed to come with a supporting codebook explaining the coding conventions used by the data contributor, to allow harmonization with the standardized data formatting within the database. Data must have been published either in a peer-reviewed journal or conference paper.
All datasets from individual studies were subject to a stringent quality assurance process before being assimilated into the database. The research team conducted a final validation by first comparing each raw dataset with its related publication provided by the data contributor to prevent transmission errors. Systematic quality control of each study was performed to ensure that records within the database were reasonable. Firstly, distributions of each variable were visualized to identify aberrant values. Then, cross-plots between two variables (e.g. thermal sensation and thermal comfort) were used to check for incorrectly coded data. Finally, a few rows from each study were randomly selected to verify consistency between the original dataset and the standardized database.
Since the data came from multiple independent studies, not every record included all of the thermal comfort variables. Where data were missing, that particular range of cells was filled with a null value.
Usage notes
The dataset is separated into a metadata table and a measurements table. The metadata table has high-level information at the building level and is provided as a .csv file. The measurements table contains all field measurements and is provided as a compressed comma-separated value (.csv.gz) file using UTF-8 character encoding. The first row contains human-readable column headers. Each row represents an individual’s questionnaire responses, and the associated instrumental measurements, thermal index values and outdoor meteorological observations where available. Full details can be found in the readme document.
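As a rough illustration of the file format described above, the measurements file can be read with the Python standard library alone. This is a minimal sketch under stated assumptions: the function name is hypothetical, the actual file name must be taken from the readme, and it assumes that missing values appear as empty cells, which the description ("a null value") does not specify.

```python
import csv
import gzip


def read_measurements(path):
    """Yield one dict per row from a gzip-compressed, UTF-8 encoded CSV file.

    The first row is used as the column headers. Empty cells are returned
    as None (an assumption about how missing values are encoded).
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        for row in csv.DictReader(handle):
            yield {key: (value if value != "" else None) for key, value in row.items()}
```

All values are returned as strings, so numeric columns still need to be converted by the caller once the column names have been checked against the readme.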
Well-functioning financial systems serve a vital purpose, offering savings, credit, payment, and risk management products to people with a wide range of needs. Yet until now little had been known about the global reach of the financial sector - the extent of financial inclusion and the degree to which such groups as the poor, women, and youth are excluded from formal financial systems. Systematic indicators of the use of different financial services had been lacking for most economies.
The Global Financial Inclusion (Global Findex) database provides such indicators. This database contains the first round of Global Findex indicators, measuring how adults in more than 140 economies save, borrow, make payments, and manage risk. The data set can be used to track the effects of financial inclusion policies globally and develop a deeper and more nuanced understanding of how people around the world manage their day-to-day finances. By making it possible to identify segments of the population excluded from the formal financial sector, the data can help policy makers prioritize reforms and design new policies.
National Coverage.
Individual
The target population is the civilian, non-institutionalized population 15 years and above. The sample includes only Saudi Arabians and Arab expatriates. The excluded population represents approximately 20% of the total adult population.
Sample survey data [ssd]
The Global Findex indicators are drawn from survey data collected by Gallup, Inc. over the 2011 calendar year, covering more than 150,000 adults in 148 economies and representing about 97 percent of the world's population. Since 2005, Gallup has surveyed adults annually around the world, using a uniform methodology and randomly selected, nationally representative samples. The second round of Global Findex indicators was collected in 2014 and is forthcoming in 2015. The set of indicators will be collected again in 2017.
Surveys were conducted face-to-face in economies where landline telephone penetration is less than 80 percent, or where face-to-face interviewing is customary. The first stage of sampling is the identification of primary sampling units, consisting of clusters of households. The primary sampling units are stratified by population size, geography, or both, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. If an interview cannot be obtained at the initial sampled household, a simple substitution method is used. Respondents are randomly selected within the selected households by means of the Kish grid.
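The probability-proportional-to-size selection mentioned above can be illustrated with a systematic PPS draw over a list of clusters. This is a minimal sketch and not Gallup's actual procedure: the function name, the example cluster sizes, and the choice of the systematic variant are all assumptions made for illustration.

```python
import random


def pps_systematic_sample(sizes, n, rng=None):
    """Select n unit indices with probability proportional to size.

    Uses systematic sampling along the cumulative size scale: a random
    start in [0, step) followed by equally spaced selection points.
    A unit larger than the step can be selected more than once.
    """
    rng = rng or random.Random()
    total = sum(sizes)
    if n <= 0 or total <= 0:
        raise ValueError("n and the total size must be positive")
    step = total / n
    start = rng.random() * step  # random() is in [0, 1), so start < step
    picks = []
    cumulative = 0.0  # total size of all units before the current one
    index = 0
    for k in range(n):
        target = start + k * step
        # Advance to the unit whose size interval contains the target point.
        while index < len(sizes) - 1 and cumulative + sizes[index] <= target:
            cumulative += sizes[index]
            index += 1
        picks.append(index)
    return picks


# Hypothetical populations of five primary sampling units.
cluster_sizes = [1200, 300, 4500, 800, 2200]
print(pps_systematic_sample(cluster_sizes, n=2, rng=random.Random(42)))
```

Larger clusters cover a longer stretch of the cumulative scale and are therefore more likely to contain a selection point, which is what makes the draw proportional to size.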
Surveys were conducted by telephone in economies where landline telephone penetration is over 80 percent. The telephone surveys were conducted using random digit dialing or a nationally representative list of phone numbers. In selected countries where cell phone penetration is high, a dual sampling frame is used. Random respondent selection is achieved by using either the latest birthday or Kish grid method. At least three attempts are made to reach a person in each household, spread over different days and times of year.
The sample size in the majority of economies was 1,000 individuals.
Face-to-face [f2f]
The questionnaire was designed by the World Bank, in conjunction with a Technical Advisory Board composed of leading academics, practitioners, and policy makers in the field of financial inclusion. The Bill and Melinda Gates Foundation and Gallup, Inc. also provided valuable input. The questionnaire was piloted in over 20 countries using focus groups, cognitive interviews, and field testing. The questionnaire is available in 142 languages upon request.
Questions on insurance, mobile payments, and loan purposes were asked only in developing economies. The indicators on awareness and use of microfinance institutions (MFIs) are not included in the public dataset. However, adults who report saving at an MFI are considered to have an account; this is reflected in the composite account indicator.
Estimates of standard errors (which account for sampling error) vary by country and indicator. For country- and indicator-specific standard errors, refer to the Annex and Country Table in Demirguc-Kunt, Asli and L. Klapper. 2012. "Measuring Financial Inclusion: The Global Findex." Policy Research Working Paper 6025, World Bank, Washington, D.C.
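For a rough sense of scale before consulting the published tables, the sampling error of a proportion estimated from 1,000 respondents can be approximated with the simple-random-sampling formula. This is only a back-of-the-envelope sketch: the function names and the `design_effect` parameter are hypothetical, and the published standard errors reflect the actual survey design and will generally differ.

```python
import math


def srs_standard_error(p, n):
    """Standard error of a proportion p under simple random sampling of size n."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


def margin_of_error(p, n, z=1.96, design_effect=1.0):
    """Approximate 95% margin of error, optionally inflated by a design effect.

    design_effect=1.0 means no adjustment for clustering or weighting.
    """
    return z * srs_standard_error(p, n) * math.sqrt(design_effect)


# Worst case (p = 0.5) with the typical sample size of 1,000 individuals.
print(round(srs_standard_error(0.5, 1000), 4))
print(round(margin_of_error(0.5, 1000), 4))
```

With p = 0.5 and n = 1,000 this gives a margin of roughly 3.1 percentage points before any design effect is applied.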
Well-functioning financial systems serve a vital purpose, offering savings, credit, payment, and risk management products to people with a wide range of needs. Yet until now little had been known about the global reach of the financial sector - the extent of financial inclusion and the degree to which such groups as the poor, women, and youth are excluded from formal financial systems. Systematic indicators of the use of different financial services had been lacking for most economies.
The Global Financial Inclusion (Global Findex) database provides such indicators. This database contains the first round of Global Findex indicators, measuring how adults in more than 140 economies save, borrow, make payments, and manage risk. The data set can be used to track the effects of financial inclusion policies globally and develop a deeper and more nuanced understanding of how people around the world manage their day-to-day finances. By making it possible to identify segments of the population excluded from the formal financial sector, the data can help policy makers prioritize reforms and design new policies.
See Methodology document for country-specific geographic coverage details.
Individual
The target population is the civilian, non-institutionalized population 15 years and above.
Sample survey data [ssd]
The Global Findex indicators are drawn from survey data collected by Gallup, Inc. over the 2011 calendar year, covering more than 150,000 adults in 148 economies and representing about 97 percent of the world's population. Since 2005, Gallup has surveyed adults annually around the world, using a uniform methodology and randomly selected, nationally representative samples. The second round of Global Findex indicators was collected in 2014 and is forthcoming in 2015. The set of indicators will be collected again in 2017.
Surveys were conducted face-to-face in economies where landline telephone penetration is less than 80 percent, or where face-to-face interviewing is customary. The first stage of sampling is the identification of primary sampling units, consisting of clusters of households. The primary sampling units are stratified by population size, geography, or both, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. If an interview cannot be obtained at the initial sampled household, a simple substitution method is used. Respondents are randomly selected within the selected households by means of the Kish grid.
Surveys were conducted by telephone in economies where landline telephone penetration is over 80 percent. The telephone surveys were conducted using random digit dialing or a nationally representative list of phone numbers. In selected countries where cell phone penetration is high, a dual sampling frame is used. Random respondent selection is achieved by using either the latest birthday or Kish grid method. At least three attempts are made to reach a person in each household, spread over different days and times of year.
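The latest-birthday method mentioned above can be sketched as follows: among the eligible household members, the respondent is the one whose birthday fell most recently on or before the interview date. This is an illustrative sketch only; the function names, the member tuples, and the handling of 29 February are assumptions, and the fieldwork rules actually used are not given in this description.

```python
import calendar
from datetime import date


def _birthday_in_year(month, day, year):
    """Return the birthday date in a given year.

    Assumption: a 29 February birthday is observed on 28 February
    in non-leap years.
    """
    if month == 2 and day == 29 and not calendar.isleap(year):
        return date(year, 2, 28)
    return date(year, month, day)


def latest_birthday_respondent(members, today):
    """Pick the member whose birthday occurred most recently.

    members: list of (name, birth_month, birth_day) for eligible adults.
    """
    def days_since_birthday(member):
        _, month, day = member
        birthday = _birthday_in_year(month, day, today.year)
        if birthday > today:
            # Birthday has not happened yet this year: use last year's.
            birthday = _birthday_in_year(month, day, today.year - 1)
        return (today - birthday).days

    return min(members, key=days_since_birthday)[0]


# Hypothetical household of three eligible adults.
household = [("Ana", 3, 15), ("Ben", 11, 2), ("Caro", 7, 30)]
print(latest_birthday_respondent(household, date(2011, 8, 1)))
```

Because birthdays are spread roughly evenly over the year, this rule approximates a random pick within the household without requiring the interviewer to enumerate and rank every member, as a Kish grid does.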
The sample size in the majority of economies was 1,000 individuals.
Face-to-face [f2f] OR Landline telephone OR Landline and cellular telephone
The questionnaire was designed by the World Bank, in conjunction with a Technical Advisory Board composed of leading academics, practitioners, and policy makers in the field of financial inclusion. The Bill and Melinda Gates Foundation and Gallup, Inc. also provided valuable input. The questionnaire was piloted in over 20 countries using focus groups, cognitive interviews, and field testing. The questionnaire is available in 142 languages upon request.
Questions on insurance, mobile payments, and loan purposes were asked only in developing economies. The indicators on awareness and use of microfinance institutions (MFIs) are not included in the public dataset. However, adults who report saving at an MFI are considered to have an account; this is reflected in the composite account indicator.
Estimates of standard errors (which account for sampling error) vary by country and indicator. For country- and indicator-specific standard errors, refer to the Annex and Country Table in Demirguc-Kunt, Asli and L. Klapper. 2012. "Measuring Financial Inclusion: The Global Findex." Policy Research Working Paper 6025, World Bank, Washington, D.C.
Patterns of educational attainment vary greatly across countries, and across population groups within countries. In some countries, virtually all children complete basic education whereas in others large groups fall short. The primary purpose of this database, and the associated research program, is to document and analyze these differences using a compilation of a variety of household-based data sets: Demographic and Health Surveys (DHS); Multiple Indicator Cluster Surveys (MICS); Living Standards Measurement Study Surveys (LSMS); as well as country-specific Integrated Household Surveys (IHS) such as Socio-Economic Surveys. As shown at the website associated with this database, there are dramatic differences in attainment by wealth. When households are ranked according to their wealth status (or more precisely, a proxy based on the assets owned by members of the household) there are striking differences in the attainment patterns of children from the richest 20 percent compared to the poorest 20 percent. In Mali in 2012 only 34 percent of 15- to 19-year-olds in the poorest quintile had completed grade 1, whereas 80 percent of the richest quintile had done so. In many countries, for example Pakistan, Peru and Indonesia, almost all the children from the wealthiest households have completed at least one year of schooling. In some countries, like Mali and Pakistan, wealth gaps are evident from grade 1 on; in other countries, like Peru and Indonesia, wealth gaps emerge later in the school system. The EdAttain website allows a visual exploration of gaps in attainment and enrollment within and across countries, based on the international database which spans multiple years from over 120 countries and includes indicators disaggregated by wealth, gender and urban/rural location. The database underlying that site can be downloaded from here.