MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides population data for all 195 recognized countries over a span of 11 years (2014–2024). Each country's population is projected using a base population from 2014 and realistic annual growth rates ranging between 0.5% and 3%. This dataset is ideal for demographic studies, trend analysis, and data visualization.
The population figures are simulated but based on reasonable assumptions about growth rates to ensure a realistic representation.
The world population surpassed eight billion people in 2022, having doubled from its figure less than 50 years previously. Looking forward, it is projected that the world population will reach nine billion in 2038, and 10 billion in 2060, but it will peak around 10.3 billion in the 2080s before it then goes into decline. Regional variations The global population has seen rapid growth since the early 1800s, due to advances in areas such as food production, healthcare, water safety, education, and infrastructure, however, these changes did not occur at a uniform time or pace across the world. Broadly speaking, the first regions to undergo their demographic transitions were Europe, North America, and Oceania, followed by Latin America and Asia (although Asia's development saw the greatest variation due to its size), while Africa was the last continent to undergo this transformation. Because of these differences, many so-called "advanced" countries are now experiencing population decline, particularly in Europe and East Asia, while the fastest population growth rates are found in Sub-Saharan Africa. In fact, the roughly two billion difference in population between now and the 2080s' peak will be found in Sub-Saharan Africa, which will rise from 1.2 billion to 3.2 billion in this time (although populations in other continents will also fluctuate). Changing projections The United Nations releases their World Population Prospects report every 1-2 years, and this is widely considered the foremost demographic dataset in the world. However, recent years have seen a notable decline in projections when the global population will peak, and at what number. Previous reports in the 2010s had suggested a peak of over 11 billion people, and that population growth would continue into the 2100s, however a sooner and shorter peak is now projected. Reasons for this include a more rapid population decline in East Asia and Europe, particularly China, as well as a prolongued development arc in Sub-Saharan Africa.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
All cities with a population > 1000 or seats of adm div (ca 80.000)Sources and ContributionsSources : GeoNames is aggregating over hundred different data sources. Ambassadors : GeoNames Ambassadors help in many countries. Wiki : A wiki allows to view the data and quickly fix error and add missing places. Donations and Sponsoring : Costs for running GeoNames are covered by donations and sponsoring.Enrichment:add country name
This dataset was created by Walter Maffione
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the White Earth population over the last 20 plus years. It lists the population for each year, along with the year on year change in population, as well as the change in percentage terms for each year. The dataset can be utilized to understand the population change of White Earth across the last two decades. For example, using this dataset, we can identify if the population is declining or increasing. If there is a change, when the population peaked, or if it is still growing and has not reached its peak. We can also compare the trend with the overall trend of United States population over the same period of time.
Key observations
In 2023, the population of White Earth was 93, a 0% decrease year-by-year from 2022. Previously, in 2022, White Earth population was 93, a decline of 4.12% compared to a population of 97 in 2021. Over the last 20 plus years, between 2000 and 2023, population of White Earth increased by 28. In this period, the peak population was 99 in the year 2020. The numbers suggest that the population has already reached its peak and is showing a trend of decline. Source: U.S. Census Bureau Population Estimates Program (PEP).
When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).
Data Coverage:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for White Earth Population by Year. You can refer the same here
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Any work using this dataset should cite the following paper:
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Earth Hispanic or Latino population. It includes the distribution of the Hispanic or Latino population, of Earth, by their ancestries, as identified by the Census Bureau. The dataset can be utilized to understand the origin of the Hispanic or Latino population of Earth.
Key observations
Among the Hispanic population in Earth, regardless of the race, the largest group is of Mexican origin, with a population of 588 (94.84% of the total Hispanic population).
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Origin for Hispanic or Latino population include:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Earth Population by Race & Ethnicity. You can refer the same here
The total amount of data created, captured, copied, and consumed globally is forecast to increase rapidly, reaching 149 zettabytes in 2024. Over the next five years up to 2028, global data creation is projected to grow to more than 394 zettabytes. In 2020, the amount of data created and replicated reached a new high. The growth was higher than previously expected, caused by the increased demand due to the COVID-19 pandemic, as more people worked and learned from home and used home entertainment options more often. Storage capacity also growing Only a small percentage of this newly created data is kept though, as just two percent of the data produced and consumed in 2020 was saved and retained into 2021. In line with the strong growth of the data volume, the installed base of storage capacity is forecast to increase, growing at a compound annual growth rate of 19.2 percent over the forecast period from 2020 to 2025. In 2020, the installed base of storage capacity reached 6.7 zettabytes.
Patterns of educational attainment vary greatly across countries, and across population groups within countries. In some countries, virtually all children complete basic education whereas in others large groups fall short. The primary purpose of this database, and the associated research program, is to document and analyze these differences using a compilation of a variety of household-based data sets: Demographic and Health Surveys (DHS); Multiple Indicator Cluster Surveys (MICS); Living Standards Measurement Study Surveys (LSMS); as well as country-specific Integrated Household Surveys (IHS) such as Socio-Economic Surveys.As shown at the website associated with this database, there are dramatic differences in attainment by wealth. When households are ranked according to their wealth status (or more precisely, a proxy based on the assets owned by members of the household) there are striking differences in the attainment patterns of children from the richest 20 percent compared to the poorest 20 percent.In Mali in 2012 only 34 percent of 15 to 19 year olds in the poorest quintile have completed grade 1 whereas 80 percent of the richest quintile have done so. In many countries, for example Pakistan, Peru and Indonesia, almost all the children from the wealthiest households have completed at least one year of schooling. In some countries, like Mali and Pakistan, wealth gaps are evident from grade 1 on, in other countries, like Peru and Indonesia, wealth gaps emerge later in the school system.The EdAttain website allows a visual exploration of gaps in attainment and enrollment within and across countries, based on the international database which spans multiple years from over 120 countries and includes indicators disaggregated by wealth, gender and urban/rural location. The database underlying that site can be downloaded from here.
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
The World Bank is an international financial institution that provides loans to countries of the world for capital projects. The World Bank's stated goal is the reduction of poverty. Source: https://en.wikipedia.org/wiki/World_Bank
This dataset combines key health statistics from a variety of sources to provide a look at global health and population trends. It includes information on nutrition, reproductive health, education, immunization, and diseases from over 200 countries.
Update Frequency: Biannual
For more information, see the World Bank website.
Fork this kernel to get started with this dataset.
https://datacatalog.worldbank.org/dataset/health-nutrition-and-population-statistics
https://cloud.google.com/bigquery/public-data/world-bank-hnp
Dataset Source: World Bank. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Citation: The World Bank: Health Nutrition and Population Statistics
Banner Photo by @till_indeman from Unplash.
What’s the average age of first marriages for females around the world?
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books, has 1 rows. and is filtered where the book is World-wide issues : people, resources and the environment. It features 7 columns including book, author, publication date, language, and book publisher. The preview is ordered by publication date (descending).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects, has 4 rows. and is filtered where the books is Thoughts for developing world leaders and people. It features 10 columns including book subject, number of authors, number of books, earliest publication date, and latest publication date. The preview is ordered by number of books (descending).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Globe by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Globe. The dataset can be utilized to understand the population distribution of Globe by gender and age. For example, using this dataset, we can identify the largest age group for both Men and Women in Globe. Additionally, it can be used to see how the gender ratio changes from birth to senior most age group and male to female ratio across each age group for Globe.
Key observations
Largest age group (population): Male # 20-24 years (347) | Female # 50-54 years (433). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Age groups:
Scope of gender :
Please note that American Community Survey asks a question about the respondents current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are supposed to respond with the answer as either of Male or Female. Our research and this dataset mirrors the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Globe Population by Gender. You can refer the same here
How much time do people spend on social media? As of 2024, the average daily social media usage of internet users worldwide amounted to 143 minutes per day, down from 151 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of three hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just two hours and 16 minutes. Global social media usageCurrently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively. People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events friends. Global impact of social mediaSocial media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general. During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased a polarization in politics and heightened everyday distractions.
The global number of internet users in was forecast to continuously increase between 2024 and 2029 by in total 1.3 billion users (+23.66 percent). After the fifteenth consecutive increasing year, the number of users is estimated to reach 7 billion users and therefore a new peak in 2029. Notably, the number of internet users of was continuously increasing over the past years.Depicted is the estimated number of individuals in the country or region at hand, that use the internet. As the datasource clarifies, connection quality and usage frequency are distinct aspects, not taken into account here.The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).Find more key insights for the number of internet users in countries like the Americas and Asia.
Attribution-NoDerivs 4.0 (CC BY-ND 4.0)https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
PerCapita_CO2_Footprint_InDioceses_FULLBurhans, Molly A., Cheney, David M., Gerlt, R.. . “PerCapita_CO2_Footprint_InDioceses_FULL”. Scale not given. Version 1.0. MO and CT, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2019.MethodologyThis is the first global Carbon footprint of the Catholic population. We will continue to improve and develop these data with our research partners over the coming years. While it is helpful, it should also be viewed and used as a "beta" prototype that we and our research partners will build from and improve. The years of carbon data are (2010) and (2015 - SHOWN). The year of Catholic data is 2018. The year of population data is 2016. Care should be taken during future developments to harmonize the years used for catholic, population, and CO2 data.1. Zonal Statistics: Esri Population Data and Dioceses --> Population per dioceses, non Vatican based numbers2. Zonal Statistics: FFDAS and Dioceses and Population dataset --> Mean CO2 per Diocese3. Field Calculation: Population per Diocese and Mean CO2 per diocese --> CO2 per Capita4. Field Calculation: CO2 per Capita * Catholic Population --> Catholic Carbon FootprintAssumption: PerCapita CO2Deriving per-capita CO2 from mean CO2 in a geography assumes that people's footprint accounts for their personal lifestyle and involvement in local business and industries that are contribute CO2. Catholic CO2Assumes that Catholics and non-Catholic have similar CO2 footprints from their lifestyles.Derived from:A multiyear, global gridded fossil fuel CO2 emission data product: Evaluation and analysis of resultshttp://ffdas.rc.nau.edu/About.htmlRayner et al., JGR, 2010 - The is the first FFDAS paper describing the version 1.0 methods and results published in the Journal of Geophysical Research.Asefi et al., 2014 - This is the paper describing the methods and results of the FFDAS version 2.0 published in the Journal of Geophysical Research.Readme version 2.2 - A simple readme file to assist in using the 10 km x 10 km, hourly gridded Vulcan version 2.2 results.Liu et al., 2017 - A paper exploring the carbon cycle response to the 2015-2016 El Nino through the use of carbon cycle data assimilation with FFDAS as the boundary condition for FFCO2."S. Asefi‐Najafabady P. J. Rayner K. R. Gurney A. McRobert Y. Song K. Coltin J. Huang C. Elvidge K. BaughFirst published: 10 September 2014 https://doi.org/10.1002/2013JD021296 Cited by: 30Link to FFDAS data retrieval and visualization: http://hpcg.purdue.edu/FFDAS/index.phpAbstractHigh‐resolution, global quantification of fossil fuel CO2 emissions is emerging as a critical need in carbon cycle science and climate policy. We build upon a previously developed fossil fuel data assimilation system (FFDAS) for estimating global high‐resolution fossil fuel CO2 emissions. We have improved the underlying observationally based data sources, expanded the approach through treatment of separate emitting sectors including a new pointwise database of global power plants, and extended the results to cover a 1997 to 2010 time series at a spatial resolution of 0.1°. Long‐term trend analysis of the resulting global emissions shows subnational spatial structure in large active economies such as the United States, China, and India. These three countries, in particular, show different long‐term trends and exploration of the trends in nighttime lights, and population reveal a decoupling of population and emissions at the subnational level. Analysis of shorter‐term variations reveals the impact of the 2008–2009 global financial crisis with widespread negative emission anomalies across the U.S. and Europe. We have used a center of mass (CM) calculation as a compact metric to express the time evolution of spatial patterns in fossil fuel CO2 emissions. The global emission CM has moved toward the east and somewhat south between 1997 and 2010, driven by the increase in emissions in China and South Asia over this time period. Analysis at the level of individual countries reveals per capita CO2 emission migration in both Russia and India. The per capita emission CM holds potential as a way to succinctly analyze subnational shifts in carbon intensity over time. Uncertainties are generally lower than the previous version of FFDAS due mainly to an improved nightlight data set."Global Diocesan Boundaries:Burhans, M., Bell, J., Burhans, D., Carmichael, R., Cheney, D., Deaton, M., Emge, T. Gerlt, B., Grayson, J., Herries, J., Keegan, H., Skinner, A., Smith, M., Sousa, C., Trubetskoy, S. “Diocesean Boundaries of the Catholic Church” [Feature Layer]. Scale not given. Version 1.2. Redlands, CA, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2016.Using: ArcGIS. 10.4. Version 10.0. Redlands, CA: Environmental Systems Research Institute, Inc., 2016.Boundary ProvenanceStatistics and Leadership DataCheney, D.M. “Catholic Hierarchy of the World” [Database]. Date Updated: August 2019. Catholic Hierarchy. Using: Paradox. Retrieved from Original Source.Catholic HierarchyAnnuario Pontificio per l’Anno .. Città del Vaticano :Tipografia Poliglotta Vaticana, Multiple Years.The data for these maps was extracted from the gold standard of Church data, the Annuario Pontificio, published yearly by the Vatican. The collection and data development of the Vatican Statistics Office are unknown. GoodLands is not responsible for errors within this data. We encourage people to document and report errant information to us at data@good-lands.org or directly to the Vatican.Additional information about regular changes in bishops and sees comes from a variety of public diocesan and news announcements.GoodLands’ polygon data layers, version 2.0 for global ecclesiastical boundaries of the Roman Catholic Church:Although care has been taken to ensure the accuracy, completeness and reliability of the information provided, due to this being the first developed dataset of global ecclesiastical boundaries curated from many sources it may have a higher margin of error than established geopolitical administrative boundary maps. Boundaries need to be verified with appropriate Ecclesiastical Leadership. The current information is subject to change without notice. No parties involved with the creation of this data are liable for indirect, special or incidental damage resulting from, arising out of or in connection with the use of the information. We referenced 1960 sources to build our global datasets of ecclesiastical jurisdictions. Often, they were isolated images of dioceses, historical documents and information about parishes that were cross checked. These sources can be viewed here:https://docs.google.com/spreadsheets/d/11ANlH1S_aYJOyz4TtG0HHgz0OLxnOvXLHMt4FVOS85Q/edit#gid=0To learn more or contact us please visit: https://good-lands.org/Esri Gridded Population Data 2016DescriptionThis layer is a global estimate of human population for 2016. Esri created this estimate by modeling a footprint of where people live as a dasymetric settlement likelihood surface, and then assigned 2016 population estimates stored on polygons of the finest level of geography available onto the settlement surface. Where people live means where their homes are, as in where people sleep most of the time, and this is opposed to where they work. Another way to think of this estimate is a night-time estimate, as opposed to a day-time estimate.Knowledge of population distribution helps us understand how humans affect the natural world and how natural events such as storms and earthquakes, and other phenomena affect humans. This layer represents the footprint of where people live, and how many people live there.Dataset SummaryEach cell in this layer has an integer value with the estimated number of people likely to live in the geographic region represented by that cell. Esri additionally produced several additional layers World Population Estimate Confidence 2016: the confidence level (1-5) per cell for the probability of people being located and estimated correctly. World Population Density Estimate 2016: this layer is represented as population density in units of persons per square kilometer.World Settlement Score 2016: the dasymetric likelihood surface used to create this layer by apportioning population from census polygons to the settlement score raster.To use this layer in analysis, there are several properties or geoprocessing environment settings that should be used:Coordinate system: WGS_1984. This service and its underlying data are WGS_1984. We do this because projecting population count data actually will change the populations due to resampling and either collapsing or splitting cells to fit into another coordinate system. Cell Size: 0.0013474728 degrees (approximately 150-meters) at the equator. No Data: -1Bit Depth: 32-bit signedThis layer has query, identify, pixel, and export image functions enabled, and is restricted to a maximum analysis size of 30,000 x 30,000 pixels - an area about the size of Africa.Frye, C. et al., (2018). Using Classified and Unclassified Land Cover Data to Estimate the Footprint of Human Settlement. Data Science Journal. 17, p.20. DOI: http://doi.org/10.5334/dsj-2018-020.What can you do with this layer?This layer is unsuitable for mapping or cartographic use, and thus it does not include a convenient legend. Instead, this layer is useful for analysis, particularly for estimating counts of people living within watersheds, coastal areas, and other areas that do not have standard boundaries. Esri recommends using the Zonal Statistics tool or the Zonal Statistics to Table tool where you provide input zones as either polygons, or raster data, and the tool will summarize the count of population within those zones. https://www.esri.com/arcgis-blog/products/arcgis-living-atlas/data-management/2016-world-population-estimate-services-are-now-available/
The Global Trust dataset measures how much trust people around the world have in major institutions and social networks. It contains two data files, one with the raw survey data and one putting the raw data into percentages of trust in certain institutions. These can be analyzed in different ways. The data comes from surveys of over 119,088 people from 113 countries. Survey respondents were asked such things as “How much do you trust each of the following: other people in your neighborhood; your national government; scientists; journalists; doctors and nurses; people who work at non-governmental or non-profit organizations; healers? Do you trust them a lot, some, not much, or not at all?" The National Trust Codebook contains both the survey and the national rate codebook files, titled “Survey” and “Rate” respectively. Both files contain the same variables such as neighbors, government, and journalists, with the only difference being that “Survey” has id as a variable to account for the 119,088 unique responses. The Survey file has the raw data of all the 119,088 unique responses and both categorical and ordinal variables. It can be used to analyze how different countries feel about trust in different people or institutions as well as how those variables can relate to each other. The Rate file creates a percentage of how much people from each country trust certain communities or institutions and this can be used to analyze how different countries feel about certain things, this allows room to analyze each country with each other in a more clear way than the raw data. Both files are unique in the sense of the data being worldwide, it is a unique trait to be able to compare from different countries survey respondents that were asked the same questions with the same methodology, making comparison all the more easy. Another interesting element of this survey data is the number of responses per nation. There were, at minimum, 1000 responses gathered from each nation featured in the survey. The sample size allows for better than typical representation for each country.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This starter data kit collects extracts from global, open datasets relating to climate hazards and infrastructure systems.
These extracts are derived from global datasets which have been clipped to the national scale (or subnational, in cases where national boundaries have been split, generally to separate outlying islands or non-contiguous regions), using Natural Earth (2023) boundaries, and is not meant to express an opinion about borders, territory or sovereignty.
Human-induced climate change is increasing the frequency and severity of climate and weather extremes. This is causing widespread, adverse impacts to societies, economies and infrastructures. Climate risk analysis is essential to inform policy decisions aimed at reducing risk. Yet, access to data is often a barrier, particularly in low and middle-income countries. Data are often scattered, hard to find, in formats that are difficult to use or requiring considerable technical expertise. Nevertheless, there are global, open datasets which provide some information about climate hazards, society, infrastructure and the economy. This "data starter kit" aims to kickstart the process and act as a starting point for further model development and scenario analysis.
Hazards:
Exposure:
The spatial intersection of hazard and exposure datasets is a first step to analyse vulnerability and risk to infrastructure and people.
To learn more about related concepts, there is a free short course available through the Open University on Infrastructure and Climate Resilience. This overview of the course has more details.
These Python libraries may be a useful place to start analysis of the data in the packages produced by this workflow:
snkit
helps clean network data
nismod-snail
is designed to help implement infrastructure
exposure, damage and risk calculations
The open-gira
repository contains a larger workflow for global-scale open-data infrastructure risk and resilience analysis.
For a more developed example, some of these datasets were key inputs to a regional climate risk assessment of current and future flooding risks to transport networks in East Africa, which has a related online visualisation tool at https://east-africa.infrastructureresilience.org/ and is described in detail in Hickford et al (2023).
References
The dataset is a relational dataset of 8,000 households households, representing a sample of the population of an imaginary middle-income country. The dataset contains two data files: one with variables at the household level, the other one with variables at the individual level. It includes variables that are typically collected in population censuses (demography, education, occupation, dwelling characteristics, fertility, mortality, and migration) and in household surveys (household expenditure, anthropometric data for children, assets ownership). The data only includes ordinary households (no community households). The dataset was created using REaLTabFormer, a model that leverages deep learning methods. The dataset was created for the purpose of training and simulation and is not intended to be representative of any specific country.
The full-population dataset (with about 10 million individuals) is also distributed as open data.
The dataset is a synthetic dataset for an imaginary country. It was created to represent the population of this country by province (equivalent to admin1) and by urban/rural areas of residence.
Household, Individual
The dataset is a fully-synthetic dataset representative of the resident population of ordinary households for an imaginary middle-income country.
ssd
The sample size was set to 8,000 households. The fixed number of households to be selected from each enumeration area was set to 25. In a first stage, the number of enumeration areas to be selected in each stratum was calculated, proportional to the size of each stratum (stratification by geo_1 and urban/rural). Then 25 households were randomly selected within each enumeration area. The R script used to draw the sample is provided as an external resource.
other
The dataset is a synthetic dataset. Although the variables it contains are variables typically collected from sample surveys or population censuses, no questionnaire is available for this dataset. A "fake" questionnaire was however created for the sample dataset extracted from this dataset, to be used as training material.
The synthetic data generation process included a set of "validators" (consistency checks, based on which synthetic observation were assessed and rejected/replaced when needed). Also, some post-processing was applied to the data to result in the distributed data files.
This is a synthetic dataset; the "response rate" is 100%.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for CORONAVIRUS DEATHS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides population data for all 195 recognized countries over a span of 11 years (2014–2024). Each country's population is projected using a base population from 2014 and realistic annual growth rates ranging between 0.5% and 3%. This dataset is ideal for demographic studies, trend analysis, and data visualization.
The population figures are simulated but based on reasonable assumptions about growth rates to ensure a realistic representation.