100+ datasets found
  1. Geonames - All Cities with a population > 1000

    • public.opendatasoft.com
    • data.smartidf.services
    • +2 more
    csv, excel, geojson +1
    Updated Mar 10, 2024
    + more versions
    Cite
    (2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/
    Explore at:
    Available download formats: csv, json, geojson, excel
    Dataset updated
    Mar 10, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All cities with a population > 1000 or seats of adm div (ca 80.000).

    Sources and Contributions

    • Sources: GeoNames aggregates over a hundred different data sources.
    • Ambassadors: GeoNames Ambassadors help in many countries.
    • Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
    • Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.

    Enrichment: add country name

  2. Selfie with ID Dataset

    • kaggle.com
    Updated May 29, 2025
    + more versions
    Cite
    Unidata (2025). Selfie with ID Dataset [Dataset]. https://www.kaggle.com/datasets/unidpro/selfie-with-id
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    May 29, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Unidata
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Selfie Identity Dataset - 2 ID photo, 13 selfie

    The dataset contains 65,000+ photos of more than 5,000 people from 40 countries. It is a valuable resource for researchers and developers working on identity and biometric verification solutions, especially in areas like facial recognition and financial services.

    By utilizing this dataset, researchers can develop more robust re-identification algorithms, a key factor in ensuring privacy and security in various applications.

    Example of photos in the dataset

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F22059654%2F1014bc8e62e232cc2ecb28e7d8ccdc3c%2F.png?generation=1730863166146276&alt=media

    This dataset offers an opportunity to explore re-identification challenges by providing 13 selfies of each individual against diverse backgrounds and under different lighting, paired with 2 ID photos from different document types.

    💵 Buy the Dataset: This is a limited preview of the data. To access the full dataset, please contact us at https://unidata.pro to discuss your requirements and pricing options.

    Metadata for the dataset

    Devices: Samsung M31, Infinix note11, Tecno Pop 7, Samsung A05, iPhone 15 Pro Max, and others

    Resolution: 1000 x 750 and higher

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F22059654%2F0f1a70b3b5056e2610f22499cac19c7f%2FFrame%20136.png?generation=1730588713101089&alt=media

    This dataset enables the development of more robust and reliable authentication systems, ultimately contributing to enhancing customer onboarding experiences by streamlining verification processes, minimizing fraud, and improving overall security measures for a wide range of services, including online platforms, financial institutions, and government agencies.

    🌐 UniData provides high-quality datasets, content moderation, data collection and annotation for your AI/ML projects

  3. Global Data: GDP, Life Expectancy & More

    • kaggle.com
    Updated Oct 19, 2024
    Cite
    Arslaan Siddiqui (2024). Global Data: GDP, Life Expectancy & More [Dataset]. https://www.kaggle.com/datasets/arslaan5/global-data-gdp-life-expectancy-and-more/code
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Oct 19, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Arslaan Siddiqui
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Description

    Global Data: GDP, Life Expectancy & More

    This dataset comprises 204 entries and 38 attributes, providing a comprehensive analysis of key economic and social indicators across various countries. It includes a diverse range of metrics, allowing for in-depth exploration of global trends related to GDP, education, health, and environmental factors.

    Key Features:

    • GDP: Gross Domestic Product (in current US dollars), representing the total economic output of a country.
    • Sex Ratio: The ratio of males to females in the population, highlighting demographic trends.
    • Life Expectancy: Average lifespan for males and females, an essential indicator of healthcare quality.
    • Education Enrollment Rates: Data on primary, secondary, and post-secondary education enrollment for males and females, reflecting educational attainment.
    • Unemployment Rate: Percentage of the labor force that is unemployed, indicating economic health.
    • Homicide Rate: Number of homicides per 100,000 population, providing insight into safety and crime levels.
    • Urban Population Growth: Rate of growth in urban populations, illustrating migration trends.
    • CO2 Emissions: Carbon dioxide emissions per capita, an important measure of environmental impact.
    • Forested Area: Percentage of land covered by forests, indicating biodiversity and environmental health.
    • Tourist Numbers: Total number of international visitors, which can reflect a country's tourism potential.

    Applications and Uses:

    1. Research and Analysis: Ideal for researchers studying the correlation between economic performance and social indicators. This dataset can help identify trends and patterns relevant to global development.

    2. Policy Development: Policymakers can utilize this data to inform decisions on education, healthcare, and environmental policies, aiming to improve national outcomes.

    3. Machine Learning and Data Science: Data scientists can apply machine learning techniques to predict economic trends, analyze social impacts, or classify countries based on various indicators.

    4. Educational Purposes: Suitable for students and educators in fields like economics, sociology, and environmental science for practical data analysis exercises.

    5. Visualization Projects: Perfect for creating compelling visualizations that illustrate relationships between different metrics, aiding in public understanding and engagement.

    By leveraging this dataset, users can uncover insights into how different factors influence a country's development, making it a valuable resource for diverse applications across various fields.

  4. Africa - Population and Internet users statistics

    • kaggle.com
    Updated Dec 17, 2020
    Cite
    Ishmeet singh (2020). Africa - Population and Internet users statistics [Dataset]. https://www.kaggle.com/datasets/ishmeet/africa-population-and-internet-users-statistics
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Dec 17, 2020
    Dataset provided by
    Kaggle
    Authors
    Ishmeet singh
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Area covered
    Africa
    Description

    Context

    Africa - Population and Internet users statistics

    Content

    What's inside is more than just rows and columns. Make it easy for others to get started by describing how you acquired the data and what time period it represents, too.

    Acknowledgements

    Source: https://data.humdata.org/dataset/africa-population-and-internet-users-statistics Last updated at https://data.humdata.org/organization/openafrica : 2019-09-11

  5. Afrobarometer Survey 1 1999-2000, Merged 7 Country - Botswana, Lesotho,...

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Apr 27, 2021
    + more versions
    Cite
    Institute for Democracy in South Africa (IDASA) (2021). Afrobarometer Survey 1 1999-2000, Merged 7 Country - Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia, Zimbabwe [Dataset]. https://microdata.worldbank.org/index.php/catalog/889
    Explore at:
    Dataset updated
    Apr 27, 2021
    Dataset provided by
    Michigan State University (MSU)
    Ghana Centre for Democratic Development (CDD-Ghana)
    Institute for Democracy in South Africa (IDASA)
    Time period covered
    1999 - 2000
    Area covered
    Namibia, Africa, Zimbabwe, Zambia, Malawi, Botswana, South Africa, Lesotho
    Description

    Abstract

    Round 1 of the Afrobarometer survey was conducted from July 1999 through June 2001 in 12 African countries, to solicit public opinion on democracy, governance, markets, and national identity. The full 12-country dataset released was pieced together out of different projects: Round 1 of the Afrobarometer survey, the old Southern African Democracy Barometer, and similar surveys done in West and East Africa.

    The 7-country dataset is a subset of the Round 1 survey dataset, and consists of a combined dataset for the 7 Southern African countries surveyed with other African countries in Round 1, 1999-2000 (Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia and Zimbabwe). It is a useful dataset because, in contrast to the full 12-country Round 1 dataset, all countries in this dataset were surveyed with the identical questionnaire.

    Geographic coverage

    Botswana Lesotho Malawi Namibia South Africa Zambia Zimbabwe

    Analysis unit

    Basic units of analysis that the study investigates include: individuals and groups

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A new sample has to be drawn for each round of Afrobarometer surveys. Whereas the standard sample size for Round 3 surveys will be 1200 cases, a larger sample size will be required in societies that are extremely heterogeneous (such as South Africa and Nigeria), where the sample size will be increased to 2400. Other adaptations may be necessary within some countries to account for the varying quality of the census data or the availability of census maps.

    The sample is designed as a representative cross-section of all citizens of voting age in a given country. The goal is to give every adult citizen an equal and known chance of selection for interview. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible. A randomly selected sample of 1200 cases allows inferences to national adult populations with a margin of sampling error of no more than plus or minus 2.5 percent with a confidence level of 95 percent. If the sample size is increased to 2400, the confidence interval shrinks to plus or minus 2 percent.
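    The relationship between sample size and margin of error can be sketched as follows. This is a minimal illustration that assumes the textbook formula for a proportion under simple random sampling, with the worst-case proportion p = 0.5 and z = 1.96 for 95 percent confidence; the function name `margin_of_error` is hypothetical, and the formula ignores the clustering and stratification of the actual design.

```python
import math


def margin_of_error(n, z=1.96, p=0.5):
    """Half-width of a confidence interval for a proportion.

    Assumes simple random sampling (an assumption, not the survey's
    clustered design); p=0.5 is the worst case, z=1.96 is ~95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)


for n in (1200, 2400):
    # Report the margin in percentage points for each sample size.
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} percentage points")
```

    Under these assumptions the result for 2400 cases matches the 2 percent quoted above, while the result for 1200 cases comes out near 2.8 percent, somewhat above the quoted 2.5 percent, so the quoted figure may rest on different assumptions.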

    Sample Universe

    The sample universe for Afrobarometer surveys includes all citizens of voting age within the country. In other words, we exclude anyone who is not a citizen and anyone who has not attained this age (usually 18 years) on the day of the survey. Also excluded are areas determined to be either inaccessible or not relevant to the study, such as those experiencing armed conflict or natural disasters, as well as national parks and game reserves. As a matter of practice, we have also excluded people living in institutionalized settings, such as students in dormitories and persons in prisons or nursing homes.

    What to do about areas experiencing political unrest? On the one hand we want to include them because they are politically important. On the other hand, we want to avoid stretching out the fieldwork over many months while we wait for the situation to settle down. It was agreed at the 2002 Cape Town Planning Workshop that it is difficult to come up with a general rule that will fit all imaginable circumstances. We will therefore make judgments on a case-by-case basis on whether or not to proceed with fieldwork or to exclude or substitute areas of conflict. National Partners are requested to consult Core Partners on any major delays, exclusions or substitutions of this sort.

    Sample Design

    The sample design is a clustered, stratified, multi-stage, area probability sample.

    To repeat the main sampling principle, the objective of the design is to give every sample element (i.e. adult citizen) an equal and known chance of being chosen for inclusion in the sample. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible.

    In a series of stages, geographically defined sampling units of decreasing size are selected. To ensure that the sample is representative, the probability of selection at various stages is adjusted as follows:

    • The sample is stratified by key social characteristics in the population such as sub-national area (e.g. region/province) and residential locality (urban or rural). The area stratification reduces the likelihood that distinctive ethnic or language groups are left out of the sample. And the urban/rural stratification is a means to make sure that these localities are represented in their correct proportions.
    • Wherever possible, and always in the first stage of sampling, random sampling is conducted with probability proportionate to population size (PPPS). The purpose is to guarantee that larger (i.e., more populated) geographical units have a proportionally greater probability of being chosen into the sample.

    The sampling design has four stages:

    • A first stage to stratify and randomly select primary sampling units;
    • A second stage to randomly select sampling start-points;
    • A third stage to randomly choose households;
    • A final stage involving the random selection of individual respondents.

    We shall deal with each of these stages in turn.

    STAGE ONE: Selection of Primary Sampling Units (PSUs)

    The primary sampling units (PSUs) are the smallest, well-defined geographic units for which reliable population data are available. In most countries, these will be Census Enumeration Areas (or EAs). Most national census data and maps are broken down to the EA level. In the text that follows we will use the acronyms PSU and EA interchangeably because, when census data are employed, they refer to the same unit.

    We strongly recommend that NIs use official national census data as the sampling frame for Afrobarometer surveys. Where recent or reliable census data are not available, NIs are asked to inform the relevant Core Partner before they substitute any other demographic data. Where the census is out of date, NIs should consult a demographer to obtain the best possible estimates of population growth rates. These should be applied to the outdated census data in order to make projections of population figures for the year of the survey. It is important to bear in mind that population growth rates vary by area (region) and (especially) between rural and urban localities. Therefore, any projected census data should include adjustments to take such variations into account.

    Indeed, we urge NIs to establish collegial working relationships with professionals in the national census bureau, not only to obtain the most recent census data, projections, and maps, but to gain access to sampling expertise. NIs may even commission a census statistician to draw the sample to Afrobarometer specifications, provided that provision for this service has been made in the survey budget.

    Regardless of who draws the sample, the NIs should thoroughly acquaint themselves with the strengths and weaknesses of the available census data and the availability and quality of EA maps. The country and methodology reports should cite the exact census data used, its known shortcomings, if any, and any projections made from the data. At minimum, the NI must know the size of the population and the urban/rural population divide in each region in order to specify how to distribute population and PSUs in the first stage of sampling. National investigators should obtain this written data before they attempt to stratify the sample.

    Once this data is obtained, the sample population (either 1200 or 2400) should be stratified, first by area (region/province) and then by residential locality (urban or rural). In each case, the proportion of the sample in each locality in each region should be the same as its proportion in the national population as indicated by the updated census figures.

    Having stratified the sample, it is then possible to determine how many PSUs should be selected for the country as a whole, for each region, and for each urban or rural locality.

    The total number of PSUs to be selected for the whole country is determined by calculating the maximum degree of clustering of interviews one can accept in any PSU. Because PSUs (which are usually geographically small EAs) tend to be socially homogeneous, we do not want to select too many people in any one place. Thus, the Afrobarometer has established a standard of no more than 8 interviews per PSU. For a sample size of 1200, the sample must therefore contain 150 PSUs/EAs (1200 divided by 8). For a sample size of 2400, there must be 300 PSUs/EAs.

    These PSUs should then be allocated proportionally to the urban and rural localities within each regional stratum of the sample. Let's take a couple of examples from a country with a sample size of 1200. If the urban locality of Region X in this country constitutes 10 percent of the current national population, then the sample for this stratum should be 15 PSUs (calculated as 10 percent of 150 PSUs). If the rural population of Region Y constitutes 4 percent of the current national population, then the sample for this stratum should be 6 PSUs.
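    The PSU arithmetic in the two paragraphs above can be sketched in a few lines. This is only an illustration: the function names `total_psus` and `psus_for_stratum` are hypothetical, and rounding each stratum to the nearest whole PSU is an assumption, since the text does not say how fractional allocations are handled.

```python
MAX_INTERVIEWS_PER_PSU = 8  # standard quoted in the text


def total_psus(sample_size):
    """Number of PSUs needed so that no PSU holds more than 8 interviews."""
    return sample_size // MAX_INTERVIEWS_PER_PSU


def psus_for_stratum(sample_size, population_share_percent):
    """PSUs allocated to a stratum in proportion to its population share.

    Rounding to the nearest whole PSU is an assumption of this sketch.
    """
    return round(total_psus(sample_size) * population_share_percent / 100)


print(total_psus(1200))            # PSUs for the whole country
print(psus_for_stratum(1200, 10))  # urban Region X, 10 percent of population
print(psus_for_stratum(1200, 4))   # rural Region Y, 4 percent of population
```

    Because each stratum is rounded independently in this sketch, the stratum totals may not sum exactly to the national total and could need a manual adjustment.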

    The next step is to select particular PSUs/EAs using random methods. Using the above example of the rural localities in Region Y, let us say that you need to pick 6 sample EAs out of a census list that contains a total of 240 rural EAs in Region Y. But which 6? If the EAs created by the national census bureau are of equal or roughly equal population size, then selection is relatively straightforward. Just number all EAs consecutively, then make six selections using a table of random numbers. This procedure, known as simple random sampling (SRS), will

  6. Town And Country, MO Population Breakdown by Gender and Age Dataset: Male...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Neilsberg Research (2025). Town And Country, MO Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e20538d3-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Town and Country
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Town And Country by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Town And Country. The dataset can be utilized to understand the population distribution of Town And Country by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Town And Country. Additionally, it can be used to see how the gender ratio changes from birth to the oldest age group, and to compare the male-to-female ratio across age groups for Town And Country.

    Key observations

    Largest age group (population): Male # 60-64 years (538) | Female # 45-49 years (537). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender :

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: This column displays the age group for the Town And Country population analysis. Total expected values are 18 and are defined above in the age groups section.
    • Population (Male): The male population of Town And Country in each age group.
    • Population (Female): The female population of Town And Country in each age group.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Town And Country for each age group.
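    The Gender Ratio column can be reproduced from the two population columns. The sketch below is illustrative only: the function name `gender_ratio` and the row values are hypothetical and are not taken from the dataset.

```python
def gender_ratio(male, female):
    """Males per 100 females; returns None when there are no females."""
    if female == 0:
        return None
    return 100 * male / female


# Hypothetical rows: (age group, male population, female population).
rows = [
    ("Under 5 years", 250, 200),
    ("5 to 9 years", 300, 300),
]

for age_group, male, female in rows:
    print(age_group, gender_ratio(male, female))
```

    A ratio above 100 means more males than females in that age group, and a ratio below 100 means the opposite.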

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Town And Country Population by Gender. You can refer to the same here

  7. Data from: Coastal proximity of populations in 22 Pacific Island Countries...

    • nauru-data.sprep.org
    • pacificdata.org
    • +14 more
    pdf, xlsx
    Updated Feb 20, 2025
    + more versions
    Cite
    Pacific Data Hub (2025). Coastal proximity of populations in 22 Pacific Island Countries and Territories [Dataset]. https://nauru-data.sprep.org/dataset/coastal-proximity-populations-22-pacific-island-countries-and-territories
    Explore at:
    Available download formats: pdf(365706), xlsx(21290)
    Dataset updated
    Feb 20, 2025
    Dataset provided by
    Pacific Data Hub
    License

    Public Domain Mark 1.0: https://creativecommons.org/publicdomain/mark/1.0/
    License information was derived automatically

    Area covered
    Pacific Region
    Description

    A recently published paper, titled “Coastal proximity of populations in 22 Pacific Island Countries and Territories”, details the methodology used to undertake the analysis and presents the findings.

    Purpose

    • This analysis aims to estimate populations settled in coastal areas in 22 Pacific Island Countries and Territories (PICTs) using the data currently available. In addition to the coastal population estimates, the study compares the results obtained from the use of national population datasets (census) with those derived from the use of global population grids.
    • The accuracy and reliability of results derived from national and global datasets have been evaluated to identify the most suitable options to estimate the size and location of coastal populations in the region.

    A collaborative project between the Pacific Community (SPC), WorldFish and the University of Wollongong has produced the first detailed population estimates of people living close to the coast in the 22 Pacific Island Countries and Territories (PICTs).

  8. GOLD RESERVES by Country Dataset

    • tradingeconomics.com
    csv, excel, json, xml
    Updated May 26, 2017
    + more versions
    Cite
    TRADING ECONOMICS (2017). GOLD RESERVES by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/gold-reserves
    Explore at:
    Available download formats: excel, xml, csv, json
    Dataset updated
    May 26, 2017
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    World
    Description

    This dataset provides values for GOLD RESERVES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  9. World Bank: Education Data

    • kaggle.com
    zip
    Updated Mar 20, 2019
    Cite
    World Bank (2019). World Bank: Education Data [Dataset]. https://www.kaggle.com/datasets/theworldbank/world-bank-intl-education
    Explore at:
    Available download formats: zip(0 bytes)
    Dataset updated
    Mar 20, 2019
    Dataset provided by
    World Bank Group (http://www.worldbank.org/)
    World Bank (http://topics.nytimes.com/top/reference/timestopics/organizations/w/world_bank/index.html)
    Authors
    World Bank
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The World Bank is an international financial institution that provides loans to countries of the world for capital projects. The World Bank's stated goal is the reduction of poverty. Source: https://en.wikipedia.org/wiki/World_Bank

    Content

    This dataset combines key education statistics from a variety of sources to provide a look at global literacy, spending, and access.

    For more information, see the World Bank website.

    Fork this kernel to get started with this dataset.

    Acknowledgements

    https://bigquery.cloud.google.com/dataset/bigquery-public-data:world_bank_health_population

    http://data.worldbank.org/data-catalog/ed-stats

    https://cloud.google.com/bigquery/public-data/world-bank-education

    Citation: The World Bank: Education Statistics

    Dataset Source: World Bank. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    Banner Photo by @till_indeman from Unsplash.

    Inspiration

    Of total government spending, what percentage is spent on education?

  10. The Extended Global Lake area, Climate, and Population Dataset (GLCP)

    • catalog.data.gov
    • data.usgs.gov
    • +4 more
    Updated Aug 25, 2024
    + more versions
    Cite
    U.S. Geological Survey (2024). The Extended Global Lake area, Climate, and Population Dataset (GLCP) [Dataset]. https://catalog.data.gov/dataset/the-extended-global-lake-area-climate-and-population-dataset-glcp
    Explore at:
    Dataset updated
    Aug 25, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    A changing climate and increasing human population necessitate understanding global freshwater availability and temporal variability. To examine lake freshwater availability from local-to-global and monthly-to-decadal scales, we created the Global Lake area, Climate, and Population (GLCP) dataset, which contains annual lake surface area for 1.42 million lakes with paired annual basin-level climate and population data. Building off an existing data product infrastructure, the next generation of the GLCP includes monthly lake ice area, snow basin area, and more climate variables including specific humidity, longwave and shortwave radiation, as well as cloud cover. The new generation of the GLCP continues previous FAIR data efforts by expanding its scripting repository and maintaining unique relational keys for merging with external data products. Compared to the original version, the new GLCP contains an even richer suite of variables capable of addressing disparate analyses of lake water trends at wide spatial and temporal scales.

  11. Victoria 2 (the game) Economy Data

    • kaggle.com
    Updated Nov 18, 2023
    Cite
    Derrek Devon (2023). Victoria 2 (the game) Economy Data [Dataset]. https://www.kaggle.com/datasets/derrekdevon/victoria-2-the-game-economy-data/versions/1
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 18, 2023
    Dataset provided by
    Kaggle
    Authors
    Derrek Devon
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    I have been a fan of Paradox Interactive's Victoria 2 for a while now. This dataset is based on my most recent campaign playing as the small nation of Biafra in Western Africa. Using software I found on the web, I was able to extract much of the data; however, I really wish I had been able to get more. The game has loads of interesting data trapped in it. Hopefully, in the near future, software can be built to help me get that done.

    The data, I think, is fairly comprehensive. It maps out a 38-year period between 1993 and 2030, tracking each country's GDP, GDP per capita, unemployment rate, etc.

    Note: Keen observers will notice that 4 of the largest economies in the world seem to nose dive around the year 2023-2024. This is because, within the game, India nukes The United States, France, and Great Britain in a great war. All three countries retaliate with their own nukes, thereby reducing all 4 countries to economic obscurity within a matter of 5 years. It was indeed a scary thing to watch. Nearly 700 million people lost their lives due to the fallout.

    Edit: You will find a lot of zeros in the GDP data. This is not because those countries' GDP was actually 0. The vast majority of countries with 0 as their GDP simply did not exist officially that year. For instance, Ambazonia has many years of 0 GDP data because Ambazonia did not exist as a country in those years. Also, within the game there was never any country with a population of 0. Therefore, any country with a population of 0 in our dataset did not exist.
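
    The rule in the edit note (a population of 0 means the country did not exist that year) can be applied as a simple filter before any analysis. The following is only a sketch: the column names `country`, `year`, `population`, and `gdp` are assumptions, since the file's actual headers are not listed here, and the sample rows are made-up placeholders rather than values from the dataset.

```python
import csv
import io


def drop_nonexistent(rows, population_key="population"):
    """Keep only rows whose population is non-zero.

    Per the dataset note, a population of 0 marks a country that
    did not exist in that year, so its 0 GDP is not a real value.
    """
    return [row for row in rows if float(row[population_key]) != 0]


# Made-up placeholder rows; the column names are assumed, not confirmed.
sample = io.StringIO(
    "country,year,population,gdp\n"
    "Ambazonia,1993,0,0\n"
    "Ambazonia,2020,1200000,35.5\n"
)
rows = list(csv.DictReader(sample))
existing = drop_nonexistent(rows)
print(existing)
```

    Filtering on population rather than on GDP follows the note's own reasoning: a GDP of 0 only usually signals non-existence, while a population of 0 always does.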

  12. Climate Change: Earth Surface Temperature Data

    • kaggle.com
    zip
    Updated May 1, 2017
    Cite
    Berkeley Earth (2017). Climate Change: Earth Surface Temperature Data [Dataset]. https://www.kaggle.com/berkeleyearth/climate-change-earth-surface-temperature-data
    Explore at:
    Available download formats: zip (88843537 bytes)
    Dataset updated
    May 1, 2017
    Dataset authored and provided by
    Berkeley Earth (http://berkeleyearth.org/)
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Area covered
    Earth
    Description

    Some say climate change is the biggest threat of our age while others say it’s a myth based on dodgy science. We are turning some of the data over to you so you can form your own view.

    Even more than with other data sets that Kaggle has featured, there’s a huge amount of data cleaning and preparation that goes into putting together a long-time study of climate trends. Early data was collected by technicians using mercury thermometers, where any variation in the visit time impacted measurements. In the 1940s, the construction of airports caused many weather stations to be moved. In the 1980s, there was a move to electronic thermometers that are said to have a cooling bias.

    Given this complexity, there are a range of organizations that collate climate trends data. The three most cited land and ocean temperature data sets are NOAA’s MLOST, NASA’s GISTEMP and the UK’s HadCrut.

    We have repackaged the data from a newer compilation put together by Berkeley Earth, which is affiliated with Lawrence Berkeley National Laboratory. The Berkeley Earth Surface Temperature Study combines 1.6 billion temperature reports from 16 pre-existing archives. It is nicely packaged and allows for slicing into interesting subsets (for example by country). They publish the source data and the code for the transformations they applied. They also use methods that allow weather observations from shorter time series to be included, meaning fewer observations need to be thrown away.

    In this dataset, we have included several files:

    Global Land and Ocean-and-Land Temperatures (GlobalTemperatures.csv):

    • Date: starts in 1750 for average land temperature and 1850 for max and min land temperatures and global ocean and land temperatures
    • LandAverageTemperature: global average land temperature in Celsius
    • LandAverageTemperatureUncertainty: the 95% confidence interval around the average
    • LandMaxTemperature: global average maximum land temperature in Celsius
    • LandMaxTemperatureUncertainty: the 95% confidence interval around the maximum land temperature
    • LandMinTemperature: global average minimum land temperature in Celsius
    • LandMinTemperatureUncertainty: the 95% confidence interval around the minimum land temperature
    • LandAndOceanAverageTemperature: global average land and ocean temperature in Celsius
    • LandAndOceanAverageTemperatureUncertainty: the 95% confidence interval around the global average land and ocean temperature

    Other files include:

    • Global Average Land Temperature by Country (GlobalLandTemperaturesByCountry.csv)
    • Global Average Land Temperature by State (GlobalLandTemperaturesByState.csv)
    • Global Land Temperatures By Major City (GlobalLandTemperaturesByMajorCity.csv)
    • Global Land Temperatures By City (GlobalLandTemperaturesByCity.csv)

    The raw data comes from the Berkeley Earth data page.
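
    As a rough illustration of how the uncertainty columns in GlobalTemperatures.csv might be used, the sketch below turns each temperature and its uncertainty into a lower and upper bound. It assumes the uncertainty value is the half-width of the 95% interval (value ± uncertainty) and that the date column header is `Date`; the header in the actual file may differ, so it is a parameter. The sample rows are made-up placeholders, not real measurements.

```python
import csv
import io


def confidence_bounds(
    rows,
    value_key="LandAverageTemperature",
    uncertainty_key="LandAverageTemperatureUncertainty",
    date_key="Date",  # assumed header name; adjust to the real file
):
    """Return (date, lower, upper) tuples for rows with both values present.

    Assumes the uncertainty column is the half-width of the interval.
    """
    bounds = []
    for row in rows:
        value = row.get(value_key)
        uncertainty = row.get(uncertainty_key)
        if not value or not uncertainty:
            continue  # skip rows with missing measurements
        v, u = float(value), float(uncertainty)
        bounds.append((row[date_key], v - u, v + u))
    return bounds


# Made-up placeholder rows, including one with missing values.
sample = io.StringIO(
    "Date,LandAverageTemperature,LandAverageTemperatureUncertainty\n"
    "1750-01-01,3.0,0.5\n"
    "1750-02-01,,\n"
)
bounds = confidence_bounds(csv.DictReader(sample))
print(bounds)
```

    The same function can be pointed at the max, min, or land-and-ocean columns by passing the matching `value_key` and `uncertainty_key` names from the list above.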

  13. Native American Multi-Year Facial Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Native American Multi-Year Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-historical-native-american
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    United States
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Native American Multi-Year Facial Image Dataset, thoughtfully curated to support the development of advanced facial recognition systems, biometric identification models, KYC verification tools, and other computer vision applications. This dataset is ideal for training AI models to recognize individuals over time, track facial changes, and enhance age progression capabilities.

    Facial Image Data

    This dataset includes more than 5,000 high-quality facial images, organized into individual participant sets, each containing:

    Historical Images: 22 facial images per participant captured across a span of 10 years
    Enrollment Image: One recent high-resolution facial image for reference or ground truth

    Diversity & Representation

    Geographic Coverage: Participants from the USA, Canada, Mexico, and other Native American regions
    Demographics: Individuals aged 18 to 70 years, with a gender distribution of 60% male and 40% female
    File Formats: All images are available in JPEG and HEIC formats

    Image Quality & Capture Conditions

    To ensure model generalization and practical usability, images in this dataset reflect real-world diversity:

    Lighting Conditions: Images captured under various natural and artificial lighting setups
    Backgrounds: A wide range of indoor and outdoor backgrounds
    Device Quality: Captured using modern, high-resolution mobile devices for consistency and clarity

    Metadata

    Each participant’s dataset is accompanied by rich metadata to support advanced model training and analysis, including:

    Unique participant ID
    File name
    Age at the time of image capture
    Gender
    Country of origin
    Demographic profile
    File format

    Use Cases & Applications

    This dataset is highly valuable for a wide range of AI and computer vision applications:

    Facial Recognition Systems: Train models for high-accuracy face matching across time
    KYC & Identity Verification: Improve time-spanning verification for banks, insurance, and government services
    Biometric Security Solutions: Build reliable identity authentication models
    Age Progression & Estimation Models: Train AI to predict aging patterns or estimate age from facial features
    Generative AI: Support creation and validation of synthetic age progression or longitudinal face generation

    Secure & Ethical Collection

    Platform: All data was securely collected and processed through FutureBeeAI’s proprietary systems
    Ethical Compliance: Full participant consent obtained with transparent communication of use cases
    Privacy-Protected: No personally identifiable information is included; all data is anonymized and handled with care

    Dataset Updates & Customization

    To keep pace with evolving AI needs, this dataset is regularly updated and customizable. Custom data collection options include:


  14. Country Club Hills, MO Age Cohorts Dataset: Children, Working Adults, and...

    • neilsberg.com
    csv, json
    Updated Feb 22, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Country Club Hills, MO Age Cohorts Dataset: Children, Working Adults, and Seniors in Country Club Hills - Population and Percentage Analysis // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/country-club-hills-mo-population-by-age/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 22, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Country Club Hills, Missouri
    Variables measured
    Population Over 65 Years, Population Under 18 Years, Population Between 18 and 64 Years, Percent of Total Population for Age Groups
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the age cohorts. For age cohorts, we divided the population into three buckets: children (under the age of 18 years), working population (between 18 and 64 years), and senior population (over 65 years). For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Country Club Hills population by age cohorts (Children: Under 18 years; Working population: 18-64 years; Senior population: 65 years or more). It lists the population in each age cohort group along with its percentage relative to the total population of Country Club Hills. The dataset can be utilized to understand the population distribution across children, working population and senior population for dependency ratio, housing requirements, ageing, migration patterns etc.

    Key observations

    The largest age group was 18 to 64 years with a population of 633 (62.24% of the total population). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age cohorts:

    • Under 18 years
    • 18 to 64 years
    • 65 years and over

    Variables / Data Columns

    • Age Group: This column displays the age cohort for the Country Club Hills population analysis. Total expected values are 3 groups (Children, Working Population and Senior Population).
    • Population: The population for the age cohort in Country Club Hills is shown in the following column.
    • Percent of Total Population: The population as a percent of total population of the Country Club Hills is shown in the following column.
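
    The "Percent of Total Population" column is a cohort count divided by the total population. The sketch below shows that arithmetic using the one figure stated in this description (633 people, 62.24%). The total of 1,017 is not given in the source; it is back-calculated here from those two numbers and should be treated as an inference.

```python
def percent_of_total(count, total):
    """Return a cohort's share of the total population, in percent."""
    if total <= 0:
        raise ValueError("total population must be positive")
    return round(100 * count / total, 2)


# 633 and 62.24% come from the description; the total is inferred from them.
implied_total = round(633 / 0.6224)
share = percent_of_total(633, implied_total)
print(implied_total, share)
```

    Because the underlying figures are survey estimates with a margin of error, a back-calculated total like this is only a consistency check, not a published value.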

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyzes and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Country Club Hills Population by Age. You can refer to the same here.

  15. Bulgaria Number Dataset

    • listtodata.com
    .csv, .xls, .txt
    Updated Jul 17, 2025
    Cite
    List to Data (2025). Bulgaria Number Dataset [Dataset]. https://listtodata.com/bulgaria-dataset
    Explore at:
    Available download formats: .csv, .xls, .txt
    Dataset updated
    Jul 17, 2025
    Authors
    List to Data
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    Jan 1, 2025 - Dec 31, 2025
    Area covered
    Belgium, Bulgaria, United States
    Variables measured
    phone numbers, email address, full name, address, city, state, gender, age, income, IP address
    Description

    A Bulgaria number dataset can be a great element for direct marketing nationwide right now. Also, this Bulgaria number dataset has thousands of active mobile numbers that help to grow sales in the company. In fact, you can develop your business by getting many trustworthy B2C customers. Again, clients can send you a fast answer if they need it or not. Similarly, this Bulgaria number dataset is a very essential tool for telemarketing. In other words, you get all these 95% accurate number leads at a very cheap price from us. In addition, our List To Data website always follows the full GDPR laws strictly. As such, the return on investment (ROI) will provide you satisfaction from the business.

    Bulgaria phone data is a very strong contact database that you can get in your budget. Moreover, the Bulgaria phone data is very beneficial for fast business growth through direct marketing. Besides, our List To Data assures you that we give verified numbers at an affordable cost. Most importantly, you can say that it brings you more profit than your expense. Additionally, the Bulgaria phone data has all the details like name, age, gender, location, and business. Anyway, people can join with the most extensive group of customers quickly through it. Yet, people can use this numbers directory without any worry. So, buy it from us as our experts are ready to present the most satisfactory service.

    Bulgaria phone number list is very helpful for any business and marketing. People can use this Bulgaria phone number list to develop their telemarketing. They can efficiently contact consumers through direct calls or SMS. In other words, we collect it from authentic sites, so you should purchase our packages right now. Furthermore, you can believe this proper directory to maximize your company’s growth rapidly. Also, we deliver the Bulgaria phone number list in an Excel and CSV file. Actually, the country’s mobile number data will help you in obtaining more profit than investment. Likewise, the List To Data expert team is ready to help you 24 hours with any necessary details that can help any business. Indeed, buy this telemarketing lead at a very reasonable price to expand sales through B2C customers.

  16. Antidepressant Use in Scandinavia

    • kaggle.com
    Updated Jan 24, 2023
    Cite
    The Devastator (2023). Antidepressant Use in Scandinavia [Dataset]. https://www.kaggle.com/datasets/thedevastator/antidepressant-use-in-scandinavia
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 24, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Scandinavia
    Description

    Antidepressant Use in Scandinavia

    A Study of Population Characteristics and Drug Utilization Rates

    By [source]

    About this dataset

    This fascinating dataset examines the use of antidepressant medications among children and adolescents in Denmark, Norway, and Sweden from 2007 until 2017. Through a comprehensive exploration of drug usage along with population characteristics, we can uncover deeper insights into the prevalence of antidepressant use in this demographic and its potential causes. By carefully inspecting this dataset, which contains details about drug use, census data and associated drug names by code, we can shed light on an important issue with far-reaching implications for public health worldwide.

    More Datasets

    For more datasets, click here.

    Featured Notebooks

    • 🚨 Your notebook can be here! 🚨!

    How to use the dataset

    This dataset offers an opportunity to analyze antidepressant use among children and adolescents in Denmark, Norway and Sweden from 2007 to 2017. To get started with your analysis, you'll need to familiarize yourself with the dataset. Below are some simple steps for getting acquainted with the available resources:

    • Familiarize yourself with the column descriptions and data types. Each column contains meaningful information about drug use and population characteristics in the three countries during this window of time.
    • Review the drug_names file contained in this dataset for a detailed list of drugs associated with each code represented in the main table. This is particularly important because ATC (Anatomical Therapeutic Chemical) codes provide an easy shorthand way of referring to individual medications without being too long-winded or cluttering up columns not relevant to your particular question or hypothesis
    • Explore correlations between different parameters using crosstabs, scatterplots, or other common visualizations as necessary
    • Use census data contained in census_data file as a reference when discussing population makeup within any given country during this period

    With this approach, you will have all that's necessary to derive meaningful results out of this dataset! Good luck on your exploration!

    Research Ideas

    • Comparing the sex, age and population weights of those using different types of antidepressants in each country
    • Tracking consumption trends across countries and between genders over time
    • Correlating antidepressant use with national income indicators such as GDP per capita or overall Mental Health Index scores

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: census.csv

    | Column name | Description                               |
    |:------------|:------------------------------------------|
    | year        | Year of the data (Integer)                |
    | sex         | Gender of the population (String)         |
    | age         | Age group of the population (Integer)     |
    | cnt         | Number of people using the drug (Integer) |
    | country     | Country of the population (String)        |

    File: drug_names.csv

    | Column name | Description                                                       |
    |:------------|:------------------------------------------------------------------|
    | atc         | Anatomical Therapeutic Chemical (ATC) code for the drug. (String) |
    | formalname  | Formal name of the drug. (String)                                 |
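
    To get acquainted with census.csv, one option is to total the `cnt` column per country and year. The sketch below uses only the column names listed above (`year`, `sex`, `age`, `cnt`, `country`); the sample rows are made-up placeholders, and the exact encoding of `sex` and `age` in the real file is an assumption.

```python
import csv
import io
from collections import defaultdict


def total_by_country_year(rows):
    """Sum the cnt column for each (country, year) pair."""
    totals = defaultdict(int)
    for row in rows:
        key = (row["country"], int(row["year"]))
        totals[key] += int(row["cnt"])
    return dict(totals)


# Made-up placeholder rows using the documented column names.
sample = io.StringIO(
    "year,sex,age,cnt,country\n"
    "2007,F,10,100,Denmark\n"
    "2007,M,10,110,Denmark\n"
    "2008,F,10,95,Norway\n"
)
totals = total_by_country_year(csv.DictReader(sample))
print(totals)
```

    With the real file, the same function could be called on `csv.DictReader(open("census.csv", newline=""))`, assuming the file sits in the working directory.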


  17. LinkedIn Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Dec 17, 2021
    Cite
    Bright Data (2021). LinkedIn Datasets [Dataset]. https://brightdata.com/products/datasets/linkedin
    Explore at:
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Dec 17, 2021
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Unlock the full potential of LinkedIn data with our extensive dataset that combines profiles, company information, and job listings into one powerful resource for business decision-making, strategic hiring, competitive analysis, and market trend insights. This all-encompassing dataset is ideal for professionals, recruiters, analysts, and marketers aiming to enhance their strategies and operations across various business functions.

    Dataset Features

    • Profiles: Dive into detailed public profiles featuring names, titles, positions, experience, education, skills, and more. Utilize this data for talent sourcing, lead generation, and investment signaling, with a refresh rate ensuring up to 30 million records per month.
    • Companies: Access comprehensive company data including ID, country, industry, size, number of followers, website details, subsidiaries, and posts. Tailored subsets by industry or region provide invaluable insights for CRM enrichment, competitive intelligence, and understanding the startup ecosystem, updated monthly with up to 40 million records.
    • Job Listings: Explore current job opportunities detailed with job titles, company names, locations, and employment specifics such as seniority levels and employment functions. This dataset includes direct application links and real-time application numbers, serving as a crucial tool for job seekers and analysts looking to understand industry trends and the job market dynamics.

    Customizable Subsets for Specific Needs

    Our LinkedIn dataset offers the flexibility to tailor the dataset according to your specific business requirements. Whether you need comprehensive insights across all data points or are focused on specific segments like job listings, company profiles, or individual professional details, we can customize the dataset to match your needs. This modular approach ensures that you get only the data that is most relevant to your objectives, maximizing efficiency and relevance in your strategic applications.

    Popular Use Cases

    • Strategic Hiring and Recruiting: Track talent movement, identify growth opportunities, and enhance your recruiting efforts with targeted data.
    • Market Analysis and Competitive Intelligence: Gain a competitive edge by analyzing company growth, industry trends, and strategic opportunities.
    • Lead Generation and CRM Enrichment: Enrich your database with up-to-date company and professional data for targeted marketing and sales strategies.
    • Job Market Insights and Trends: Leverage detailed job listings for a nuanced understanding of employment trends and opportunities, facilitating effective job matching and market analysis.
    • AI-Driven Predictive Analytics: Utilize AI algorithms to analyze large datasets for predicting industry shifts, optimizing business operations, and enhancing decision-making processes based on actionable data insights.

    Whether you are mapping out competitive landscapes, sourcing new talent, or analyzing job market trends, our LinkedIn dataset provides the tools you need to succeed. Customize your access to fit specific needs, ensuring that you have the most relevant and timely data at your fingertips.

  18. Data for: The Bystander Affect Detection (BAD) Dataset for Failure Detection...

    • data.qdr.syr.edu
    pdf, tsv, txt, zip
    Updated Sep 25, 2023
    Cite
    Alexandra Bremers; Xuanyu Fang; Natalie Friedman; Wendy Ju (2023). Data for: The Bystander Affect Detection (BAD) Dataset for Failure Detection in HRI [Dataset]. http://doi.org/10.5064/F6TAWBGS
    Explore at:
    Available download formats: zip(66872585), zip(67359564), zip(49981372), zip(45063165), zip(35942055), tsv(5431), zip(63732190), zip(32108293), zip(33064251), zip(49848937), zip(38858151), zip(137880775), zip(90804192), zip(36477139), zip(38068214), zip(36039067), zip(37592931), zip(34234760), zip(63445623), zip(38092264), zip(45582594), zip(50915158), zip(111033502), zip(32955394), zip(30549219), zip(39991378), zip(166237686), zip(50351519), zip(62744513), zip(46810648), zip(34379478), zip(35492684), zip(22036189), pdf(197935), zip(66187509), zip(40085473), zip(40798037), pdf(113804), zip(12931695), zip(31593404), zip(26677367), zip(35547615), tsv(244631), zip(35954889), txt(7329), zip(74593629), zip(52574377), zip(55483165), zip(31323914), zip(43519637), zip(42743107), zip(55790691), zip(50499507), zip(76761027), zip(38063092), zip(55654900), zip(30504764), zip(48203736), zip(40422817)
    Dataset updated
    Sep 25, 2023
    Dataset provided by
    Qualitative Data Repository
    Authors
    Alexandra Bremers; Xuanyu Fang; Natalie Friedman; Wendy Ju
    License

    https://qdr.syr.edu/policies/qdr-restricted-access-conditions

    Description

    Project Overview

    For a robot to repair its own error, it must first know it has made a mistake. One way that people detect errors is from the implicit reactions from bystanders – their confusion, smirks, or giggles clue us in that something unexpected occurred. To enable robots to detect and act on bystander responses to task failures, we developed a novel method to elicit bystander responses to human and robot errors.

    Data Overview

    This project introduces the Bystander Affect Detection (BAD) dataset – a dataset of videos of bystander reactions to videos of failures. This dataset includes 2,452 human reactions to failure, collected in contexts that approximate “in-the-wild” data collection – including natural variances in webcam quality, lighting, and background. The BAD dataset may be requested for use in related research projects. As the dataset contains facial video data of participants, access can be requested along with the presentation of a research protocol and data use agreement that protects participants.

    Data Collection Overview and Access Conditions

    Using 46 different stimulus videos featuring a variety of human and machine task failures, we collected a total of 2,452 webcam videos of human reactions from 54 participants. Recruitment happened through the online behavioral research platform Prolific (https://www.prolific.co/about), where the options were selected to recruit a gender-balanced sample across all countries available. Participants had to use a laptop or desktop. Compensation was set at the Prolific rate of $12/hr, which came down to about $8 per participant for about 40 minutes of participation. Participants agreed that their data can be shared for future research projects and the data were approved to be shared publicly by IRB review.

    However, considering the fact that this is a machine-learning dataset containing identifiable crowdsourced human subjects data, the research team has decided that potential secondary users of the data must meet the following criteria for the access request to be granted:

    1. Agreement to three usage terms:
       - I will not redistribute the contents of the BAD Dataset
       - I will not use videos for purposes outside of human interaction research (broadly defined as any project that aims to study or develop improvements to human interactions with technology to result in a better user experience)
       - I will not use the videos to identify, defame, or otherwise negatively impact the health, welfare, employment or reputation of human participants
    2. A description of what you want to use the BAD dataset for, indicating any applicable human subjects protection measures that are in place. (For instance, "Me and my fellow researchers at University of X, lab of Y, will use the BAD dataset to train a model to detect when our Nao robot interrupts people at awkward times. The PI is Professor Z. Our protocol was approved under IRB #.")
    3. A copy of the IRB record or ethics approval document, confirming the research protocol and institutional approval.

    Data Analysis

    To test the viability of the collected data, we used the Bystander Reaction Dataset as input to a deep-learning model, BADNet, to predict failure occurrence. We tested different data labeling methods and learned how they affect model performance, achieving precisions above 90%.

    Shared Data Organization

    This data project consists of 54 zipped folders of recorded video data organized by participant, totaling 2,452 videos. The accompanying documentation includes a file containing the text of the consent form used for the research project, an inventory of the stimulus videos used, aggregate survey data, this data narrative, and an administrative readme file.

    Special Notes

    The data were approved to be shared publicly by IRB review. However, considering the fact that this is a machine-learning dataset containing identifiable crowdsourced human subjects data, the research team has decided that potential secondary users of the data must meet specific criteria before they qualify for access. Please consult the Terms tab below for more details and follow the instructions there if interested in requesting access.

  19. Global Financial Inclusion (Global Findex) Database 2011 - Afghanistan

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +1more
    Updated Apr 15, 2015
    + more versions
    Cite
    Development Research Group, Finance and Private Sector Development Unit (2015). Global Financial Inclusion (Global Findex) Database 2011 - Afghanistan [Dataset]. https://microdata.worldbank.org/index.php/catalog/1117
    Explore at:
    Dataset updated
    Apr 15, 2015
    Dataset authored and provided by
    Development Research Group, Finance and Private Sector Development Unit
    Time period covered
    2011
    Area covered
    Afghanistan
    Description

    Abstract

    Well-functioning financial systems serve a vital purpose, offering savings, credit, payment, and risk management products to people with a wide range of needs. Yet until now little had been known about the global reach of the financial sector - the extent of financial inclusion and the degree to which such groups as the poor, women, and youth are excluded from formal financial systems. Systematic indicators of the use of different financial services had been lacking for most economies.

    The Global Financial Inclusion (Global Findex) database provides such indicators. This database contains the first round of Global Findex indicators, measuring how adults in more than 140 economies save, borrow, make payments, and manage risk. The data set can be used to track the effects of financial inclusion policies globally and develop a deeper and more nuanced understanding of how people around the world manage their day-to-day finances. By making it possible to identify segments of the population excluded from the formal financial sector, the data can help policy makers prioritize reforms and design new policies.

    Geographic coverage

    National Coverage.

    Analysis unit

    Individual

    Universe

    The target population is the civilian, non-institutionalized population 15 years and above.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The Global Findex indicators are drawn from survey data collected by Gallup, Inc. over the 2011 calendar year, covering more than 150,000 adults in 148 economies and representing about 97 percent of the world's population. Since 2005, Gallup has surveyed adults annually around the world, using a uniform methodology and randomly selected, nationally representative samples. The second round of Global Findex indicators was collected in 2014 and is forthcoming in 2015. The set of indicators will be collected again in 2017.

    Surveys were conducted face-to-face in economies where landline telephone penetration is less than 80 percent, or where face-to-face interviewing is customary. The first stage of sampling is the identification of primary sampling units, consisting of clusters of households. The primary sampling units are stratified by population size, geography, or both, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. If an interview cannot be obtained at the initial sampled household, a simple substitution method is used. Respondents are randomly selected within the selected households by means of the Kish grid.
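The description above says primary sampling units are selected with probabilities proportional to population size where population information is available. A minimal Python sketch of one common way to do this, systematic PPS selection, is given below. The function name, the inputs, and the choice of the systematic variant are assumptions for illustration; the source does not state which PPS procedure Gallup uses.

```python
import random
from itertools import accumulate


def pps_systematic_sample(populations, n, rng=None):
    """Pick n unit indices with probability proportional to size.

    Hypothetical helper: lays the units end to end on a line whose length
    is the total population, then takes n equally spaced points from a
    random start. A unit larger than the spacing can be picked twice.
    """
    rng = rng or random.Random()
    total = sum(populations)
    if n <= 0 or total <= 0:
        raise ValueError("need a positive sample size and total population")
    step = total / n
    start = rng.random() * step          # random start in [0, step)
    cumulative = list(accumulate(populations))
    chosen = []
    idx = 0
    for k in range(n):
        target = start + k * step
        # Advance to the unit whose cumulative range contains the target.
        while idx < len(cumulative) - 1 and cumulative[idx] <= target:
            idx += 1
        chosen.append(idx)
    return chosen


# Illustrative cluster sizes, not taken from the survey.
sample = pps_systematic_sample([100, 200, 700], n=2, rng=random.Random(0))
```

When no population figures are available, the description falls back to simple random sampling, which in Python could be done with `random.sample` over the list of units.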

    Surveys were conducted by telephone in economies where landline telephone penetration is over 80 percent. The telephone surveys were conducted using random digit dialing or a nationally representative list of phone numbers. In selected countries where cell phone penetration is high, a dual sampling frame is used. Random respondent selection is achieved by using either the latest birthday or Kish grid method. At least three attempts are made to reach a person in each household, spread over different days and times of year.

    The sample size in Afghanistan was 1,000 individuals. Gender-matched sampling was used during the final stage of selection.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaire was designed by the World Bank, in conjunction with a Technical Advisory Board composed of leading academics, practitioners, and policy makers in the field of financial inclusion. The Bill and Melinda Gates Foundation and Gallup, Inc. also provided valuable input. The questionnaire was piloted in over 20 countries using focus groups, cognitive interviews, and field testing. The questionnaire is available in 142 languages upon request.

    Questions on insurance, mobile payments, and loan purposes were asked only in developing economies. The indicators on awareness and use of microfinance institutions (MFIs) are not included in the public dataset. However, adults who report saving at an MFI are considered to have an account; this is reflected in the composite account indicator.

    Sampling error estimates

    Estimates of standard errors (which account for sampling error) vary by country and indicator. For country- and indicator-specific standard errors, refer to the Annex and Country Table in Demirguc-Kunt, Asli and L. Klapper. 2012. "Measuring Financial Inclusion: The Global Findex." Policy Research Working Paper 6025, World Bank, Washington, D.C.

  20. w

    Global Financial Inclusion (Global Findex) Database 2021 - France

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Dec 16, 2022
    Cite
    Development Research Group, Finance and Private Sector Development Unit (2022). Global Financial Inclusion (Global Findex) Database 2021 - France [Dataset]. https://microdata.worldbank.org/index.php/catalog/4642
    Dataset updated
    Dec 16, 2022
    Dataset authored and provided by
    Development Research Group, Finance and Private Sector Development Unit
    Time period covered
    2021
    Area covered
    France
    Description

    Abstract

    The fourth edition of the Global Findex offers a lens into how people accessed and used financial services during the COVID-19 pandemic, when mobility restrictions and health policies drove increased demand for digital services of all kinds.

    The Global Findex is the world's most comprehensive database on financial inclusion. It is also the only global demand-side data source allowing for global and regional cross-country analysis to provide a rigorous and multidimensional picture of how adults save, borrow, make payments, and manage financial risks. Global Findex 2021 data were collected from nationally representative surveys of about 128,000 adults in more than 120 economies. The latest edition follows the 2011, 2014, and 2017 editions, and it includes a number of new series measuring financial health and resilience and contains more granular data on digital payment adoption, including merchant and government payments.

    The Global Findex is an indispensable resource for financial service practitioners, policy makers, researchers, and development professionals.

    Geographic coverage

    National coverage

    Analysis unit

    Individual

    Kind of data

    Observation data/ratings [obs]

    Sampling procedure

    In most developing economies, Global Findex data have traditionally been collected through face-to-face interviews. Surveys are conducted face-to-face in economies where telephone coverage represents less than 80 percent of the population or where in-person surveying is the customary methodology. However, because of ongoing COVID-19 related mobility restrictions, face-to-face interviewing was not possible in some of these economies in 2021. Phone-based surveys were therefore conducted in 67 economies that had been surveyed face-to-face in 2017. These 67 economies were selected for inclusion based on population size, phone penetration rate, COVID-19 infection rates, and the feasibility of executing phone-based methods where Gallup would otherwise conduct face-to-face data collection, while complying with all government-issued guidance throughout the interviewing process. Gallup takes both mobile phone and landline ownership into consideration. According to Gallup World Poll 2019 data, when face-to-face surveys were last carried out in these economies, at least 80 percent of adults in almost all of them reported mobile phone ownership. All samples are probability-based and nationally representative of the resident adult population. Phone surveys were not a viable option in 17 economies that had been part of previous Global Findex surveys, however, because of low mobile phone ownership and surveying restrictions. Data for these economies will be collected in 2022 and released in 2023.

    In economies where face-to-face surveys are conducted, the first stage of sampling is the identification of primary sampling units. These units are stratified by population size, geography, or both, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. To increase the probability of contact and completion, attempts are made at different times of the day and, where possible, on different days. If an interview cannot be obtained at the initial sampled household, a simple substitution method is used. Respondents are randomly selected within the selected households. Each eligible household member is listed, and the hand-held survey device randomly selects the household member to be interviewed. For paper surveys, the Kish grid method is used to select the respondent. In economies where cultural restrictions dictate gender matching, respondents are randomly selected from among all eligible adults of the interviewer's gender.

    In traditionally phone-based economies, respondent selection follows the same procedure as in previous years, using random digit dialing or a nationally representative list of phone numbers. In most economies where mobile phone and landline penetration is high, a dual sampling frame is used.

    The same respondent selection procedure is applied to the new phone-based economies. Dual frame (landline and mobile phone) random digit dialing is used where landline presence and use are 20 percent or higher based on historical Gallup estimates. Mobile phone random digit dialing is used in economies with limited to no landline presence (less than 20 percent).

    For landline respondents in economies where mobile phone or landline penetration is 80 percent or higher, random selection of respondents is achieved by using either the latest birthday or household enumeration method. For mobile phone respondents in these economies or in economies where mobile phone or landline penetration is less than 80 percent, no further selection is performed. At least three attempts are made to reach a person in each household, spread over different days and times of day.
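The frame rule stated above for the new phone-based economies (dual frame at 20 percent landline presence or higher, mobile-only below that) can be written as a small Python sketch. The function name, the return labels, and the use of a 0 to 1 share as input are assumptions for illustration, not part of the Global Findex documentation.

```python
def choose_phone_frame(landline_share):
    """Return the sampling frame for a newly phone-based economy.

    Hypothetical helper encoding the 20 percent rule from the text:
    landline_share is the estimated share of landline presence and use,
    expressed as a fraction between 0 and 1.
    """
    if not 0.0 <= landline_share <= 1.0:
        raise ValueError("landline_share must be between 0 and 1")
    if landline_share >= 0.20:
        return "dual_frame_rdd"   # landline and mobile random digit dialing
    return "mobile_rdd"           # mobile-only random digit dialing


# Illustrative shares, not taken from the survey.
frames = [choose_phone_frame(s) for s in (0.05, 0.20, 0.65)]
```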

    Sample size for France is 1000.

    Mode of data collection

    Landline and mobile telephone

    Research instrument

    Questionnaires are available on the website.

    Sampling error estimates

    Estimates of standard errors (which account for sampling error) vary by country and indicator. For country-specific margins of error, please refer to the Methodology section and corresponding table in Demirgüç-Kunt, Asli, Leora Klapper, Dorothe Singer, Saniya Ansar. 2022. The Global Findex Database 2021: Financial Inclusion, Digital Payments, and Resilience in the Age of COVID-19. Washington, DC: World Bank.

Cite
(2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/

Geonames - All Cities with a population > 1000

Explore at:
16 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: csv, json, geojson, excel
Dataset updated
Mar 10, 2024
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

All cities with a population > 1000 or seats of adm div (ca 80.000)

Sources and Contributions
Sources: GeoNames is aggregating over hundred different data sources.
Ambassadors: GeoNames Ambassadors help in many countries.
Wiki: A wiki allows to view the data and quickly fix error and add missing places.
Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.

Enrichment: add country name
