The United States Census Bureau’s international dataset provides estimates of country populations since 1950 and projections through 2050. Specifically, the dataset includes midyear population figures broken down by age and gender assignment at birth. Additionally, time-series data is provided for attributes including fertility rates, birth rates, death rates, and migration rates.
You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.census_bureau_international.
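As a rough illustration of the client-library workflow mentioned above, the sketch below builds a simple query against the mortality_life_expectancy table and runs it with the google-cloud-bigquery package. The helper names (top_life_expectancy_sql, run_query) are hypothetical, and it assumes the package is installed and credentials are already configured for the session.

```python
# Minimal sketch, not an official recipe: helper names are illustrative only.
TABLE = "bigquery-public-data.census_bureau_international.mortality_life_expectancy"


def top_life_expectancy_sql(year: int = 2016, limit: int = 10) -> str:
    """Build a query for the countries with the highest life expectancy in a year."""
    # int() keeps the interpolated values numeric.
    return (
        "SELECT country_name, life_expectancy\n"
        f"FROM `{TABLE}`\n"
        f"WHERE year = {int(year)}\n"
        "ORDER BY life_expectancy DESC\n"
        f"LIMIT {int(limit)}"
    )


def run_query(sql: str) -> list:
    """Run a read-only query and return the rows as plain dictionaries."""
    # Imported here so the SQL helper above works without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default credentials are available
    return [dict(row.items()) for row in client.query(sql).result()]


if __name__ == "__main__":
    print(top_life_expectancy_sql(2016, 5))
```

Calling run_query(top_life_expectancy_sql()) would send the query to BigQuery; printing the SQL first is a cheap way to check it before spending quota.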
What countries have the longest life expectancy? In this query, 2016 census information is retrieved by joining the mortality_life_expectancy and country_names_area tables for countries larger than 25,000 km2. Without the size constraint, Monaco is the top result with an average life expectancy of over 89 years!
SELECT
age.country_name,
age.life_expectancy,
size.country_area
FROM (
SELECT
country_name,
life_expectancy
FROM
`bigquery-public-data.census_bureau_international.mortality_life_expectancy`
WHERE
year = 2016) age
INNER JOIN (
SELECT
country_name,
country_area
FROM
`bigquery-public-data.census_bureau_international.country_names_area`
WHERE
country_area > 25000) size
ON
age.country_name = size.country_name
ORDER BY
2 DESC
/* Limit removed for Data Studio Visualization */
LIMIT
10
Which countries have the largest proportion of their population under 25? Over 40% of the world’s population is under 25 and greater than 50% of the world’s population is under 30! This query retrieves the countries with the largest proportion of young people by joining the age-specific population table with the midyear (total) population table.
SELECT
age.country_name,
SUM(age.population) AS under_25,
pop.midyear_population AS total,
ROUND((SUM(age.population) / pop.midyear_population) * 100,2) AS pct_under_25
FROM (
SELECT
country_name,
population,
country_code
FROM
`bigquery-public-data.census_bureau_international.midyear_population_agespecific`
WHERE
year = 2017
AND age < 25) age
INNER JOIN (
SELECT
midyear_population,
country_code
FROM
`bigquery-public-data.census_bureau_international.midyear_population`
WHERE
year = 2017) pop
ON
age.country_code = pop.country_code
GROUP BY
1,
3
ORDER BY
4 DESC /* Remove limit for visualization*/
LIMIT
10
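To make the join-and-aggregate logic of this query easier to experiment with offline, here is a small sketch that reproduces it with Python's built-in sqlite3 module. SQLite is only a stand-in for BigQuery, the function name pct_under_25 is hypothetical, and the two countries and their figures are made-up toy rows, not values from the dataset.

```python
import sqlite3

# Toy rows (invented for illustration): code, name, year, age, population
AGE_ROWS = [
    ("AA", "Aland", 2017, 10, 300),
    ("AA", "Aland", 2017, 20, 300),
    ("AA", "Aland", 2017, 40, 400),
    ("BB", "Bland", 2017, 5, 100),
    ("BB", "Bland", 2017, 30, 900),
]
# Toy rows: code, year, total midyear population
TOTAL_ROWS = [("AA", 2017, 1000), ("BB", 2017, 1000)]


def pct_under_25(age_rows, total_rows, year=2017):
    """Return (country, under_25, total, pct_under_25) sorted by share, descending."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE midyear_population_agespecific "
        "(country_code TEXT, country_name TEXT, year INTEGER, age INTEGER, population INTEGER)"
    )
    conn.execute(
        "CREATE TABLE midyear_population "
        "(country_code TEXT, year INTEGER, midyear_population INTEGER)"
    )
    conn.executemany("INSERT INTO midyear_population_agespecific VALUES (?, ?, ?, ?, ?)", age_rows)
    conn.executemany("INSERT INTO midyear_population VALUES (?, ?, ?)", total_rows)
    rows = conn.execute(
        """
        SELECT a.country_name,
               SUM(a.population) AS under_25,
               p.midyear_population AS total,
               ROUND(SUM(a.population) * 100.0 / p.midyear_population, 2) AS pct_under_25
        FROM midyear_population_agespecific AS a
        JOIN midyear_population AS p
          ON a.country_code = p.country_code AND a.year = p.year
        WHERE a.year = ? AND a.age < 25
        GROUP BY a.country_name, p.midyear_population
        ORDER BY pct_under_25 DESC
        """,
        (year,),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in pct_under_25(AGE_ROWS, TOTAL_ROWS):
        print(row)
```

The only structural difference from the query above is that the year filter is applied in the join condition as well, so that age-specific rows are never matched against a total from another year.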
The International Census dataset contains growth information in the form of birth rates, death rates, and migration rates. Net migration is the net number of migrants per 1,000 population, an important component of total population and one that often drives the work of the United Nations Refugee Agency. This query joins the growth rate table with the area table to retrieve 2017 data for countries greater than 500 km2.
SELECT
growth.country_name,
growth.net_migration,
CAST(area.country_area AS INT64) AS country_area
FROM (
SELECT
country_name,
net_migration,
country_code
FROM
`bigquery-public-data.census_bureau_international.birth_death_growth_rates`
WHERE
year = 2017) growth
INNER JOIN (
SELECT
country_area,
country_code
FROM
`bigquery-public-data.census_bureau_international.country_names_area`
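Because net migration is expressed per 1,000 population, it can help to convert a rate back into an approximate head count. The sketch below shows that arithmetic; the function name net_migrants is hypothetical and the inputs in the usage lines are invented numbers, not dataset values.

```python
def net_migrants(rate_per_1000: float, population: float) -> float:
    """Convert a net migration rate (migrants per 1,000 people) to a head count.

    A positive result means more people arrived than left; negative means the reverse.
    """
    return rate_per_1000 * population / 1000


if __name__ == "__main__":
    # Invented inputs: a rate of 2.5 per 1,000 in a population of 1,000,000.
    print(net_migrants(2.5, 1_000_000))
    # Invented inputs: a rate of -4 per 1,000 in a population of 500,000.
    print(net_migrants(-4.0, 500_000))
```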
United States Census Bureau
Terms of use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
See the GCP Marketplace listing for more details and sample queries: https://console.cloud.google.com/marketplace/details/united-states-census-bureau/international-census-data
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This comprehensive dataset provides a wealth of information about all countries worldwide, covering a wide range of indicators and attributes. It encompasses demographic statistics, economic indicators, environmental factors, healthcare metrics, education statistics, and much more. With every country represented, this dataset offers a complete global perspective on various aspects of nations, enabling in-depth analyses and cross-country comparisons.
Key Features
- Country: Name of the country.
- Density (P/Km2): Population density measured in persons per square kilometer.
- Abbreviation: Abbreviation or code representing the country.
- Agricultural Land (%): Percentage of land area used for agricultural purposes.
- Land Area (Km2): Total land area of the country in square kilometers.
- Armed Forces Size: Size of the armed forces in the country.
- Birth Rate: Number of births per 1,000 population per year.
- Calling Code: International calling code for the country.
- Capital/Major City: Name of the capital or major city.
- CO2 Emissions: Carbon dioxide emissions in tons.
- CPI: Consumer Price Index, a measure of inflation and purchasing power.
- CPI Change (%): Percentage change in the Consumer Price Index compared to the previous year.
- Currency_Code: Currency code used in the country.
- Fertility Rate: Average number of children born to a woman during her lifetime.
- Forested Area (%): Percentage of land area covered by forests.
- Gasoline_Price: Price of gasoline per liter in local currency.
- GDP: Gross Domestic Product, the total value of goods and services produced in the country.
- Gross Primary Education Enrollment (%): Gross enrollment ratio for primary education.
- Gross Tertiary Education Enrollment (%): Gross enrollment ratio for tertiary education.
- Infant Mortality: Number of deaths per 1,000 live births before reaching one year of age.
- Largest City: Name of the country's largest city.
- Life Expectancy: Average number of years a newborn is expected to live.
- Maternal Mortality Ratio: Number of maternal deaths per 100,000 live births.
- Minimum Wage: Minimum wage level in local currency.
- Official Language: Official language(s) spoken in the country.
- Out of Pocket Health Expenditure (%): Percentage of total health expenditure paid out-of-pocket by individuals.
- Physicians per Thousand: Number of physicians per thousand people.
- Population: Total population of the country.
- Population: Labor Force Participation (%): Percentage of the population that is part of the labor force.
- Tax Revenue (%): Tax revenue as a percentage of GDP.
- Total Tax Rate: Overall tax burden as a percentage of commercial profits.
- Unemployment Rate: Percentage of the labor force that is unemployed.
- Urban Population: Percentage of the population living in urban areas.
- Latitude: Latitude coordinate of the country's location.
- Longitude: Longitude coordinate of the country's location.
Potential Use Cases
- Analyze population density and land area to study spatial distribution patterns.
- Investigate the relationship between agricultural land and food security.
- Examine carbon dioxide emissions and their impact on climate change.
- Explore correlations between economic indicators such as GDP and various socio-economic factors.
- Investigate educational enrollment rates and their implications for human capital development.
- Analyze healthcare metrics such as infant mortality and life expectancy to assess overall well-being.
- Study labor market dynamics through indicators such as labor force participation and unemployment rates.
- Investigate the role of taxation and its impact on economic development.
- Explore urbanization trends and their social and environmental consequences.
The "Global Country Rankings Dataset" is a comprehensive collection of metrics and indicators that ranks countries worldwide based on their socioeconomic performance. The dataset provides insight into the relative standing of nations on key factors such as GDP per capita, economic growth, and other relevant criteria.
Researchers, analysts, and policymakers can leverage this dataset to gain a deeper understanding of the global economic landscape and track the progress of countries over time. The dataset covers a wide range of metrics, including but not limited to:
- Economic growth (the rate of change of real GDP), country rankings: The average for 2021 based on 184 countries was 5.26 percent. The highest value was in the Maldives: 41.75 percent and the lowest value was in Afghanistan: -20.74 percent. The indicator is available from 1961 to 2021.
- GDP per capita, Purchasing Power Parity, country rankings: The average for 2021 based on 182 countries was 21283.21 U.S. dollars. The highest value was in Luxembourg: 115683.49 U.S. dollars and the lowest value was in Burundi: 705.03 U.S. dollars. The indicator is available from 1990 to 2021.
- GDP per capita, current U.S. dollars, country rankings: The average for 2021 based on 186 countries was 17937.03 U.S. dollars. The highest value was in Monaco: 234315.45 U.S. dollars and the lowest value was in Burundi: 221.48 U.S. dollars. The indicator is available from 1960 to 2021.
- GDP per capita, constant 2010 dollars, country rankings: The average for 2021 based on 184 countries was 15605.8 U.S. dollars. The highest value was in Monaco: 204190.16 U.S. dollars and the lowest value was in Burundi: 261.02 U.S. dollars. The indicator is available from 1960 to 2021.
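The two quantities behind these rankings are simple ratios. As a minimal sketch, assuming annual real GDP levels and a population count are available, they could be computed as follows; the function names are hypothetical and the numbers in the usage lines are invented, not taken from the dataset.

```python
def real_gdp_growth_pct(previous_real_gdp: float, current_real_gdp: float) -> float:
    """Economic growth: percentage change of real GDP from one year to the next."""
    if previous_real_gdp == 0:
        raise ValueError("previous_real_gdp must be non-zero")
    return (current_real_gdp - previous_real_gdp) * 100 / previous_real_gdp


def gdp_per_capita(gdp: float, population: float) -> float:
    """GDP per capita in the same currency unit as the GDP figure."""
    if population <= 0:
        raise ValueError("population must be positive")
    return gdp / population


if __name__ == "__main__":
    # Invented inputs: real GDP rises from 200 to 210 units.
    print(real_gdp_growth_pct(200.0, 210.0))
    # Invented inputs: GDP of 1,000,000 units shared by 50 people.
    print(gdp_per_capita(1_000_000.0, 50))
```

Whether the per-capita figure is in current dollars, constant 2010 dollars, or PPP dollars depends only on which GDP series is passed in.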
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
Welcome to the Country Information Dataset, meticulously curated by Aadarsh Vani. This dataset serves as an extensive resource for anyone interested in exploring the rich tapestry of countries around the globe, providing detailed information on various aspects of each nation.
This dataset contains valuable insights into countries worldwide, featuring the following attributes:
The aim of this dataset is to provide a comprehensive and reliable resource for researchers, data scientists, and cultural enthusiasts. It can facilitate analysis and visualizations that reveal global patterns in demographics, cultures, and economies.
Created by Aadarsh Vani, this dataset is a labor of love aimed at enriching the understanding of our world's countries. I encourage users to share their insights, visualizations, and analyses arising from this dataset. Together, we can foster a deeper appreciation of global diversity!
Thank you for exploring this dataset, and I hope it inspires your work in studying the fascinating intricacies of countries worldwide.
Note: This dataset will be updated frequently, adding new columns and refreshing existing values. Kindly use it for practice and projects only, as it has missing values and may contain unintentionally wrong data in some cells.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Context
The dataset tabulates the Country Club Heights population by age cohorts (Children: Under 18 years; Working population: 18-64 years; Senior population: 65 years or more). It lists the population in each age cohort group along with its percentage relative to the total population of Country Club Heights. The dataset can be utilized to understand the population distribution across children, working population and senior population for dependency ratio, housing requirements, ageing, migration patterns etc.
Key observations
The largest age group was 18 to 64 years with a population of 122 (58.65% of the total population). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
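The percentage quoted above, and the dependency ratio mentioned in the context, are both simple ratios of cohort counts. The sketch below shows the arithmetic under one loudly stated assumption: the total population of 208 is not given in the text and is inferred here from 122 being 58.65% of the total. The function names are hypothetical.

```python
def cohort_share_pct(cohort_count: int, total_population: int) -> float:
    """Share of the total population in one age cohort, as a percentage."""
    return round(cohort_count * 100 / total_population, 2)


def dependency_ratio(working_age_count: int, total_population: int) -> float:
    """Dependents (children plus seniors) per 100 people of working age."""
    dependents = total_population - working_age_count
    return round(dependents * 100 / working_age_count, 2)


if __name__ == "__main__":
    WORKING_AGE = 122     # stated in the key observations
    ASSUMED_TOTAL = 208   # assumption: inferred from 122 / 0.5865, not stated in the source
    print(cohort_share_pct(WORKING_AGE, ASSUMED_TOTAL))
    print(dependency_ratio(WORKING_AGE, ASSUMED_TOTAL))
```

Because the underlying counts are survey estimates, any ratio derived this way inherits their margin of error.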
Age cohorts:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Country Club Heights Population by Age.
Public Domain Mark 1.0: https://creativecommons.org/publicdomain/mark/1.0/
Comparisons of the countries with the largest forest areas (representing 90% of the global primary forest area reported to FRA, 2015)
The United States Geological Survey (USGS) - Science Analytics and Synthesis (SAS) - Gap Analysis Project (GAP) manages the Protected Areas Database of the United States (PAD-US), an Arc10x geodatabase, that includes a full inventory of areas dedicated to the preservation of biological diversity and to other natural, recreation, historic, and cultural uses, managed for these purposes through legal or other effective means (www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/protected-areas). The PAD-US is developed in partnership with many organizations, including coordination groups at the [U.S.] Federal level, lead organizations for each State, and a number of national and other non-governmental organizations whose work is closely related to the PAD-US. Learn more about the USGS PAD-US partners program here: www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-data-stewards. The United Nations Environmental Program - World Conservation Monitoring Centre (UNEP-WCMC) tracks global progress toward biodiversity protection targets enacted by the Convention on Biological Diversity (CBD) through the World Database on Protected Areas (WDPA) and World Database on Other Effective Area-based Conservation Measures (WD-OECM) available at: www.protectedplanet.net. See the Aichi Target 11 dashboard (www.protectedplanet.net/en/thematic-areas/global-partnership-on-aichi-target-11) for official protection statistics recognized globally and developed for the CBD, or here for more information and statistics on the United States of America's protected areas: www.protectedplanet.net/country/USA. 
It is important to note that statistics published by the National Oceanic and Atmospheric Administration (NOAA) Marine Protected Areas (MPA) Center (www.marineprotectedareas.noaa.gov/dataanalysis/mpainventory/) and the USGS-GAP (www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-statistics-and-reports) differ from statistics published by the UNEP-WCMC, as methods to remove overlapping designations differ slightly and U.S. Territories are reported separately by the UNEP-WCMC (e.g. the largest MPA, "Pacific Remote Islands Marine Monument", is attributed to the United States Minor Outlying Islands statistics). At the time of PAD-US 2.1 publication (USGS-GAP, 2020), NOAA reported 26% of U.S. marine waters (including the Great Lakes) as protected in an MPA that meets the International Union for Conservation of Nature (IUCN) definition of biodiversity protection (www.iucn.org/theme/protected-areas/about). USGS-GAP plans to publish PAD-US 2.1 Statistics and Reports in the spring of 2021. The relationship between the USGS, the NOAA, and the UNEP-WCMC is as follows:
- USGS manages and publishes the full inventory of U.S. marine and terrestrial protected areas data in the PAD-US representing many values, developed in collaboration with a partnership network in the U.S.
- USGS is the primary source of U.S. marine and terrestrial protected areas data for the WDPA, developed from a subset of the PAD-US in collaboration with the NOAA, other agencies and non-governmental organizations in the U.S., and the UNEP-WCMC.
- UNEP-WCMC is the authoritative source of global protected area statistics from the WDPA and WD-OECM.
- NOAA is the authoritative source of MPA data in the PAD-US and MPA statistics in the U.S.
- USGS is the authoritative source of PAD-US statistics (including areas primarily managed for biodiversity, multiple uses including natural resource extraction, and public access).
The PAD-US 2.1 Combined Marine, Fee, Designation, Easement feature class (GAP Status Code 1 and 2 only) is the source of protected areas data in this WDPA update. Tribal areas and military lands represented in the PAD-US Proclamation feature class as GAP Status Code 4 (no known mandate for biodiversity protection) are not included as spatial data to represent internal protected areas are not available at this time. The USGS submitted more than 42,900 protected areas from PAD-US 2.1, including all 50 U.S. States and 6 U.S. Territories, to the UNEP-WCMC for inclusion in the May 2021 WDPA, available at www.protectedplanet.net. The NOAA is the sole source of MPAs in PAD-US and the National Conservation Easement Database (NCED, www.conservationeasement.us/) is the source of conservation easements. The USGS aggregates authoritative federal lands data directly from managing agencies for PAD-US (www.communities.geoplatform.gov/ngda-govunits/federal-lands-workgroup/), while a network of State data-stewards provide state, local government lands, and some land trust preserves. National nongovernmental organizations contribute spatial data directly (www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-data-stewards). The USGS translates the biodiversity focused subset of PAD-US into the WDPA schema (UNEP-WCMC, 2019) for efficient aggregation by the UNEP-WCMC. The USGS maintains WDPA Site Identifiers (WDPAID, WDPA_PID), a persistent identifier for each protected area, provided by UNEP-WCMC. Agency partners are encouraged to track WDPA Site Identifier values in source datasets to improve the efficiency and accuracy of PAD-US and WDPA updates. The IUCN protected areas in the U.S. 
are managed by thousands of agencies and organizations across the country and include over 42,900 designated sites such as National Parks, National Wildlife Refuges, National Monuments, Wilderness Areas, some State Parks, State Wildlife Management Areas, Local Nature Preserves, City Natural Areas, The Nature Conservancy and other Land Trust Preserves, and Conservation Easements. The boundaries of these protected places (some overlap) are represented as polygons in the PAD-US, along with informative descriptions such as Unit Name, Manager Name, and Designation Type. As the WDPA is a global dataset, their data standards (UNEP-WCMC 2019) require simplification to reduce the number of records included, focusing on the protected area site name and management authority as described in the Supplemental Information section in this metadata record. Given the numerous organizations involved, sites may be added or removed from the WDPA between PAD-US updates. These differences may reflect actual change in protected area status; however, they also reflect the dynamic nature of spatial data or Geographic Information Systems (GIS). Many agencies and non-governmental organizations are working to improve the accuracy of protected area boundaries, the consistency of attributes, and inventory completeness between PAD-US updates. In addition, USGS continually seeks partners to review and refine the assignment of conservation measures in the PAD-US.
Census data reveals that population density varies noticeably from area to area. Small area census data do a better job depicting where the crowded neighborhoods are. In this map, the yellow areas of highest density range from 30,000 to 150,000 persons per square kilometer. In those areas, if the people were spread out evenly across the area, there would be just 4 to 9 meters between them. Very high density areas exceed 7,000 persons per square kilometer. High density areas exceed 5,200 persons per square kilometer. The last categories break at 3,330 persons per square kilometer, and 1,500 persons per square kilometer.
This dataset is comprised of multiple sources. All of the demographic data are from Michael Bauer Research with the exception of the following countries:
- Australia: Esri Australia and MapData Services
- Canada: Esri Canada and Environics
- France: Esri France
- Germany: Esri Germany and Nexiga
- India: Esri India and Indicus
- Japan: Esri Japan
- South Korea: Esri Korea and OPENmate
- Spain: Esri España and AIS
- United States: Esri Demographics
https://creativecommons.org/publicdomain/zero/1.0/
The United States Census Bureau’s International Dataset provides estimates of country populations since 1950 and projections through 2050.
The U.S. Census Bureau provides estimates and projections for countries and areas that are recognized by the U.S. Department of State that have a population of at least 5,000. Specifically, the data set includes midyear population figures broken down by age and gender assignment at birth. Additionally, they provide time-series data for attributes including fertility rates, birth rates, death rates, and migration rates.
Fork this kernel to get started.
https://bigquery.cloud.google.com/dataset/bigquery-public-data:census_bureau_international
https://cloud.google.com/bigquery/public-data/international-census
Dataset Source: www.census.gov
This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Banner Photo by Steve Richey from Unsplash.
What countries have the longest life expectancy?
Which countries have the largest proportion of their population under 25?
Which countries are seeing the largest net migration?
Overview
The Office of the Geographer and Global Issues at the U.S. Department of State produces the Large Scale International Boundaries (LSIB) dataset. The current edition is version 11.4 (published 24 February 2025). The 11.4 release contains updated boundary lines and data refinements designed to extend the functionality of the dataset. These data and generalized derivatives are the only international boundary lines approved for U.S. Government use. The contents of this dataset reflect U.S. Government policy on international boundary alignment, political recognition, and dispute status. They do not necessarily reflect de facto limits of control.
National Geospatial Data Asset
This dataset is a National Geospatial Data Asset (NGDAID 194) managed by the Department of State. It is a part of the International Boundaries Theme created by the Federal Geographic Data Committee.
Dataset Source Details
Sources for these data include treaties, relevant maps, and data from boundary commissions, as well as national mapping agencies. Where available and applicable, the dataset incorporates information from courts, tribunals, and international arbitrations. The research and recovery process includes analysis of satellite imagery and elevation data. Due to the limitations of source materials and processing techniques, most lines are within 100 meters of their true position on the ground.
Cartographic Visualization
The LSIB is a geospatial dataset that, when used for cartographic purposes, requires additional styling. The LSIB download package contains example style files for commonly used software applications. The attribute table also contains embedded information to guide the cartographic representation. Additional discussion of these considerations can be found in the Use of Core Attributes in Cartographic Visualization section below.
Additional cartographic information pertaining to the depiction and description of international boundaries or areas of special sovereignty can be found in Guidance Bulletins published by the Office of the Geographer and Global Issues: https://data.geodata.state.gov/guidance/index.html
Contact
Direct inquiries to internationalboundaries@state.gov. Direct download: https://data.geodata.state.gov/LSIB.zip
Attribute Structure
The dataset uses the following attributes divided into two categories:
ATTRIBUTE NAME | ATTRIBUTE STATUS
CC1 | Core
CC1_GENC3 | Extension
CC1_WPID | Extension
COUNTRY1 | Core
CC2 | Core
CC2_GENC3 | Extension
CC2_WPID | Extension
COUNTRY2 | Core
RANK | Core
LABEL | Core
STATUS | Core
NOTES | Core
LSIB_ID | Extension
ANTECIDS | Extension
PREVIDS | Extension
PARENTID | Extension
PARENTSEG | Extension
These attributes have external data sources that update separately from the LSIB:
ATTRIBUTE NAME | ATTRIBUTE STATUS
CC1 | GENC
CC1_GENC3 | GENC
CC1_WPID | World Polygons
COUNTRY1 | DoS Lists
CC2 | GENC
CC2_GENC3 | GENC
CC2_WPID | World Polygons
COUNTRY2 | DoS Lists
LSIB_ID | BASE
ANTECIDS | BASE
PREVIDS | BASE
PARENTID | BASE
PARENTSEG | BASE
The core attributes listed above describe the boundary lines contained within the LSIB dataset. Removal of core attributes from the dataset will change the meaning of the lines. An attribute status of “Extension” represents a field containing data interoperability information. Other attributes not listed above include “FID”, “Shape_length” and “Shape.” These are components of the shapefile format and do not form an intrinsic part of the LSIB.
Core Attributes
The eight core attributes listed above contain unique information which, when combined with the line geometry, comprise the LSIB dataset. These Core Attributes are further divided into Country Code and Name Fields and Descriptive Fields.
Country Code and Country Name Fields
“CC1” and “CC2” fields are machine readable fields that contain political entity codes.
These are two-character codes derived from the Geopolitical Entities, Names, and Codes Standard (GENC), Edition 3 Update 18. “CC1_GENC3” and “CC2_GENC3” fields contain the corresponding three-character GENC codes and are extension attributes discussed below. The codes “Q2” or “QX2” denote a line in the LSIB representing a boundary associated with areas not contained within the GENC standard.
The “COUNTRY1” and “COUNTRY2” fields contain the names of corresponding political entities. These fields contain names approved by the U.S. Board on Geographic Names (BGN) as incorporated in the "Independent States in the World" and "Dependencies and Areas of Special Sovereignty" lists maintained by the Department of State. To ensure maximum compatibility, names are presented without diacritics and certain names are rendered using common cartographic abbreviations. Names for lines associated with the code "Q2" are descriptive and not necessarily BGN-approved. Names rendered in all CAPITAL LETTERS denote independent states. Names rendered in normal text represent dependencies, areas of special sovereignty, or are otherwise presented for the convenience of the user.
Descriptive Fields
The following text fields are a part of the core attributes of the LSIB dataset and do not update from external sources. They provide additional information about each of the lines and are as follows:
ATTRIBUTE NAME | CONTAINS NULLS
RANK | No
STATUS | No
LABEL | Yes
NOTES | Yes
Neither the "RANK" nor "STATUS" fields contain null values; the "LABEL" and "NOTES" fields do. The "RANK" field is a numeric expression of the "STATUS" field. Combined with the line geometry, these fields encode the views of the United States Government on the political status of the boundary line.
ATTRIBUTE NAME | VALUE
RANK | 1 | 2 | 3
STATUS | International Boundary | Other Line of International Separation | Special Line
A value of “1” in the “RANK” field corresponds to an "International Boundary" value in the “STATUS” field. Values of “2” and “3” correspond to “Other Line of International Separation” and “Special Line,” respectively.
The “LABEL” field contains required text to describe the line segment on all finished cartographic products, including but not limited to print and interactive maps.
The “NOTES” field contains an explanation of special circumstances modifying the lines. This information can pertain to the origins of the boundary lines, limitations regarding the purpose of the lines, or the original source of the line.
Use of Core Attributes in Cartographic Visualization
Several of the Core Attributes provide information required for the proper cartographic representation of the LSIB dataset. The cartographic usage of the LSIB requires a visual differentiation between the three categories of boundary lines. Specifically, this differentiation must be between:
- International Boundaries (Rank 1);
- Other Lines of International Separation (Rank 2); and
- Special Lines (Rank 3).
Rank 1 lines must be the most visually prominent. Rank 2 lines must be less visually prominent than Rank 1 lines. Rank 3 lines must be shown in a manner visually subordinate to Ranks 1 and 2. Where scale permits, Rank 2 and 3 lines must be labeled in accordance with the “Label” field. Data marked with a Rank 2 or 3 designation does not necessarily correspond to a disputed boundary. Please consult the style files in the download package for examples of this depiction.
The requirement to incorporate the contents of the "LABEL" field on cartographic products is scale dependent. If a label is legible at the scale of a given static product, a proper use of this dataset would encourage the application of that label.
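The rank-to-status correspondence and the prominence ordering described above can be captured in a small lookup. The following is only a sketch: the line widths are hypothetical values chosen so that Rank 1 is most prominent and Rank 3 least, the function name style_for is invented, and the style files in the LSIB download package remain the reference for actual depiction.

```python
# RANK -> STATUS pairs as given in the LSIB attribute description.
STATUS_BY_RANK = {
    1: "International Boundary",
    2: "Other Line of International Separation",
    3: "Special Line",
}

# Hypothetical widths (not from the LSIB style files): larger means more prominent.
LINE_WIDTH_BY_RANK = {1: 1.5, 2: 1.0, 3: 0.5}


def style_for(feature: dict) -> dict:
    """Derive a drawing style from a feature's core attributes.

    `feature` is assumed to be a plain dict of attribute values, e.g. one row
    of the attribute table read by whatever GIS library is in use.
    """
    rank = int(feature["RANK"])
    if rank not in STATUS_BY_RANK:
        raise ValueError(f"unexpected RANK value: {rank}")
    if feature["STATUS"] != STATUS_BY_RANK[rank]:
        raise ValueError("RANK and STATUS disagree")
    return {
        "legend": feature["STATUS"],             # STATUS is for the legend, not for labels
        "width": LINE_WIDTH_BY_RANK[rank],
        "label": feature.get("LABEL") or None,   # LABEL may be null
    }


if __name__ == "__main__":
    example = {"RANK": 2, "STATUS": STATUS_BY_RANK[2], "LABEL": "Example label text"}
    print(style_for(example))
```

Checking that RANK and STATUS agree before styling is a cheap guard against edited or truncated attribute tables, since the two fields are meant to express the same information.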
Using the contents of the "COUNTRY1" and "COUNTRY2" fields in the generation of a line segment label is not required. The "STATUS" field contains the preferred description for the three LSIB line types when they are incorporated into a map legend but is otherwise not to be used for labeling. Use of the “CC1,” “CC1_GENC3,” “CC2,” “CC2_GENC3,” “RANK,” or “NOTES” fields for cartographic labeling purposes is prohibited.
Extension Attributes
Certain elements of the attributes within the LSIB dataset extend data functionality to make the data more interoperable or to provide clearer linkages to other datasets.
The fields “CC1_GENC3” and “CC2_GENC3” contain the corresponding three-character GENC code to the “CC1” and “CC2” attributes. The code “QX2” is the three-character counterpart of the code “Q2,” which denotes a line in the LSIB representing a boundary associated with a geographic area not contained within the GENC standard.
To allow for linkage between individual lines in the LSIB and World Polygons dataset, the “CC1_WPID” and “CC2_WPID” fields contain a Universally Unique Identifier (UUID), version 4, which provides a stable description of each geographic entity in a boundary pair relationship. Each UUID corresponds to a geographic entity listed in the World Polygons dataset. These fields allow for linkage between individual lines in the LSIB and the overall World Polygons dataset.
Five additional fields in the LSIB expand on the UUID concept and either describe features that have changed across space and time or indicate relationships between previous versions of the feature.
The “LSIB_ID” attribute is a UUID value that defines a specific instance of a feature. Any change to the feature in a lineset requires a new “LSIB_ID.”
The “ANTECIDS,” or antecedent ID, is a UUID that references line geometries from which a given line is descended in time. It is used when there is a feature that is entirely new, not when there is a new version of a previous feature.
This is generally used to reference countries that have dissolved. The “PREVIDS,” or Previous ID, is a UUID field that contains old versions of a line. This is an additive field, that houses all Previous IDs. A new version of a feature is defined by any change to the
There are three CSV files. The first contains world population by year (1955-2020), the second contains world population by region, and the third contains world population by country, giving the population of each country up to 2020. Features in these 3 datasets:
I scraped these three datasets from the worldometers.info website using BeautifulSoup.
Analyse the increase in world population over the last 10 years and forecast the world population. Find the 10 largest countries by population and by population density.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for GDP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for GOLD RESERVES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data Description
Water Quality Parameters: Ammonia, BOD, DO, Orthophosphate, pH, Temperature, Nitrogen, Nitrate.
Countries/Regions: United States, Canada, Ireland, England, China.
Years Covered: 1940-2023.
Data Records: 2.82 million.
Definition of Columns
Country: Name of the water-body region.
Area: Name of the area in the region.
Waterbody Type: Type of the water-body source.
Date: Date of the sample collection (dd-mm-yyyy).
Ammonia (mg/l): Ammonia concentration.
Biochemical Oxygen Demand (BOD) (mg/l): Oxygen demand measurement.
Dissolved Oxygen (DO) (mg/l): Concentration of dissolved oxygen.
Orthophosphate (mg/l): Orthophosphate concentration.
pH (pH units): pH level of water.
Temperature (°C): Temperature in Celsius.
Nitrogen (mg/l): Total nitrogen concentration.
Nitrate (mg/l): Nitrate concentration.
CCME_Values: Calculated water quality index values using the CCME WQI model.
CCME_WQI: Water Quality Index classification based on CCME_Values.
Data Directory Description
Category 1: Dataset
Combined Data: This folder contains two files: Combined_dataset.csv and Summary.xlsx. The Combined_dataset.csv file includes all eight water quality parameter readings across five countries, with additional data for initial preprocessing steps like missing value handling, outlier detection, and other operations. It also contains the CCME Water Quality Index calculation for empirical analysis and ML-based research. The Summary.xlsx file provides a brief description of the datasets, including data distributions (e.g., maximum, minimum, mean, standard deviation).
Combined_dataset.csv
Summary.xlsx
Country-wise Data: This folder contains separate country-based datasets in CSV files. Each file includes the eight water quality parameters for regional analysis. The Summary_country.xlsx file presents country-wise dataset descriptions with data distributions (e.g., maximum, minimum, mean, standard deviation).
England_dataset.csv
Canada_dataset.csv
USA_dataset.csv
Ireland_dataset.csv
China_dataset.csv
Summary_country.xlsx
Category 2: Code
Data processing and harmonization code (e.g., Language Conversion, Date Conversion, Parameter Naming and Unit Conversion, Missing Value Handling, WQI Measurement and Classification).
Data_Processing_Harmonnization.ipynb
The code used for Technical Validation (e.g., assessing the Data Distribution, Outlier Detection, Water Quality Trend Analysis, and Verifying the Application of the Dataset for the ML Models).
Technical_Validation.ipynb
Category 3: Data Collection Sources
This category includes links to the selected dataset sources, which were used to create the dataset and are provided for further reconstruction or data formation.
DataCollectionSources.xlsx
Original Paper Title: A Comprehensive Dataset of Surface Water Quality Spanning 1940-2023 for Empirical and ML Adopted Research
Abstract
Assessment and monitoring of surface water quality are essential for food security, public health, and ecosystem protection. Although water quality monitoring is a known phenomenon, little effort has been made to offer a comprehensive and harmonized dataset for surface water at the global scale. This study presents a comprehensive surface water quality dataset that preserves spatio-temporal variability, integrity, consistency, and depth of the data to facilitate empirical and data-driven evaluation, prediction, and forecasting. The dataset is assembled from a range of sources, including regional and global water quality databases, water management organizations, and individual research projects from five prominent countries in the world, namely the USA, Canada, Ireland, England, and China. The resulting dataset consists of 2.82 million measurements of eight water quality parameters that span 1940-2023. This dataset can support meta-analysis of water quality models and can facilitate Machine Learning (ML) based data and model-driven investigation of the spatial and temporal drivers and patterns of surface water quality at a cross-regional to global scale.
Note: Cite this repository and the original paper when using this dataset.
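As a rough illustration of how the columns defined above might be read, the sketch below averages one parameter per country using only the Python standard library. It assumes the CSV headers match the column names in the description exactly and that the Date column uses dd-mm-yyyy as stated; the function name `mean_by_country` and the inline sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

DATE_FORMAT = "%d-%m-%Y"  # dd-mm-yyyy, as stated in the column definitions


def mean_by_country(csv_file, column, start_year=None):
    """Average one parameter per country, skipping rows with no reading."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        raw = (row.get(column) or "").strip()
        if not raw:
            continue  # missing reading
        sampled = datetime.strptime(row["Date"].strip(), DATE_FORMAT)
        if start_year is not None and sampled.year < start_year:
            continue  # sample is older than the requested window
        sums[row["Country"]] += float(raw)
        counts[row["Country"]] += 1
    return {country: sums[country] / counts[country] for country in sums}


# Hypothetical rows for illustration only; these are not real measurements.
SAMPLE = """Country,Date,Dissolved Oxygen (DO) (mg/l)
Ireland,01-02-2020,8.0
Ireland,15-03-2021,10.0
Canada,05-06-2019,
Canada,07-08-2020,9.5
"""

result = mean_by_country(
    io.StringIO(SAMPLE), "Dissolved Oxygen (DO) (mg/l)", start_year=2020
)
```

For the real file, the same function could be called on a handle opened with `open("Combined_dataset.csv", newline="")`, provided the headers match.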
A global self-hosted location dataset containing all administrative divisions, cities, and zip codes for 247 countries. All geospatial data is updated weekly to maintain the highest data quality, including challenging countries such as China, Brazil, Russia, and the United Kingdom.
Use cases for the Global Zip Code Database (Geospatial data)
Address capture and validation
Map and visualization
Reporting and Business Intelligence (BI)
Master Data Management
Logistics and Supply Chain Management
Sales and Marketing
Data export methodology
Our location data packages are offered in various formats, including .csv. All geospatial data are optimized for seamless integration with popular systems like Esri ArcGIS, Snowflake, QGIS, and more.
Product Features
Fully and accurately geocoded
Administrative areas with a level range of 0-4
Multi-language support including address names in local and foreign languages
Comprehensive city definitions across countries
For additional insights, you can combine the map data with:
UNLOCODE and IATA codes
Time zones and Daylight Saving Times
Why do companies choose our location databases
Enterprise-grade service
Reduce integration time and cost by 30%
Weekly updates for the highest quality
Note: Custom geospatial data packages are available. Please submit a request via the above contact button for more details.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Argentina provincial data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/kingabzpro/argentina-provincial-data on 28 January 2022.
--- Dataset description provided by original source is as follows ---
With almost 40 million inhabitants and a diverse geography that encompasses the Andes mountains, glacial lakes, and the Pampas grasslands, Argentina is the second largest country (by area) and has one of the largest economies in South America. It is politically organized as a federation of 23 provinces and an autonomous city, Buenos Aires.
We will analyze ten economic and social indicators collected for each province. Because these indicators are highly correlated, we will use principal component analysis (PCA) to reduce redundancies and highlight patterns that are not apparent in the raw data. After visualizing the patterns, we will use k-means clustering to partition the provinces into groups with similar development levels.
These results can be used to plan public policy by helping allocate resources to develop infrastructure, education, and welfare programs.
DataCamp
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The European Business Performance database describes the performance of the largest enterprises in the twentieth century. It covers eight countries that together consistently account for above 80 per cent of western European GDP: Great Britain, Germany, France, Belgium, Italy, Spain, Sweden, and Finland. Data have been collected for five benchmark years, namely on the eve of WWI (1913), before the Great Depression (1927), at the extremes of the golden age (1954 and 1972), and in 2000.
The database is comprised of two distinct datasets. The Small Sample (625 firms) includes the largest enterprises in each country across all industries (economy-wide). To avoid over-representation of certain countries and sectors, countries contribute a number of firms that is roughly proportionate to the size of the economy: 30 firms from Great Britain, 25 from Germany, 20 from France, 15 from Italy, 10 from Belgium, Spain, and Sweden, and 5 from Finland. By the same token, a cap has been set on the number of financial firms entering the sample, so that they range between up to 6 for Britain and 1 for Finland.
The second dataset, or Large Sample (1,167 firms), is made up of the largest firms per industry. Here industries are so selected as to take into account long-term technological developments and the rise of entirely new products and services. Firms have been individually classified using the two-digit ISIC Rev. 3.1 codes, then grouped under a manageable number of industries. To some extent and broadly speaking, the two samples have a rather distinct focus: the Small Sample is biased in favour of sheer bigness, whereas the Large Sample emphasizes industries.
As far as size and performance indicators are concerned, total assets has been picked as the main size measure in the first three benchmarks, turnover in 1972 and 2000 (financial intermediaries, though, are ranked by total assets throughout the database). Performance is gauged by means of two financial ratios, namely return on equity and shareholders’ return, i.e. the percentage year-on-year change in share price based on year-end values. In order to smooth out volatility, at each benchmark performance figures have been averaged over three consecutive years (for instance, performance in 1913 reflects average performance in 1911, 1912, and 1913).
All figures were collected in national currency and converted to US dollars at current year-average exchange rates.
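The shareholders’ return and three-year smoothing described above amount to simple arithmetic, sketched below. This is an illustration under stated assumptions rather than the database authors' code: the function names are hypothetical, the averaging is taken to be a plain arithmetic mean, and the share prices in the usage example are invented.

```python
def shareholders_return(price_prev_year_end, price_year_end):
    """Percentage year-on-year change in share price, based on year-end values."""
    return (price_year_end - price_prev_year_end) / price_prev_year_end * 100.0


def benchmark_average(annual_values, benchmark_year):
    """Mean of the benchmark year and the two years before it (e.g. 1911-1913)."""
    years = (benchmark_year - 2, benchmark_year - 1, benchmark_year)
    return sum(annual_values[year] for year in years) / len(years)


# Invented year-end share prices, for illustration only.
year_end_prices = {1910: 100.0, 1911: 110.0, 1912: 99.0, 1913: 108.9}

annual_returns = {
    year: shareholders_return(year_end_prices[year - 1], year_end_prices[year])
    for year in (1911, 1912, 1913)
}
performance_1913 = benchmark_average(annual_returns, 1913)
```

With these invented prices the annual returns are roughly +10%, -10%, and +10%, so the smoothed 1913 figure is about 3.3%, far less volatile than any single year.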
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
INAO Directive 1/2000 refers to the materialisation of the production area of the PDO under the term ‘geographical area’. It is defined by a list of administrative entities (departments, cantons, municipalities) or by natural geographical boundaries. It corresponds to the largest demarcated area in which all stages of product development are permitted. Nevertheless, for wines there may be an area of immediate proximity, defined by derogation, for winemaking. The details of this derogation are set out in Chapter XI of the specification for the designation. In some cases, the geographical area differs from territories where only part of the preparation of the product is authorised. The PDO, the protected designation of origin, corresponds to the European controlled designation of origin. It is the name of a region, a specified place or, in exceptional cases, a country, which is used to designate an agricultural product or foodstuff originating in that region, place or country, of which: (a) the quality or characteristics are due essentially or exclusively to the geographical environment, including natural and human factors; and (b) production, processing and preparation take place in the defined geographical area. A designation includes 1 to n denomination(s) and 1 to n products (e.g. the colours of the wines, which may be subject to specific delimitations). The recognition of an AOC in France is a prerequisite for its final recognition at European level as a PDO. In case of refusal to register as PDO, the product loses the benefit of the AOC. In the case of a wine-type PDO, the value of the attribute TYPE_PRODUIT is 4.1 (Vins).
The QoG Institute is an independent research institute within the Department of Political Science at the University of Gothenburg. In total, 30 researchers conduct and promote research on the causes, consequences and nature of Good Governance and the Quality of Government, that is, trustworthy, reliable, impartial, uncorrupted and competent government institutions.
The main objective of our research is to address the theoretical and empirical problem of how political institutions of high quality can be created and maintained. A second objective is to study the effects of Quality of Government on a number of policy areas, such as health, the environment, social policy, and poverty.
The QoG Standard Dataset is our largest dataset, consisting of more than 2,000 variables from sources related to the Quality of Government. The data exist in both time-series (year 1946 and onwards) and cross-section (year 2020) form. Many of the variables are available in both datasets, but some are not. The datasets draw on a number of freely available data sources related to QoG and its correlates.
In the QoG Standard CS dataset, data from and around 2020 is included. Data from 2020 is prioritized; however, if no data is available for a country for 2020, data for 2021 is included. If no data exists for 2021, data for 2019 is included, and so on up to a maximum of +/- 3 years.
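The year-selection rule for the cross-section dataset can be sketched as follows. This is an illustrative reading of the rule, not the QoG Institute's code: it assumes that "and so on" continues the same alternation (2022, then 2018, then 2023, then 2017), and the function names and the dictionary-based input are hypothetical.

```python
def candidate_years(target=2020, max_offset=3):
    """Years in priority order: target, then +1, -1, +2, -2, ... up to max_offset."""
    years = [target]
    for offset in range(1, max_offset + 1):
        # Assumption: the later year is always tried before the earlier one,
        # mirroring the stated order 2020, 2021, 2019.
        years.extend([target + offset, target - offset])
    return years


def pick_cross_section_value(values_by_year, target=2020, max_offset=3):
    """Return (year, value) for the first candidate year with data, else None."""
    for year in candidate_years(target, max_offset):
        value = values_by_year.get(year)
        if value is not None:
            return year, value
    return None
```

For a country with observations only in 2019 and 2022, this picks the 2019 value, because an offset of one year is tried before an offset of two.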
In the QoG Standard TS dataset, data from 1946 and onwards is included and the unit of analysis is country-year (e.g., Sweden-1946, Sweden-1947, etc.).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Country Club Hills population by age cohorts (Children: Under 18 years; Working population: 18-64 years; Senior population: 65 years or more). It lists the population in each age cohort group along with its percentage relative to the total population of Country Club Hills. The dataset can be utilized to understand the population distribution across children, working population and senior population for dependency ratio, housing requirements, ageing, migration patterns etc.
Key observations
The largest age group was 18 to 64 years with a population of 633 (62.24% of the total population). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age cohorts:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Country Club Hills Population by Age. You can refer to the same here
The United States Census Bureau’s international dataset provides estimates of country populations since 1950 and projections through 2050. Specifically, the dataset includes midyear population figures broken down by age and gender assignment at birth. Additionally, time-series data is provided for attributes including fertility rates, birth rates, death rates, and migration rates.
You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.census_bureau_international.
What countries have the longest life expectancy? In this query, 2016 census information is retrieved by joining the mortality_life_expectancy and country_names_area tables for countries larger than 25,000 km². Without the size constraint, Monaco is the top result with an average life expectancy of over 89 years!
SELECT
  age.country_name,
  age.life_expectancy,
  size.country_area
FROM (
  SELECT
    country_name,
    life_expectancy
  FROM
    `bigquery-public-data.census_bureau_international.mortality_life_expectancy`
  WHERE
    year = 2016) age
INNER JOIN (
  SELECT
    country_name,
    country_area
  FROM
    `bigquery-public-data.census_bureau_international.country_names_area`
  WHERE
    country_area > 25000) size
ON
  age.country_name = size.country_name
ORDER BY
  2 DESC
/* Limit removed for Data Studio Visualization */
LIMIT
  10
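The life expectancy query can also be run through the BigQuery Python client library mentioned earlier. The sketch below is one possible way to do so, not an official sample: it assumes the `google-cloud-bigquery` package is installed and default credentials are configured, it swaps the hard-coded year and area for named query parameters (`@year`, `@min_area`), and the function name `run_life_expectancy_query` is hypothetical.

```python
LIFE_EXPECTANCY_SQL = """
SELECT
  age.country_name,
  age.life_expectancy,
  size.country_area
FROM (
  SELECT country_name, life_expectancy
  FROM `bigquery-public-data.census_bureau_international.mortality_life_expectancy`
  WHERE year = @year) age
INNER JOIN (
  SELECT country_name, country_area
  FROM `bigquery-public-data.census_bureau_international.country_names_area`
  WHERE country_area > @min_area) size
ON age.country_name = size.country_name
ORDER BY age.life_expectancy DESC
LIMIT 10
"""


def run_life_expectancy_query(year=2016, min_area=25000):
    """Run the query and return (country_name, life_expectancy, country_area) tuples."""
    # Imported here so the module loads even where the client is not installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default credentials are available
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("year", "INT64", year),
            bigquery.ScalarQueryParameter("min_area", "FLOAT64", float(min_area)),
        ]
    )
    rows = client.query(LIFE_EXPECTANCY_SQL, job_config=job_config).result()
    return [
        (row["country_name"], row["life_expectancy"], row["country_area"])
        for row in rows
    ]
```

Using query parameters rather than string formatting keeps the SQL text fixed, so the same statement can be reused for other years or area thresholds.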
Which countries have the largest proportion of their population under 25? Over 40% of the world’s population is under 25 and more than 50% of the world’s population is under 30! This query retrieves the countries with the largest proportion of young people by joining the age-specific population table with the midyear (total) population table.
SELECT
  age.country_name,
  SUM(age.population) AS under_25,
  pop.midyear_population AS total,
  ROUND((SUM(age.population) / pop.midyear_population) * 100, 2) AS pct_under_25
FROM (
  SELECT
    country_name,
    population,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.midyear_population_agespecific`
  WHERE
    year = 2017
    AND age < 25) age
INNER JOIN (
  SELECT
    midyear_population,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.midyear_population`
  WHERE
    year = 2017) pop
ON
  age.country_code = pop.country_code
GROUP BY
  1,
  3
ORDER BY
  4 DESC /* Remove limit for visualization */
LIMIT
  10
The International Census dataset contains growth information in the form of birth rates, death rates, and migration rates. Net migration is the net number of migrants per 1,000 population, an important component of total population and one that often drives the work of the United Nations Refugee Agency. This query joins the growth rate table with the area table to retrieve 2017 data for countries greater than 500 km².
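SELECT
  growth.country_name,
  growth.net_migration,
  CAST(area.country_area AS INT64) AS country_area
FROM (
  SELECT
    country_name,
    net_migration,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.birth_death_growth_rates`
  WHERE
    year = 2017) growth
INNER JOIN (
  SELECT
    country_area,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.country_names_area`
  /* The source is truncated here. The filter and join below are reconstructed
     from the description above (countries greater than 500 km²) and from the
     columns selected in both subqueries; any original ORDER BY is not recoverable. */
  WHERE
    country_area > 500) area
ON
  growth.country_code = area.country_code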
Historic (none)
United States Census Bureau
Terms of use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
See the GCP Marketplace listing for more details and sample queries: https://console.cloud.google.com/marketplace/details/united-states-census-bureau/international-census-data