Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This comprehensive dataset provides a wealth of information about all countries worldwide, covering a wide range of indicators and attributes. It encompasses demographic statistics, economic indicators, environmental factors, healthcare metrics, education statistics, and much more. With every country represented, this dataset offers a complete global perspective on various aspects of nations, enabling in-depth analyses and cross-country comparisons.
Key Features
Country: Name of the country.
Density (P/Km2): Population density measured in persons per square kilometer.
Abbreviation: Abbreviation or code representing the country.
Agricultural Land (%): Percentage of land area used for agricultural purposes.
Land Area (Km2): Total land area of the country in square kilometers.
Armed Forces Size: Size of the armed forces in the country.
Birth Rate: Number of births per 1,000 population per year.
Calling Code: International calling code for the country.
Capital/Major City: Name of the capital or major city.
CO2 Emissions: Carbon dioxide emissions in tons.
CPI: Consumer Price Index, a measure of inflation and purchasing power.
CPI Change (%): Percentage change in the Consumer Price Index compared to the previous year.
Currency_Code: Currency code used in the country.
Fertility Rate: Average number of children born to a woman during her lifetime.
Forested Area (%): Percentage of land area covered by forests.
Gasoline_Price: Price of gasoline per liter in local currency.
GDP: Gross Domestic Product, the total value of goods and services produced in the country.
Gross Primary Education Enrollment (%): Gross enrollment ratio for primary education.
Gross Tertiary Education Enrollment (%): Gross enrollment ratio for tertiary education.
Infant Mortality: Number of deaths per 1,000 live births before reaching one year of age.
Largest City: Name of the country's largest city.
Life Expectancy: Average number of years a newborn is expected to live.
Maternal Mortality Ratio: Number of maternal deaths per 100,000 live births.
Minimum Wage: Minimum wage level in local currency.
Official Language: Official language(s) spoken in the country.
Out of Pocket Health Expenditure (%): Percentage of total health expenditure paid out-of-pocket by individuals.
Physicians per Thousand: Number of physicians per thousand people.
Population: Total population of the country.
Population: Labor Force Participation (%): Percentage of the population that is part of the labor force.
Tax Revenue (%): Tax revenue as a percentage of GDP.
Total Tax Rate: Overall tax burden as a percentage of commercial profits.
Unemployment Rate: Percentage of the labor force that is unemployed.
Urban Population: Percentage of the population living in urban areas.
Latitude: Latitude coordinate of the country's location.
Longitude: Longitude coordinate of the country's location.
Potential Use Cases
Analyze population density and land area to study spatial distribution patterns.
Investigate the relationship between agricultural land and food security.
Examine carbon dioxide emissions and their impact on climate change.
Explore correlations between economic indicators such as GDP and various socio-economic factors.
Investigate educational enrollment rates and their implications for human capital development.
Analyze healthcare metrics such as infant mortality and life expectancy to assess overall well-being.
Study labor market dynamics through indicators such as labor force participation and unemployment rates.
Investigate the role of taxation and its impact on economic development.
Explore urbanization trends and their social and environmental consequences.
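As a minimal sketch of the first and fourth use cases, the snippet below recomputes population density from population and land area, then measures how GDP per capita and life expectancy move together. The three rows, the key names, and the helper functions are hypothetical illustrations, not values or column names taken from the dataset.

```python
from math import sqrt

# Hypothetical sample rows; the real dataset has one row per country.
rows = [
    {"country": "A", "population": 1_000_000, "land_area_km2": 50_000,
     "gdp": 2.0e10, "life_expectancy": 70.0},
    {"country": "B", "population": 5_000_000, "land_area_km2": 100_000,
     "gdp": 1.5e11, "life_expectancy": 76.0},
    {"country": "C", "population": 20_000_000, "land_area_km2": 400_000,
     "gdp": 9.0e11, "life_expectancy": 81.0},
]


def density(row):
    """Persons per square kilometre, the quantity behind Density (P/Km2)."""
    return row["population"] / row["land_area_km2"]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


densities = {r["country"]: density(r) for r in rows}
gdp_per_capita = [r["gdp"] / r["population"] for r in rows]
life = [r["life_expectancy"] for r in rows]
r_value = pearson(gdp_per_capita, life)
```

With real data, currency symbols, thousands separators, and percent signs in the source columns would likely need to be stripped before the numeric conversion.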
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is extracted from https://en.wikipedia.org/wiki/List_of_countries_by_population_in_1800.
The United States Census Bureau’s international dataset provides estimates of country populations since 1950 and projections through 2050. Specifically, the dataset includes midyear population figures broken down by age and gender assignment at birth. Additionally, time-series data is provided for attributes including fertility rates, birth rates, death rates, and migration rates.
You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.census_bureau_international.
What countries have the longest life expectancy? In this query, 2016 census information is retrieved by joining the mortality_life_expectancy and country_names_area tables for countries larger than 25,000 km2. Without the size constraint, Monaco is the top result with an average life expectancy of over 89 years!
SELECT
  age.country_name,
  age.life_expectancy,
  size.country_area
FROM (
  SELECT
    country_name,
    life_expectancy
  FROM
    `bigquery-public-data.census_bureau_international.mortality_life_expectancy`
  WHERE
    year = 2016) age
INNER JOIN (
  SELECT
    country_name,
    country_area
  FROM
    `bigquery-public-data.census_bureau_international.country_names_area`
  WHERE
    country_area > 25000) size
ON
  age.country_name = size.country_name
ORDER BY
  2 DESC
/* Limit removed for Data Studio Visualization */
LIMIT
  10
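The query above can be issued from Python. The following is a sketch only: the function names `build_life_expectancy_query` and `run_query` are hypothetical, and actually running the query assumes the `google-cloud-bigquery` package is installed and credentials are configured. The builder itself works offline.

```python
def build_life_expectancy_query(year=2016, min_area_km2=25000, limit=10):
    """Return the SQL text for the life-expectancy ranking.

    Backticks quote the table paths because the project id contains hyphens.
    """
    dataset = "bigquery-public-data.census_bureau_international"
    return f"""
SELECT age.country_name, age.life_expectancy, size.country_area
FROM (
  SELECT country_name, life_expectancy
  FROM `{dataset}.mortality_life_expectancy`
  WHERE year = {int(year)}) age
INNER JOIN (
  SELECT country_name, country_area
  FROM `{dataset}.country_names_area`
  WHERE country_area > {int(min_area_km2)}) size
ON age.country_name = size.country_name
ORDER BY age.life_expectancy DESC
LIMIT {int(limit)}
""".strip()


def run_query(sql):
    """Execute the SQL and return a list of dicts, one per result row."""
    # Imported lazily so that building the query does not require the library.
    from google.cloud import bigquery

    client = bigquery.Client()
    return [dict(row.items()) for row in client.query(sql).result()]


sql = build_life_expectancy_query()
```

Dropping the area filter (for example `min_area_km2=0`) should reproduce the unconstrained ranking that the description says is topped by Monaco.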
Which countries have the largest proportion of their population under 25? Over 40% of the world’s population is under 25 and greater than 50% of the world’s population is under 30! This query retrieves the countries with the largest proportion of young people by joining the age-specific population table with the midyear (total) population table.
SELECT
  age.country_name,
  SUM(age.population) AS under_25,
  pop.midyear_population AS total,
  ROUND((SUM(age.population) / pop.midyear_population) * 100, 2) AS pct_under_25
FROM (
  SELECT
    country_name,
    population,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.midyear_population_agespecific`
  WHERE
    year = 2017
    AND age < 25) age
INNER JOIN (
  SELECT
    midyear_population,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.midyear_population`
  WHERE
    year = 2017) pop
ON
  age.country_code = pop.country_code
GROUP BY
  1,
  3
ORDER BY
  4 DESC /* Remove limit for visualization */
LIMIT
  10
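To make the aggregation in the query above concrete, here is a small offline sketch of the same computation: sum the population below the age cutoff per country, divide by the midyear total, and rank. The tuples, country codes, and the `pct_under` helper are hypothetical stand-ins for the two BigQuery tables.

```python
from collections import defaultdict

# Hypothetical rows shaped like midyear_population_agespecific:
# (country_code, age, population)
age_rows = [
    ("AA", 10, 300), ("AA", 24, 200), ("AA", 25, 100), ("AA", 60, 400),
    ("BB", 5, 100), ("BB", 40, 900),
]

# Hypothetical totals shaped like midyear_population:
# country_code -> midyear_population
totals = {"AA": 1000, "BB": 1000}


def pct_under(age_rows, totals, cutoff=25):
    """Rank countries by the share of population younger than `cutoff`."""
    under = defaultdict(int)
    for code, age, population in age_rows:
        if age < cutoff:
            under[code] += population
    # Keep only codes present on both sides, mirroring the INNER JOIN.
    shares = {
        code: round(under[code] / totals[code] * 100, 2)
        for code in under if code in totals
    }
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


ranking = pct_under(age_rows, totals)
```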
The International Census dataset contains growth information in the form of birth rates, death rates, and migration rates. Net migration is the net number of migrants per 1,000 population, an important component of total population and one that often drives the work of the United Nations Refugee Agency. This query joins the growth rate table with the area table to retrieve 2017 data for countries greater than 500 km2.
SELECT
  growth.country_name,
  growth.net_migration,
  CAST(area.country_area AS INT64) AS country_area
FROM (
  SELECT
    country_name,
    net_migration,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.birth_death_growth_rates`
  WHERE
    year = 2017) growth
INNER JOIN (
  SELECT
    country_area,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.country_names_area`
Historic (none)
United States Census Bureau
Terms of use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
See the GCP Marketplace listing for more details and sample queries: https://console.cloud.google.com/marketplace/details/united-states-census-bureau/international-census-data
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for GDP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains carbon fluxes for the 10 largest countries in the world (here EU27 is treated as a country) using GOSAT and OCO-2 observational constraints for 2017-2019.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for CORONAVIRUS DEATHS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Even though Canada is the second largest country in the world in terms of land area, it ranks 33rd in terms of population. Almost all of Canada’s population is concentrated in a narrow band along the country’s southern edge. Nearly 80% of the total population lives within the 25 major metropolitan areas, which represent only 0.79% of the total area of the country.
https://creativecommons.org/publicdomain/zero/1.0/
Top 1,000 YouTubers in Pakistan, the world's fifth-largest country by population. The data contains each channel's total views, category, number of subscribers, and total number of videos.
# Inspiration
I want to see what Pakistanis are watching.
channel_name: Name of the YouTube channel
subscribers: Total number of subscribers
total_views: Total views across all videos
total_videos: Total number of videos on the channel
category: Category of the YouTube channel (e.g., education, food)
started: Year the channel started
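One way to approach the question of what Pakistanis are watching is to total views per category. The sketch below assumes a CSV export using the column names listed above. The three sample rows are hypothetical, not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the column layout listed above.
sample = """channel_name,subscribers,total_views,total_videos,category,started
Channel A,1000000,250000000,500,education,2015
Channel B,500000,90000000,300,food,2018
Channel C,750000,120000000,200,education,2017
"""

# Sum total_views over all channels in each category.
views_by_category = Counter()
for row in csv.DictReader(io.StringIO(sample)):
    views_by_category[row["category"]] += int(row["total_views"])

# Categories ordered from most to least viewed.
top_categories = views_by_category.most_common()
```

For the real file, `io.StringIO(sample)` would be replaced by an opened file handle.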
https://creativecommons.org/publicdomain/zero/1.0/
The World Bank is an international financial institution that provides loans to countries of the world for capital projects. The World Bank's stated goal is the reduction of poverty. Source: https://en.wikipedia.org/wiki/World_Bank
This dataset contains both national and regional debt statistics captured by over 200 economic indicators. Time series data is available for those indicators from 1970 to 2015 for reporting countries.
For more information, see the World Bank website.
Fork this kernel to get started with this dataset.
https://bigquery.cloud.google.com/dataset/bigquery-public-data:world_bank_intl_debt
https://cloud.google.com/bigquery/public-data/world-bank-international-debt
Citation: The World Bank: International Debt Statistics
Dataset Source: World Bank. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Banner Photo by @till_indeman from Unsplash.
What countries have the largest outstanding debt?
https://cloud.google.com/bigquery/images/outstanding-debt.png
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To create the dataset, the 10 countries with the highest incidence of COVID-19 in the world as of October 22, 2020 (on the eve of the second wave of the pandemic) were identified, and those represented in the Global 500 ranking for 2020 were selected: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 ranking were selected, separately for 2020 and 2019. Arithmetic averages and their change (increase) were calculated for indicators such as profit and profitability of enterprises, their ranking position (competitiveness), asset value and number of employees. The arithmetic means of these indicators across all countries in the sample were then found; they characterize the situation in international entrepreneurship as a whole in the context of the COVID-19 crisis in 2020, on the eve of the second wave of the pandemic. The data are collected in a single Microsoft Excel table. The dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. It is flexible and can be supplemented with data from other countries and newer statistics on the COVID-19 pandemic. Because the dataset stores formulas rather than fixed numbers, adding or changing values in the original table at the beginning of the dataset automatically recalculates most of the subsequent tables and updates the graphs. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating scientific research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data, but also charts that visualize the data. It contains both actual and forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020.
The forecasts are presented in the form of a normal distribution of predicted values and the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship, substituting various predicted morbidity and mortality rates in risk assessment tables and obtaining automatically calculated consequences (changes) on the characteristics of international entrepreneurship. It is also possible to substitute the actual values identified in the process and following the results of the second wave of the pandemic to check the reliability of pre-made forecasts and conduct a plan-fact analysis. The dataset contains not only the numerical values of the initial and predicted values of the set of studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks of a pandemic and COVID-19 crisis for international entrepreneurship.
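The two calculations described above, year-over-year change of an arithmetic mean and a scenario probability read from a normally distributed forecast, can be sketched as follows. All figures, variable names, and the forecast parameters are hypothetical. The workbook's own formulas are not reproduced here.

```python
from statistics import NormalDist, mean

# Hypothetical profit figures (USD millions) for one country's corporations.
profits_2019 = [1200.0, 800.0, 1000.0]
profits_2020 = [900.0, 700.0, 1100.0]

# Arithmetic averages per year and their relative change in percent.
avg_2019 = mean(profits_2019)
avg_2020 = mean(profits_2020)
change_pct = (avg_2020 - avg_2019) / avg_2019 * 100

# Hypothetical forecast: daily new cases modelled as a normal distribution.
forecast = NormalDist(mu=50_000, sigma=10_000)

# Probability that the realized value exceeds a chosen scenario threshold.
p_above_60k = 1 - forecast.cdf(60_000)
```

Substituting an observed value for the threshold after the fact gives a simple plan-fact check: a realized value far in the tail of the forecast distribution suggests the forecast was unreliable.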
This dataset provides supply chain health commodity shipment and pricing data. Specifically, the data set identifies Antiretroviral (ARV) and HIV lab shipments to supported countries. In addition, the data set provides the commodity pricing and associated supply chain expenses necessary to move the commodities to countries for use. The dataset has similar fields to the Global Fund's Price, Quality and Reporting (PQR) data. PEPFAR and the Global Fund represent the two largest procurers of HIV health commodities. This dataset, when analyzed in conjunction with the PQR data, provides a more complete picture of global spending on specific health commodities. The data are particularly valuable for understanding ranges and trends in pricing as well as volumes delivered by country. The US Government believes this data will help stakeholders make better, data-driven decisions. Care should be taken to consider contextual factors when using the database. Conclusions related to costs associated with moving specific line items or products to specific countries and lead times by product/country will not be accurate.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for GOLD RESERVES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Business-critical Data Types: We offer access to robust datasets sourced from over 13M job ads daily. Track companies’ growth, market focus, technological shifts, planned geographic expansion, and more:
- Identify new business opportunities
- Identify and forecast industry & technological trends
- Help identify the jobs, teams, and business units that have the highest impact on corporate goals
- Identify most in-demand skills and qualifications for key positions.
Fresh Datasets: We regularly update our datasets, assuring you access to the latest data and allowing for timely analysis of rapidly evolving markets & dynamic businesses.
Historical Datasets: We maintain at your disposal historical datasets, allowing for comprehensive, reliable, and statistically sound historical analysis, trend identification, and forecasting.
Easy Access and Retrieval: Our job listing datasets are available in industry-standard, convenient JSON and CSV formats. These structured formats make our datasets compatible with machine learning, artificial intelligence training, and similar applications. The historical data retrieval process is quick and reliable thanks to our robust, easy-to-implement API integration.
Datasets for investors: Investment firms and hedge funds use our datasets to better inform their investment decisions by gaining up-to-date, reliable insights into workforce growth, geographic expansion, market focus, technology shifts, and other factors of start-ups and established companies.
Datasets for businesses: Our datasets are used by retailers, manufacturers, real estate agents, and many other types of B2B & B2C businesses to stay ahead of the curve. They can gain insights into the competitive landscape, technology, and product adoption trends as well as power their lead generation processes with data-driven decision-making.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/INSPIRE_Directive_Article13_1d
The British Geological Survey has one of the largest databases in the world on the production and trade of minerals. The dataset contains annual production statistics by mass for more than 70 mineral commodities covering the majority of economically important and internationally-traded minerals, metals and mineral-based materials. For each commodity the annual production statistics are recorded for individual countries, grouped by continent. Import and export statistics are also available for years up to 2002. Maintenance of the database is funded by the Science Budget and output is used by government, private industry and others in support of policy, economic analysis and commercial strategy. As far as possible the production data are compiled from primary, official sources. Quality assurance is maintained by participation in such groups as the International Consultative Group on Non-ferrous Metal Statistics. Individual commodity and country tables are available for sale on request.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Associated with the manuscript titled: Fifty Muslim-majority countries have fewer COVID-19 cases and deaths than the 50 richest non-Muslim countries
The objective of this research was to determine the difference in the total number of COVID-19 cases and deaths between Muslim-majority and non-Muslim countries, and investigate reasons for the disparities.
Methods: The 50 Muslim-majority countries each had a population more than 50.0% Muslim, with an average of 87.5%. The non-Muslim country sample consisted of the 50 countries with the highest GDP, omitting any Muslim-majority countries listed. The non-Muslim countries’ average percentage of Muslims was 4.7%. Data pulled on September 18, 2020 included the percentage of Muslim population per country from World Population Review (15) and GDP per country, population count, and total number of COVID-19 cases and deaths from Worldometers (16). The data set was transferred via an Excel spreadsheet on September 23, 2020 and analyzed. To measure COVID-19’s incidence in the countries, three different Average Treatment Effect (ATE) methods were used to validate the results.
Results published as a preprint at https://doi.org/10.31235/osf.io/84zq5
(15) Muslim Majority Countries 2020 [Internet]. Walnut (CA): World Population Review. 2020- [Cited 2020 Sept 28]. Available from: http://worldpopulationreview.com/country-rankings/muslim-majority-countries
(16) Worldometers.info. Worldometer. Dover (DE): Worldometer; 2020 [cited 2020 Sept 28]. Available from: http://worldometers.info
Xverum’s Point of Interest (POI) Data is a comprehensive dataset containing 230M+ verified locations across 5000 business categories. Our dataset delivers structured geographic data, business attributes, location intelligence, and mapping insights, making it an essential tool for GIS applications, market research, urban planning, and competitive analysis.
With regular updates and continuous POI discovery, Xverum ensures accurate, up-to-date information on businesses, landmarks, retail stores, and more. Delivered in bulk to S3 Bucket and cloud storage, our dataset integrates seamlessly into mapping, geographic information systems, and analytics platforms.
🔥 Key Features:
Extensive POI Coverage: ✅ 230M+ Points of Interest worldwide, covering 5000 business categories. ✅ Includes retail stores, restaurants, corporate offices, landmarks, and service providers.
Geographic & Location Intelligence Data: ✅ Latitude & longitude coordinates for mapping and navigation applications. ✅ Geographic classification, including country, state, city, and postal code. ✅ Business status tracking – Open, temporarily closed, or permanently closed.
Continuous Discovery & Regular Updates: ✅ New POIs continuously added through discovery processes. ✅ Regular updates ensure data accuracy, reflecting new openings and closures.
Rich Business Insights: ✅ Detailed business attributes, including company name, category, and subcategories. ✅ Contact details, including phone number and website (if available). ✅ Consumer review insights, including rating distribution and total number of reviews (additional feature). ✅ Operating hours where available.
Ideal for Mapping & Location Analytics: ✅ Supports geospatial analysis & GIS applications. ✅ Enhances mapping & navigation solutions with structured POI data. ✅ Provides location intelligence for site selection & business expansion strategies.
Bulk Data Delivery (NO API): ✅ Delivered in bulk via S3 Bucket or cloud storage. ✅ Available in structured format (.json) for seamless integration.
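As an illustration of how a bulk .json delivery with coordinates and business status might be consumed, the sketch below filters open POIs inside a bounding box. The field names (`name`, `category`, `latitude`, `longitude`, `status`) and the sample records are assumptions made for this example. The vendor's actual schema is not documented here.

```python
import json

# Hypothetical records; the real delivery schema may differ.
raw = """[
  {"name": "Cafe One", "category": "restaurant",
   "latitude": 52.52, "longitude": 13.40, "status": "open"},
  {"name": "Shop Two", "category": "retail",
   "latitude": 48.85, "longitude": 2.35, "status": "permanently_closed"},
  {"name": "Office Three", "category": "corporate_office",
   "latitude": 52.50, "longitude": 13.45, "status": "open"}
]"""


def in_bbox(poi, south, west, north, east):
    """True when the POI's coordinates fall inside the bounding box."""
    return (south <= poi["latitude"] <= north
            and west <= poi["longitude"] <= east)


pois = json.loads(raw)

# Names of POIs that are open and located inside the chosen area.
open_in_area = [
    p["name"] for p in pois
    if p["status"] == "open"
    and in_bbox(p, south=52.3, west=13.0, north=52.7, east=13.8)
]
```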
🏆Primary Use Cases:
Mapping & Geographic Analysis: 🔹 Power GIS platforms & navigation systems with precise POI data. 🔹 Enhance digital maps with accurate business locations & categories.
Retail Expansion & Market Research: 🔹 Identify key business locations & competitors for market analysis. 🔹 Assess brand presence across different industries & geographies.
Business Intelligence & Competitive Analysis: 🔹 Benchmark competitor locations & regional business density. 🔹 Analyze market trends through POI growth & closure tracking.
Smart City & Urban Planning: 🔹 Support public infrastructure projects with accurate POI data. 🔹 Improve accessibility & zoning decisions for government & businesses.
💡 Why Choose Xverum’s POI Data?
Access Xverum’s 230M+ POI dataset for mapping, geographic analysis, and location intelligence. Request a free sample or contact us to customize your dataset today!
The World Database on Protected Areas (WDPA) is the largest assembly of data on the world's terrestrial and marine protected areas, containing more than 260,000 protected areas as of August 2020, with records covering 245 countries and territories throughout the world.
The WDPA is a joint venture between the United Nations Environment Programme World Conservation Monitoring Centre (UNEP-WCMC) and the International Union for Conservation of Nature (IUCN) World Commission on Protected Areas (WCPA).
Data for the WDPA is collected from international convention secretariats, governments and collaborating NGOs, but the role of custodian is allocated to the Protected Areas Programme of UNEP-WCMC, based in Cambridge, UK, who have hosted the database since its creation in 1981. The WDPA is updated on a monthly basis, and can be downloaded from https://www.protectedplanet.net/en/thematic-areas/wdpa.
Data creation: 2020-08-01
Citation:
IUCN and UNEP-WCMC (2020), The World Database on Protected Areas (WDPA) [https://www.protectedplanet.net/en/search-areas?filters%5Bdb_type%5D%5B%5D=wdpa&geo_type=region], [08/2020], Cambridge, UK: UNEP-WCMC. Available at: www.protectedplanet.net.
Contact points:
Metadata Contact: UN Environment Programme World Conservation Monitoring Centre (UNEP-WCMC)
Responsible Party: UN Environment Programme World Conservation Monitoring Centre (UNEP-WCMC)
Resource Contact: Protected Planet (WDPA) UNEP-WCMC & IUCN
Resource constraints:
PLEASE READ THESE TERMS AND CONDITIONS CAREFULLY. IF YOU DO NOT AGREE TO ANY OF THE TERMS AND CONDITIONS DO NOT DOWNLOAD. BY DOWNLOADING THE WDPA MATERIALS MADE AVAILABLE ON PROTECTEDPLANET.NET YOU ACCEPT AND AGREE TO COMPLY WITH THE TERMS AND CONDITIONS BELOW.
No Commercial Use
Neither (a) the WDPA Materials nor (b) any work derived from or based upon the WDPA Materials (“Derivative Works”) may be put to Commercial Use without the prior written permission of UNEP-WCMC. For the purposes of these Terms and Conditions, “Commercial Use” means a) any use for profit or to generate revenue, or b) any use by an individual or entity operating within or on behalf of or to the benefit of or to assist the activities of any entity other than a not-for-profit organisation. To apply for permission for Commercial Use of the WDPA Materials please send an email to business-support@unep-wcmc.org outlining your needs.
No Sub-licensing or Redistribution of WDPA Data
The WDPA Materials may not be sub-licensed in whole or in part including within Derivative Works without the prior written permission of UNEP-WCMC. You may not redistribute the WDPA Data contained in the WDPA in whole or in part by any means including (but not limited to) electronic formats such as web downloads, through web services, through interactive web maps (including mobile applications) that grant users download access, KML Files or through file transfer protocols. If you know of others who wish to use the WDPA Data please refer them to protectedplanet.net. If you wish to provide a service through which the WDPA Data are downloadable or otherwise made available for redistribution you must contact protectedareas@unep-wcmc.org for permission and further guidance. More information on the WDPA license at https://www.unep-wcmc.org/policies/wdpa-data-licence#data_policy
Map Disclaimer:
The designations employed and the presentation of material on this map do not imply the expression of any opinion whatsoever on the part of the Secretariat of the United Nations concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. Final status of the Abyei area is not yet determined.
Online resources:
Dimension Translator Json file
Dimensions labels jsonstat file
Download World Database on Protected Areas (WDPA) from Protected Planet
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Supply Chain Shipment Pricing Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/e7707c1f-2856-4df6-8d0c-ed1ba8a3cd91 on 12 February 2022.
--- Dataset description provided by original source is as follows ---
This data set provides supply chain health commodity shipment and pricing data. Specifically, the data set identifies Antiretroviral (ARV) and HIV lab shipments to supported countries. In addition, the data set provides the commodity pricing and associated supply chain expenses necessary to move the commodities to countries for use. The dataset has similar fields to the Global Fund's Price, Quality and Reporting (PQR) data. PEPFAR and the Global Fund represent the two largest procurers of HIV health commodities. This dataset, when analyzed in conjunction with the PQR data, provides a more complete picture of global spending on specific health commodities. The data are particularly valuable for understanding ranges and trends in pricing as well as volumes delivered by country. The US Government believes this data will help stakeholders make better, data-driven decisions. Care should be taken to consider contextual factors when using the database. Conclusions related to costs associated with moving specific line items or products to specific countries and lead times by product/country will not be accurate.
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We present the GLOBAL ROADKILL DATA, the largest worldwide compilation of roadkill data on terrestrial vertebrates. We outline the workflow (Fig. 1) to illustrate the sequential steps of the study, in which we merged local-scale survey datasets and opportunistic records into a unified large roadkill dataset comprising 208,570 roadkill records. These records include 2283 species and subspecies from 54 countries across six continents, ranging from 1971 to 2024.
Large roadkill datasets offer the advantage of preventing the collection of redundant data and are valuable resources for both local and macro-scale analyses regarding roadkill rates, road and landscape features associated with roadkill risk, species more vulnerable to road traffic, and populations at risk due to additional mortality. The standardization of data - such as scientific names, projection coordinates, and units - in a user-friendly format makes them readily accessible to a broader scientific and non-scientific community, including NGOs, consultants, public administration officials, and road managers. The open-access approach promotes collaboration among researchers and road practitioners, facilitating the replication of studies, validation of findings, and expansion of previous work. Moreover, researchers can utilize such datasets to develop new hypotheses, conduct meta-analyses, address pressing challenges more efficiently and strengthen the robustness of road ecology research. Ensuring widespread access to roadkill data fosters a more diverse and inclusive research community. This not only provides researchers in emerging economies with more data for analysis, but also cultivates a diverse array of perspectives and insights, promoting the advance of infrastructure ecology.
Methods
Information sources: A core team from different continents performed a systematic literature search in Web of Science and Google Scholar for published peer-reviewed papers and dissertations.
The search used the following terms: “roadkill*” OR “road-kill” OR “road mortality” AND (country) in English, Portuguese, Spanish, French and/or Mandarin. This initiative was also disseminated to the mailing lists associated with transport infrastructure: The CCSG Transport Working Group (WTG), Infrastructure & Ecology Network Europe (IENE) and Latin American & Caribbean Transport Working Group (LACTWG) (Fig. 1). The core team identified 750 scientific papers and dissertations with information on roadkill and contacted the first authors of the publications to request georeferenced locations of roadkill and offer co-authorship of this data paper. Of the 824 authors contacted, 145 agreed to share georeferenced roadkill locations, often involving additional colleagues who contributed to data collection. Since our main goal was to provide open access to data that had never been shared in this format before, data from citizen science projects (e.g., globalroakill.net) that are already available were not included.
Data compilation: A total of 423 co-authors compiled the following information: continent, country, latitude and longitude in WGS 84 decimal degrees of the roadkill, coordinates uncertainty, class, order, family, scientific name of the roadkill, vernacular name, IUCN status, number of roadkill, year, month, and day of the record, identification of the road, type of road, survey type, references, and observers that recorded the roadkill (Supplementary Information Table S1 - description of the fields and Table S2 - reference list). When roadkill data were derived from systematic surveys, the dataset included additional information on road length that was surveyed, latitude and longitude of the road (initial and final part of the road segment), survey period, start year of the survey, final year of the survey, 1st month of the year surveyed, last month of the year surveyed, and frequency of the survey. We consolidated 142 valid datasets into a single dataset.
We complemented these data with OccurenceID (a UUID generated using Java code), basisOfRecord, countryCode, locality using OpenStreetMap’s API (https://www.openstreetmap.org), geodeticDatum, verbatimScientificName, Kingdom, phylum, genus, specificEpithet, infraspecificEpithet, acceptedNameUsage, scientific name authorship, matchType, taxonRank using the Darwin Core Reference Guide (https://dwc.tdwg.org/terms/#dwc:coordinateUncertaintyInMeters) and the link of the associatedReference (URL).
Data standardization - We conducted a clustering analysis on all text fields to identify similar entries with minor variations, such as typos, and corrected them using OpenRefine (http://openrefine.org). We also standardized all date values using OpenRefine. Coordinate uncertainties listed as 0 m were adjusted to either 30 m or 100 m, depending on whether they were recorded after or before 2000, respectively, following the recommendation in the Darwin Core Reference Guide (https://dwc.tdwg.org/terms/#dwc:coordinateUncertaintyInMeters).
Taxonomy - We cross-referenced all species names with the Global Biodiversity Information Facility (GBIF) Backbone Taxonomy using Java and GBIF’s API (https://doi.org/10.15468/39omei). This process aimed to rectify classification errors, include additional fields such as Kingdom, Phylum, and scientific authorship, and gather comprehensive taxonomic information to address any gaps within the datasets. For species not automatically matched (matchType - Table S1), we manually searched for correct synonyms when available.
Species conservation status - Using the species names, we retrieved their conservation status and vernacular names by cross-referencing with the database downloaded from the IUCN Red List of Threatened Species (https://www.iucnredlist.org). Species without a match were categorized as "Not Evaluated".
Data Records
GLOBAL ROADKILL DATA is available at Figshare (ref. 27) https://doi.org/10.6084/m9.figshare.25714233.
The dataset incorporates opportunistic data (collected incidentally, without a dedicated data collection effort) and systematic data (collected through planned, structured, and controlled methods designed to ensure consistency and reliability). In total, it comprises 208,570 roadkill records across 177,428 different locations (Fig. 2). Data were collected from the road network of 54 countries on six continents: Europe (n = 19), Asia (n = 16), South America (n = 7), North America (n = 4), Africa (n = 6) and Oceania (n = 2).
All data are georeferenced in WGS 84 decimal degrees with a maximum uncertainty of 5000 m. Approximately 92% of records have a location uncertainty of 30 m or less, with only 1138 records having location uncertainties ranging from 1000 to 5000 m. Mammals have the highest number of roadkill records (61%), followed by amphibians (21%), reptiles (10%) and birds (8%). The species with the highest number of records were roe deer (Capreolus capreolus, n = 44,268), pool frog (Pelophylax lessonae, n = 11,999) and European fallow deer (Dama dama, n = 7,426).
We collected information on 126 threatened species with a total of 4570 records. Among the threatened species, the giant anteater (Myrmecophaga tridactyla, VULNERABLE) has the highest number of records (n = 1199), followed by the common fire salamander (Salamandra salamandra, VULNERABLE, n = 1043), and European rabbit (Oryctolagus cuniculus, ENDANGERED, n = 440). Records ranged from 1971 to 2024, with 72% of the roadkill recorded since 2013. Over 46% of the records were obtained from systematic surveys, with road length and survey period averaging, respectively, 66 km (min-max: 0.09-855 km) and 780 days (min-max: 1-25,720 days).
Technical Validation
We employed the OpenStreetMap API through Java to detect location inaccuracies and validate whether the geographic coordinates aligned with the specified country.
We calculated the distance of each occurrence to the nearest road using the GRIP global roads database (ref. 28), ensuring that all records were within the defined coordinate uncertainty. We verified whether the survey duration matched the provided initial and final survey dates. We calculated the distance between the provided initial and final road coordinates and cross-checked it with the given road length. We identified and merged duplicate entries within the same dataset (same location, species, and date), aggregating the number of roadkills for each occurrence.
Usage Notes
The GLOBAL ROADKILL DATA is a compilation of roadkill records and was designed to serve as a valuable resource for a wide range of analyses. Nevertheless, to prevent the generation of meaningless results, users should be aware of the following limitations:
- Geographic representation - There is an evident bias in the distribution of records. Data originated predominantly from Europe (60% of records), South America (22%), and North America (12%). Conversely, there is a notable lack of records from Asia (5%), Oceania (1%) and Africa (0.3%). This dataset represents 36% of the initial contacts that provided georeferenced records, which may not necessarily correspond to locations where high-impact roads are present.
- Location accuracy - Insufficient location accuracy (uncertainty ranging from 1000 to 5000 m) was observed for 1% of the data and was associated with various factors, such as survey methods, recording practices, or timing of the survey.
- Sampling effort - This dataset comprises both opportunistic data and records from systematic surveys, with a high variability in survey duration and frequency.
As a result, the use of both opportunistic and systematic surveys may affect the relative abundance of roadkill, making it hard to make sound comparisons among species or areas.
- Detectability and carcass removal bias - Although several studies had a high frequency of road surveys, the duration of carcass persistence on roads may vary with species size and environmental conditions, affecting detectability. Accordingly, several approaches account for survey frequency and target species to estimate more
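Two of the rules described in the roadkill methods above (adjusting coordinate uncertainties recorded as 0 m, and merging duplicate entries within a dataset) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Java workflow: the function names, the dictionary keys (`dataset`, `lat`, `lon`, `species`, `date`, `count`) and the sample values are hypothetical, and the treatment of records from the year 2000 itself is a guess, because the text only says "after or before 2000".

```python
def adjust_uncertainty(uncertainty_m, year):
    """Replace a 0 m coordinate uncertainty with 30 m or 100 m.

    Assumption: years later than 2000 get 30 m; 2000 and earlier get 100 m.
    Non-zero uncertainties are returned unchanged.
    """
    if uncertainty_m != 0:
        return uncertainty_m
    return 30 if year > 2000 else 100


def merge_duplicates(records):
    """Merge records sharing dataset, location, species and date.

    The roadkill counts of duplicates are summed; the first record's
    other fields are kept. Key names are hypothetical.
    """
    merged = {}
    for rec in records:
        key = (rec["dataset"], rec["lat"], rec["lon"], rec["species"], rec["date"])
        if key in merged:
            merged[key]["count"] += rec["count"]
        else:
            merged[key] = dict(rec)  # copy so the input is not mutated
    return list(merged.values())


# Illustrative values only, not taken from the dataset.
sample = [
    {"dataset": "A", "lat": 40.1, "lon": -8.4, "species": "Capreolus capreolus",
     "date": "2015-05-01", "count": 1},
    {"dataset": "A", "lat": 40.1, "lon": -8.4, "species": "Capreolus capreolus",
     "date": "2015-05-01", "count": 2},
    {"dataset": "A", "lat": 40.2, "lon": -8.4, "species": "Dama dama",
     "date": "2015-05-01", "count": 1},
]
result = merge_duplicates(sample)
```

Matching on exact coordinate equality is the simplest reading of "same location"; a real pipeline might instead compare positions within the stated coordinate uncertainty.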
Xverum’s Location Data is a highly structured dataset of 230M+ verified locations, covering businesses, landmarks, and points of interest (POI) across 5000 industry categories. With accurate geographic coordinates, business metadata, and mapping attributes, our dataset is optimized for GIS applications, real estate analysis, market research, and urban planning.
With continuous discovery of new locations and regular updates, Xverum ensures that your location intelligence solutions have the most current data on business openings, closures, and POI movements. Delivered in bulk via S3 Bucket or cloud storage, our dataset integrates seamlessly into mapping, navigation, and geographic analysis platforms.
🔥 Key Features:
Comprehensive Location Coverage: ✅ 230M+ locations worldwide, spanning 5000 business categories. ✅ Includes retail stores, corporate offices, landmarks, service providers & more.
Geographic & Mapping Data: ✅ Latitude & longitude coordinates for precise location tracking. ✅ Country, state, city, and postal code classifications. ✅ Business status tracking – Open, temporarily closed, permanently closed.
Continuous Discovery & Regular Updates: ✅ New locations added frequently to ensure fresh data. ✅ Updated business metadata, reflecting new openings, closures & status changes.
Detailed Business & Address Metadata: ✅ Company name, category, & subcategories for industry segmentation. ✅ Business contact details, including phone number & website (if available). ✅ Operating hours for businesses with scheduling data.
Optimized for Mapping & Location Intelligence: ✅ Supports GIS, real estate analysis & smart city planning. ✅ Enhances navigation & mapping solutions with structured geographic data. ✅ Helps businesses optimize site selection & expansion strategies.
Bulk Data Delivery (NO API): ✅ Delivered via S3 Bucket or cloud storage for full dataset access. ✅ Available in a structured format (.json) for easy integration.
🏆 Primary Use Cases:
Location Intelligence & Mapping: 🔹 Power GIS platforms & digital maps with structured geographic data. 🔹 Integrate accurate location insights into real estate, logistics & market analysis.
Retail Expansion & Business Planning: 🔹 Identify high-traffic locations & competitors for strategic site selection. 🔹 Analyze brand distribution & presence across different industries & regions.
Market Research & Competitive Analysis: 🔹 Track openings, closures & business density to assess industry trends. 🔹 Benchmark competitors based on location data & geographic presence.
Smart City & Infrastructure Planning: 🔹 Optimize city development projects with accurate POI & business location data. 🔹 Support public & commercial zoning strategies with real-world business insights.
💡 Why Choose Xverum’s Location Data?
- 230M+ Verified Locations – One of the largest & most structured location datasets available.
- Global Coverage – Spanning 249+ countries, with diverse business & industry data.
- Regular Updates – Continuous discovery & refresh cycles ensure data accuracy.
- Comprehensive Geographic & Business Metadata – Coordinates, addresses, industry categories & more.
- Bulk Dataset Delivery (NO API) – Seamless access via S3 Bucket or cloud storage.
- 100% Compliant – Ethically sourced & legally compliant.
Access Xverum’s 230M+ Location Data for mapping, geographic analysis & business intelligence. Request a free sample or contact us to customize your dataset today!
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This comprehensive dataset provides a wealth of information about all countries worldwide, covering a wide range of indicators and attributes. It encompasses demographic statistics, economic indicators, environmental factors, healthcare metrics, education statistics, and much more. With every country represented, this dataset offers a complete global perspective on various aspects of nations, enabling in-depth analyses and cross-country comparisons.
Key Features
Country: Name of the country.
Density (P/Km2): Population density measured in persons per square kilometer.
Abbreviation: Abbreviation or code representing the country.
Agricultural Land (%): Percentage of land area used for agricultural purposes.
Land Area (Km2): Total land area of the country in square kilometers.
Armed Forces Size: Size of the armed forces in the country.
Birth Rate: Number of births per 1,000 population per year.
Calling Code: International calling code for the country.
Capital/Major City: Name of the capital or major city.
CO2 Emissions: Carbon dioxide emissions in tons.
CPI: Consumer Price Index, a measure of inflation and purchasing power.
CPI Change (%): Percentage change in the Consumer Price Index compared to the previous year.
Currency_Code: Currency code used in the country.
Fertility Rate: Average number of children born to a woman during her lifetime.
Forested Area (%): Percentage of land area covered by forests.
Gasoline_Price: Price of gasoline per liter in local currency.
GDP: Gross Domestic Product, the total value of goods and services produced in the country.
Gross Primary Education Enrollment (%): Gross enrollment ratio for primary education.
Gross Tertiary Education Enrollment (%): Gross enrollment ratio for tertiary education.
Infant Mortality: Number of deaths per 1,000 live births before reaching one year of age.
Largest City: Name of the country's largest city.
Life Expectancy: Average number of years a newborn is expected to live.
Maternal Mortality Ratio: Number of maternal deaths per 100,000 live births.
Minimum Wage: Minimum wage level in local currency.
Official Language: Official language(s) spoken in the country.
Out of Pocket Health Expenditure (%): Percentage of total health expenditure paid out-of-pocket by individuals.
Physicians per Thousand: Number of physicians per thousand people.
Population: Total population of the country.
Population: Labor Force Participation (%): Percentage of the population that is part of the labor force.
Tax Revenue (%): Tax revenue as a percentage of GDP.
Total Tax Rate: Overall tax burden as a percentage of commercial profits.
Unemployment Rate: Percentage of the labor force that is unemployed.
Urban Population: Percentage of the population living in urban areas.
Latitude: Latitude coordinate of the country's location.
Longitude: Longitude coordinate of the country's location.
Potential Use Cases
Analyze population density and land area to study spatial distribution patterns.
Investigate the relationship between agricultural land and food security.
Examine carbon dioxide emissions and their impact on climate change.
Explore correlations between economic indicators such as GDP and various socio-economic factors.
Investigate educational enrollment rates and their implications for human capital development.
Analyze healthcare metrics such as infant mortality and life expectancy to assess overall well-being.
Study labor market dynamics through indicators such as labor force participation and unemployment rates.
Investigate the role of taxation and its impact on economic development.
Explore urbanization trends and their social and environmental consequences.
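As a small illustration of the first use case above, the listed population density can be cross-checked against population divided by land area. The sketch below is hedged: it assumes the fields `Country`, `Population`, `Land Area (Km2)` and `Density (P/Km2)` have already been parsed into numbers and stored under those names as dictionary keys, and the function name, the 5% tolerance and the sample rows are hypothetical rather than taken from the dataset.

```python
def density_mismatches(rows, tolerance=0.05):
    """Return countries whose listed density deviates from
    population / land area by more than `tolerance` (relative)."""
    flagged = []
    for row in rows:
        land = row["Land Area (Km2)"]
        if not land:
            continue  # skip rows with missing or zero land area
        computed = row["Population"] / land
        listed = row["Density (P/Km2)"]
        if abs(computed - listed) > tolerance * computed:
            flagged.append(row["Country"])
    return flagged


# Illustrative rows with made-up values.
rows = [
    {"Country": "Exampleland A", "Population": 1000,
     "Land Area (Km2)": 50, "Density (P/Km2)": 20},
    {"Country": "Exampleland B", "Population": 1000,
     "Land Area (Km2)": 100, "Density (P/Km2)": 25},
]
flagged = density_mismatches(rows)
```

Rows flagged this way may point to rounding, differing reference years, or parsing problems rather than real errors, so they are best treated as candidates for manual review.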