30 datasets found

census-bureau-international
kaggle.com
zip
Updated May 6, 2020
Cite
Google BigQuery (2020). census-bureau-international [Dataset]. https://www.kaggle.com/bigquery/census-bureau-international
Available download formats: zip (0 bytes)
Dataset updated
May 6, 2020
Dataset provided by
BigQuery (https://cloud.google.com/bigquery)
Authors
Google BigQuery
Description
Context

The United States Census Bureau’s international dataset provides estimates of country populations since 1950 and projections through 2050. Specifically, the dataset includes midyear population figures broken down by age and gender assignment at birth. Additionally, time-series data is provided for attributes including fertility rates, birth rates, death rates, and migration rates.

Querying BigQuery tables

You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.census_bureau_international.

Sample Query 1

What countries have the longest life expectancy? In this query, 2016 census information is retrieved by joining the mortality_life_expectancy and country_names_area tables for countries larger than 25,000 km2. Without the size constraint, Monaco is the top result with an average life expectancy of over 89 years!

#standardSQL
SELECT
  age.country_name,
  age.life_expectancy,
  size.country_area
FROM (
  SELECT country_name, life_expectancy
  FROM `bigquery-public-data.census_bureau_international.mortality_life_expectancy`
  WHERE year = 2016) age
INNER JOIN (
  SELECT country_name, country_area
  FROM `bigquery-public-data.census_bureau_international.country_names_area`
  WHERE country_area > 25000) size
ON age.country_name = size.country_name
ORDER BY 2 DESC
/* Limit removed for Data Studio Visualization */
LIMIT 10
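
As a rough sketch of how Sample Query 1 might be run with the BigQuery Python client library: the helper names `build_life_expectancy_query` and `run_query` are hypothetical, and the sketch assumes the `google-cloud-bigquery` package is installed and credentials are already configured in the environment.

```python
def build_life_expectancy_query(year=2016, min_area_km2=25000, limit=10):
    """Return the Sample Query 1 SQL with year, area threshold and limit filled in."""
    dataset = "bigquery-public-data.census_bureau_international"
    return f"""
    SELECT age.country_name, age.life_expectancy, size.country_area
    FROM (
      SELECT country_name, life_expectancy
      FROM `{dataset}.mortality_life_expectancy`
      WHERE year = {int(year)}) age
    INNER JOIN (
      SELECT country_name, country_area
      FROM `{dataset}.country_names_area`
      WHERE country_area > {int(min_area_km2)}) size
    ON age.country_name = size.country_name
    ORDER BY 2 DESC
    LIMIT {int(limit)}
    """


def run_query(sql):
    """Run the SQL and return the rows as plain dictionaries."""
    # Imported lazily so the query builder also works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumption: project and credentials are configured
    return [dict(row.items()) for row in client.query(sql).result()]


sql = build_life_expectancy_query()
print(sql)
```

Calling `run_query(sql)` is left out of the sketch because it needs network access and a billing project.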

Sample Query 2

Which countries have the largest proportion of their population under 25? Over 40% of the world’s population is under 25 and greater than 50% of the world’s population is under 30! This query retrieves the countries with the largest proportion of young people by joining the age-specific population table with the midyear (total) population table.

#standardSQL
SELECT
  age.country_name,
  SUM(age.population) AS under_25,
  pop.midyear_population AS total,
  ROUND((SUM(age.population) / pop.midyear_population) * 100, 2) AS pct_under_25
FROM (
  SELECT country_name, population, country_code
  FROM `bigquery-public-data.census_bureau_international.midyear_population_agespecific`
  WHERE year = 2017 AND age < 25) age
INNER JOIN (
  SELECT midyear_population, country_code
  FROM `bigquery-public-data.census_bureau_international.midyear_population`
  WHERE year = 2017) pop
ON age.country_code = pop.country_code
GROUP BY 1, 3
ORDER BY 4 DESC
/* Remove limit for visualization */
LIMIT 10
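
The arithmetic behind `pct_under_25` can be illustrated without BigQuery. The sketch below uses made-up rows (the country codes `AA` and `BB` and all population figures are hypothetical) and a hypothetical helper `pct_under_25` that sums the under-25 age rows per country and divides by the total population, as the query does.

```python
from collections import defaultdict


def pct_under_25(age_rows, totals):
    """age_rows: (country_code, age, population); totals: {country_code: midyear_population}."""
    under_25 = defaultdict(int)
    for code, age, population in age_rows:
        if age < 25:  # same filter as the WHERE clause in the query
            under_25[code] += population
    # Mirror the inner join: keep only codes present in both inputs.
    return {
        code: round(under_25[code] / totals[code] * 100, 2)
        for code in under_25
        if code in totals
    }


# Hypothetical data, for illustration only.
age_rows = [("AA", 10, 300), ("AA", 24, 200), ("AA", 40, 500),
            ("BB", 5, 100), ("BB", 60, 900)]
totals = {"AA": 1000, "BB": 1000}
result = pct_under_25(age_rows, totals)
print(sorted(result.items(), key=lambda item: item[1], reverse=True))
```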

Sample Query 3

The International Census dataset contains growth information in the form of birth rates, death rates, and migration rates. Net migration is the net number of migrants per 1,000 population, an important component of total population and one that often drives the work of the United Nations Refugee Agency. This query joins the growth rate table with the area table to retrieve 2017 data for countries greater than 500 km2.

SELECT
  growth.country_name,
  growth.net_migration,
  CAST(area.country_area AS INT64) AS country_area
FROM (
  SELECT country_name, net_migration, country_code
  FROM `bigquery-public-data.census_bureau_international.birth_death_growth_rates`
  WHERE year = 2017) growth
INNER JOIN (
  SELECT country_area, country_code
  FROM `bigquery-public-data.census_bureau_international.country_names_area`

Update frequency

Historic (none)

Dataset source

United States Census Bureau

Terms of use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

See the GCP Marketplace listing for more details and sample queries: https://console.cloud.google.com/marketplace/details/united-states-census-bureau/international-census-data
Geonames - All Cities with a population > 1000
public.opendatasoft.com
data.smartidf.services
+2 more
csv, excel, geojson +1
Updated Mar 10, 2024
Cite
(2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/
Available download formats: csv, json, geojson, excel
Dataset updated
Mar 10, 2024
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
All cities with a population > 1000 or seats of administrative divisions (ca. 80,000).

Sources and Contributions

Sources: GeoNames aggregates over a hundred different data sources.
Ambassadors: GeoNames Ambassadors help in many countries.
Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.

Enrichment: add country name
Bank Rankings by Total Assets
kaggle.com
Updated Dec 6, 2022
Cite
The Devastator (2022). Bank Rankings by Total Assets [Dataset]. https://www.kaggle.com/datasets/thedevastator/global-banking-rankings-by-total-assets-2017-12
Dataset updated
Dec 6, 2022
Dataset provided by
Kaggle
Authors
The Devastator
Description
Bank Rankings by Total Assets

Tracking the Financial Performance of the Top Banks

By Arthur Keen [source]

About this dataset

This dataset contains the top 100 global banks ranked by total assets on December 31, 2017. It lists each bank's rank, country, balance sheet, and total assets (in billions of US dollars), which makes it useful for anyone researching the current status of some of the world's leading financial organizations. From billion-dollar mega-banks such as JP Morgan Chase to small, local savings & loan institutions like BancorpSouth, this overview lets researchers and analysts better understand who holds power in the world economy today.

How to use the dataset

This dataset contains the rank and total asset information of the top 100 global banks as of December 31, 2017. It is a useful resource for researchers who wish to study how key financial institutions' asset information relate to each other across countries.

Using this dataset is relatively straightforward. It consists of four columns: rank (the order in which each bank appears in the list), country (the country in which the bank is located), total assets in US billions (the total value expressed in US dollars), and balance sheet information for each bank.

In order to make full use of this dataset, one should analyse it by creating comparison grids based on different factors such as region, size or ownership structures. This can provide an interesting insight into how financial markets are structured within different economies and allow researchers to better understand some banking sector dynamics that are particularly relevant for certain countries or regions. Additionally, one can compare any two banks side-by-side using their respective balance sheets or distribution plot graphs based on size or concentration metrics by leverage or other financial ratios as well.

Overall, this dataset can be put into practice through data visualization, making it a useful reference point for trend analysis and forecasting focused on banking activities worldwide.

Research Ideas

Analyzing the differences in total assets across countries. By comparing and contrasting data, patterns could be found that give insight into the factors driving differences in banks’ assets between different markets.

Using predictive models to identify which banks are more likely to perform better based on their balance sheet data, such as by predicting future profits or cashflows of said banks.

Leveraging the information on holdings and investments of “top-ranked” banks as a guide for personal investment decisions or to inform the investment strategies of large financial institutions or hedge funds.

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: Dataset copyright by authors

You are free to:
- Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt - remix, transform, and build upon the material for any purpose, even commercially.

You must:
- Give appropriate credit - Provide a link to the license, and indicate if changes were made.
- ShareAlike - You must distribute your contributions under the same license as the original.
- Keep intact - all notices that refer to this license, including copyright notices.

Columns

File: top50banks2017-03-31.csv

| Column name | Description |
|:----------------------|:------------------------------------------------------------------------|
| rank | The rank of the bank globally based on total assets. (Integer) |
| country | The country where the bank is located. (String) |
| total_assets_us_b | The total assets of a bank expressed in billions of US dollars. (Float) |
| balance_sheet | A snapshot of banking activities for a specific date. (Date) |

File: top100banks2017-12-31.csv

| Column name | Description |
|:----------------------|:--------------------------------------------...
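
As a minimal sketch of the first research idea (comparing total assets across countries), the columns listed for top50banks2017-03-31.csv can be read with Python's standard `csv` module. The sample rows below are invented for illustration, the helper `total_assets_by_country` is hypothetical, and the sketch assumes the file has a header row with exactly the column names shown in the table.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows using the documented column names; not real bank data.
SAMPLE = """rank,country,total_assets_us_b,balance_sheet
1,Country A,4000.5,2017-03-31
2,Country B,3500.0,2017-03-31
3,Country A,2500.25,2017-03-31
"""


def total_assets_by_country(csv_file):
    """Sum total_assets_us_b (billions of US dollars) per country."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        totals[row["country"]] += float(row["total_assets_us_b"])
    return dict(totals)


totals = total_assets_by_country(io.StringIO(SAMPLE))
for country, assets in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{country}: {assets:.2f} billion USD")
```

With the real file, `io.StringIO(SAMPLE)` would be replaced by an opened file handle, for example `open("top50banks2017-03-31.csv", newline="")`.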
Large Scale International Boundaries
catalog.data.gov
geodata.state.gov
Updated Jun 13, 2025
Cite
U.S. Department of State (Point of Contact) (2025). Large Scale International Boundaries [Dataset]. https://catalog.data.gov/dataset/large-scale-international-boundaries
Dataset updated
Jun 13, 2025
Dataset provided by
United States Department of State (http://state.gov/)
Description
Overview

The Office of the Geographer and Global Issues at the U.S. Department of State produces the Large Scale International Boundaries (LSIB) dataset. The current edition is version 11.4 (published 24 February 2025). The 11.4 release contains updated boundary lines and data refinements designed to extend the functionality of the dataset. These data and generalized derivatives are the only international boundary lines approved for U.S. Government use. The contents of this dataset reflect U.S. Government policy on international boundary alignment, political recognition, and dispute status. They do not necessarily reflect de facto limits of control.

National Geospatial Data Asset

This dataset is a National Geospatial Data Asset (NGDAID 194) managed by the Department of State. It is a part of the International Boundaries Theme created by the Federal Geographic Data Committee.

Dataset Source Details

Sources for these data include treaties, relevant maps, and data from boundary commissions, as well as national mapping agencies. Where available and applicable, the dataset incorporates information from courts, tribunals, and international arbitrations. The research and recovery process includes analysis of satellite imagery and elevation data. Due to the limitations of source materials and processing techniques, most lines are within 100 meters of their true position on the ground.

Cartographic Visualization

The LSIB is a geospatial dataset that, when used for cartographic purposes, requires additional styling. The LSIB download package contains example style files for commonly used software applications. The attribute table also contains embedded information to guide the cartographic representation. Additional discussion of these considerations can be found in the Use of Core Attributes in Cartographic Visualization section below.

Additional cartographic information pertaining to the depiction and description of international boundaries or areas of special sovereignty can be found in Guidance Bulletins published by the Office of the Geographer and Global Issues: https://data.geodata.state.gov/guidance/index.html

Contact

Direct inquiries to internationalboundaries@state.gov. Direct download: https://data.geodata.state.gov/LSIB.zip

Attribute Structure

The dataset uses the following attributes divided into two categories:

ATTRIBUTE NAME | ATTRIBUTE STATUS
CC1 | Core
CC1_GENC3 | Extension
CC1_WPID | Extension
COUNTRY1 | Core
CC2 | Core
CC2_GENC3 | Extension
CC2_WPID | Extension
COUNTRY2 | Core
RANK | Core
LABEL | Core
STATUS | Core
NOTES | Core
LSIB_ID | Extension
ANTECIDS | Extension
PREVIDS | Extension
PARENTID | Extension
PARENTSEG | Extension

These attributes have external data sources that update separately from the LSIB:

ATTRIBUTE NAME | ATTRIBUTE STATUS
CC1 | GENC
CC1_GENC3 | GENC
CC1_WPID | World Polygons
COUNTRY1 | DoS Lists
CC2 | GENC
CC2_GENC3 | GENC
CC2_WPID | World Polygons
COUNTRY2 | DoS Lists
LSIB_ID | BASE
ANTECIDS | BASE
PREVIDS | BASE
PARENTID | BASE
PARENTSEG | BASE

The core attributes listed above describe the boundary lines contained within the LSIB dataset. Removal of core attributes from the dataset will change the meaning of the lines. An attribute status of “Extension” represents a field containing data interoperability information. Other attributes not listed above include “FID”, “Shape_length” and “Shape.” These are components of the shapefile format and do not form an intrinsic part of the LSIB.

Core Attributes

The eight core attributes listed above contain unique information which, when combined with the line geometry, comprise the LSIB dataset. These Core Attributes are further divided into Country Code and Name Fields and Descriptive Fields.

Country Code and Country Name Fields

“CC1” and “CC2” fields are machine readable fields that contain political entity codes. These are two-character codes derived from the Geopolitical Entities, Names, and Codes Standard (GENC), Edition 3 Update 18. “CC1_GENC3” and “CC2_GENC3” fields contain the corresponding three-character GENC codes and are extension attributes discussed below. The codes “Q2” or “QX2” denote a line in the LSIB representing a boundary associated with areas not contained within the GENC standard.

The “COUNTRY1” and “COUNTRY2” fields contain the names of corresponding political entities. These fields contain names approved by the U.S. Board on Geographic Names (BGN) as incorporated in the "Independent States in the World" and "Dependencies and Areas of Special Sovereignty" lists maintained by the Department of State. To ensure maximum compatibility, names are presented without diacritics and certain names are rendered using common cartographic abbreviations. Names for lines associated with the code "Q2" are descriptive and not necessarily BGN-approved. Names rendered in all CAPITAL LETTERS denote independent states. Names rendered in normal text represent dependencies, areas of special sovereignty, or are otherwise presented for the convenience of the user.

Descriptive Fields

The following text fields are a part of the core attributes of the LSIB dataset and do not update from external sources. They provide additional information about each of the lines and are as follows:

ATTRIBUTE NAME | CONTAINS NULLS
RANK | No
STATUS | No
LABEL | Yes
NOTES | Yes

Neither the "RANK" nor "STATUS" fields contain null values; the "LABEL" and "NOTES" fields do. The "RANK" field is a numeric expression of the "STATUS" field. Combined with the line geometry, these fields encode the views of the United States Government on the political status of the boundary line.

ATTRIBUTE NAME | VALUE | VALUE | VALUE
RANK | 1 | 2 | 3
STATUS | International Boundary | Other Line of International Separation | Special Line

A value of “1” in the “RANK” field corresponds to an "International Boundary" value in the “STATUS” field. Values of “2” and “3” correspond to “Other Line of International Separation” and “Special Line,” respectively.

The “LABEL” field contains required text to describe the line segment on all finished cartographic products, including but not limited to print and interactive maps.

The “NOTES” field contains an explanation of special circumstances modifying the lines. This information can pertain to the origins of the boundary lines, limitations regarding the purpose of the lines, or the original source of the line.

Use of Core Attributes in Cartographic Visualization

Several of the Core Attributes provide information required for the proper cartographic representation of the LSIB dataset. The cartographic usage of the LSIB requires a visual differentiation between the three categories of boundary lines. Specifically, this differentiation must be between: International Boundaries (Rank 1); Other Lines of International Separation (Rank 2); and Special Lines (Rank 3). Rank 1 lines must be the most visually prominent. Rank 2 lines must be less visually prominent than Rank 1 lines. Rank 3 lines must be shown in a manner visually subordinate to Ranks 1 and 2. Where scale permits, Rank 2 and 3 lines must be labeled in accordance with the “Label” field. Data marked with a Rank 2 or 3 designation does not necessarily correspond to a disputed boundary. Please consult the style files in the download package for examples of this depiction.

The requirement to incorporate the contents of the "LABEL" field on cartographic products is scale dependent. If a label is legible at the scale of a given static product, a proper use of this dataset would encourage the application of that label. Using the contents of the "COUNTRY1" and "COUNTRY2" fields in the generation of a line segment label is not required. The "STATUS" field contains the preferred description for the three LSIB line types when they are incorporated into a map legend but is otherwise not to be used for labeling. Use of the “CC1,” “CC1_GENC3,” “CC2,” “CC2_GENC3,” “RANK,” or “NOTES” fields for cartographic labeling purposes is prohibited.

Extension Attributes

Certain elements of the attributes within the LSIB dataset extend data functionality to make the data more interoperable or to provide clearer linkages to other datasets.

The fields “CC1_GENC3” and “CC2_GENC3” contain the corresponding three-character GENC code to the “CC1” and “CC2” attributes. The code “QX2” is the three-character counterpart of the code “Q2,” which denotes a line in the LSIB representing a boundary associated with a geographic area not contained within the GENC standard.

To allow for linkage between individual lines in the LSIB and World Polygons dataset, the “CC1_WPID” and “CC2_WPID” fields contain a Universally Unique Identifier (UUID), version 4, which provides a stable description of each geographic entity in a boundary pair relationship. Each UUID corresponds to a geographic entity listed in the World Polygons dataset.

Five additional fields in the LSIB expand on the UUID concept and either describe features that have changed across space and time or indicate relationships between previous versions of the feature.

The “LSIB_ID” attribute is a UUID value that defines a specific instance of a feature. Any change to the feature in a lineset requires a new “LSIB_ID.”

The “ANTECIDS,” or antecedent ID, is a UUID that references line geometries from which a given line is descended in time. It is used when there is a feature that is entirely new, not when there is a new version of a previous feature. This is generally used to reference countries that have dissolved.

The “PREVIDS,” or Previous ID, is a UUID field that contains old versions of a line. This is an additive field that houses all Previous IDs. A new version of a feature is defined by any change to the
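
The RANK and STATUS pairing described under Descriptive Fields lends itself to a simple consistency check. The sketch below is one possible way to express it in Python; the function names `check_rank_status` and `requires_label` and the two sample records are hypothetical, and a record is assumed to be a plain dictionary keyed by the LSIB attribute names.

```python
# RANK is described as a numeric expression of STATUS.
RANK_TO_STATUS = {
    1: "International Boundary",
    2: "Other Line of International Separation",
    3: "Special Line",
}


def check_rank_status(record):
    """Return a list of problems found in one attribute record (empty if consistent)."""
    problems = []
    rank = record.get("RANK")
    status = record.get("STATUS")
    if rank not in RANK_TO_STATUS:
        problems.append(f"unexpected RANK value: {rank!r}")
    elif status != RANK_TO_STATUS[rank]:
        problems.append(
            f"RANK {rank} expects STATUS {RANK_TO_STATUS[rank]!r}, found {status!r}"
        )
    return problems


def requires_label(record):
    """Rank 2 and 3 lines must be labeled from LABEL where scale permits."""
    return record.get("RANK") in (2, 3)


# Hypothetical records, for illustration only.
good = {"RANK": 1, "STATUS": "International Boundary", "LABEL": None}
bad = {"RANK": 2, "STATUS": "International Boundary", "LABEL": "Hypothetical label"}
print(check_rank_status(good), check_rank_status(bad))
```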
Population Density Around the Globe
globalmidwiveshub.org
covid19.esriuk.com
+5 more
Updated May 20, 2020
Cite
Direct Relief (2020). Population Density Around the Globe [Dataset]. https://www.globalmidwiveshub.org/maps/b71f7fd5dbc8486b8b37362726a11452
Dataset updated
May 20, 2020
Dataset authored and provided by
Direct Relief
Description
Census data reveal that population density varies noticeably from area to area. Small area census data do a better job depicting where the crowded neighborhoods are. In this map, the yellow areas of highest density range from 30,000 to 150,000 persons per square kilometer. In those areas, if the people were spread out evenly across the area, there would be just 4 to 9 meters between them. Very high density areas exceed 7,000 persons per square kilometer. High density areas exceed 5,200 persons per square kilometer. The last categories break at 3,330 persons per square kilometer, and 1,500 persons per square kilometer.

This dataset is comprised of multiple sources. All of the demographic data are from Michael Bauer Research with the exception of the following countries:

Australia: Esri Australia and MapData Services
Canada: Esri Canada and Environics
France: Esri France
Germany: Esri Germany and Nexiga
India: Esri India and Indicus
Japan: Esri Japan
South Korea: Esri Korea and OPENmate
Spain: Esri España and AIS
United States: Esri Demographics
GOLD RESERVES by Country Dataset
tradingeconomics.com
csv, excel, json, xml
Updated May 26, 2014
Cite
TRADING ECONOMICS (2014). GOLD RESERVES by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/gold-reserves
Available download formats: excel, xml, csv, json
Dataset updated
May 26, 2014
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
World
Description
This dataset provides values for GOLD RESERVES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
GDP by Country in ASIA
tradingeconomics.com
csv, excel, json, xml
Updated Jun 20, 2025
Cite
TRADING ECONOMICS (2025). GDP by Country in ASIA [Dataset]. https://tradingeconomics.com/country-list/gdp?continent=asia
Available download formats: xml, json, csv, excel
Dataset updated
Jun 20, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
Asia
Description
This dataset provides values for GDP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Country Codes
public.opendatasoft.com
data.smartidf.services
+6 more
csv, excel, geojson +1
Updated Aug 25, 2015
Cite
(2015). Country Codes [Dataset]. https://public.opendatasoft.com/explore/dataset/countries-codes/
Available download formats: geojson, json, excel, csv
Dataset updated
Aug 25, 2015
License
Public domain (https://en.wikipedia.org/wiki/Public_domain)
Description
Country codes: ISO 2, ISO 3, UN, LANG, LABEL (EN, FR, SP)
Dataset for: "Big data suggest strong constraints of linguistic similarity...
data.niaid.nih.gov
Updated Jan 24, 2020
Cite
Job Schepens (2020). Dataset for: "Big data suggest strong constraints of linguistic similarity on adult language learning" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_2863532
Dataset updated
Jan 24, 2020
Dataset provided by
T. Florian Jaeger
Roeland van Hout
Job Schepens
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is adapted from raw data with fully anonymized results on the State Examination of Dutch as a Second Language. This exam is officially administered by the Board of Tests and Examinations (College voor Toetsen en Examens, or CvTE). See cvte.nl/about-cvte. The Board of Tests and Examinations is mandated by the Dutch government.

The article accompanying the dataset:

Schepens, Job, Roeland van Hout, and T. Florian Jaeger. “Big Data Suggest Strong Constraints of Linguistic Similarity on Adult Language Learning.” Cognition 194 (January 1, 2020): 104056. https://doi.org/10.1016/j.cognition.2019.104056.

Every row in the dataset represents the first official testing score of a unique learner. The columns contain the following information as based on questionnaires filled in at the time of the exam:

"L1" - The first language of the learner
"C" - The country of birth
"L1L2" - The combination of first and best additional language besides Dutch
"L2" - The best additional language besides Dutch
"AaA" - Age at Arrival in the Netherlands in years (starting date of residence)
"LoR" - Length of residence in the Netherlands in years
"Edu.day" - Duration of daily education (1 low, 2 middle, 3 high, 4 very high). From 1992 until 2006, learners' education was measured by means of a side-by-side matrix question in a learner's questionnaire. Learners were asked to mark which type of education they had had (elementary, secondary, or tertiary schooling) by filling in for how many years they had been enrolled, in which country, and whether or not they had graduated. Based on this information we were able to estimate how many years learners had education on a daily basis from six years of age onwards. Since 2006, the question about learners' education has been altered and it is asked directly how many years learners have had formal education on a daily basis from six years of age onwards. Possible answering categories are: 1) 0 thru 5 years; 2) 6 thru 10 years; 3) 11 thru 15 years; 4) 16 years or more. The answers have been merged into the categorical answer.
"Sex" - Gender
"Family" - Language Family
"ISO639.3" - Language ID code according to Ethnologue
"Enroll" - Proportion of school-aged youth enrolled in secondary education according to the World Bank. The World Bank reports on education data in a wide number of countries around the world on a regular basis. We took the gross enrollment rate in secondary schooling per country in the year the learner arrived in the Netherlands as an indicator for a country's educational accessibility at the time learners left their country of origin.
"STEX_speaking_score" - The STEX test score for speaking proficiency.
"Dissimilarity_morphological" - Morphological similarity
"Dissimilarity_lexical" - Lexical similarity
"Dissimilarity_phonological_new_features" - Phonological similarity (in terms of new features)
"Dissimilarity_phonological_new_categories" - Phonological similarity (in terms of new sounds)

A few rows of the data:

"L1","C","L1L2","L2","AaA","LoR","Edu.day","Sex","Family","ISO639.3","Enroll","STEX_speaking_score","Dissimilarity_morphological","Dissimilarity_lexical","Dissimilarity_phonological_new_features","Dissimilarity_phonological_new_categories"
"English","UnitedStates","EnglishMonolingual","Monolingual",34,0,4,"Female","Indo-European","eng ",94,541,0.0094,0.083191,11,19
"English","UnitedStates","EnglishGerman","German",25,16,3,"Female","Indo-European","eng ",94,603,0.0094,0.083191,11,19
"English","UnitedStates","EnglishFrench","French",32,3,4,"Male","Indo-European","eng ",94,562,0.0094,0.083191,11,19
"English","UnitedStates","EnglishSpanish","Spanish",27,8,4,"Male","Indo-European","eng ",94,537,0.0094,0.083191,11,19
"English","UnitedStates","EnglishMonolingual","Monolingual",47,5,3,"Male","Indo-European","eng ",94,505,0.0094,0.083191,11,19
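
The sample rows shown for the State Examination dataset can be read with Python's standard `csv` module. The sketch below embeds those five rows, averages the speaking scores, and translates the "Edu.day" code using the four answer categories given in the column description; the names `EDU_DAY_YEARS` and `load_rows` are hypothetical, and note that the "ISO639.3" values in the sample carry a trailing space that is stripped here.

```python
import csv
import io
from statistics import mean

# Header and rows copied from the sample shown in the dataset description.
SAMPLE = '''"L1","C","L1L2","L2","AaA","LoR","Edu.day","Sex","Family","ISO639.3","Enroll","STEX_speaking_score","Dissimilarity_morphological","Dissimilarity_lexical","Dissimilarity_phonological_new_features","Dissimilarity_phonological_new_categories"
"English","UnitedStates","EnglishMonolingual","Monolingual",34,0,4,"Female","Indo-European","eng ",94,541,0.0094,0.083191,11,19
"English","UnitedStates","EnglishGerman","German",25,16,3,"Female","Indo-European","eng ",94,603,0.0094,0.083191,11,19
"English","UnitedStates","EnglishFrench","French",32,3,4,"Male","Indo-European","eng ",94,562,0.0094,0.083191,11,19
"English","UnitedStates","EnglishSpanish","Spanish",27,8,4,"Male","Indo-European","eng ",94,537,0.0094,0.083191,11,19
"English","UnitedStates","EnglishMonolingual","Monolingual",47,5,3,"Male","Indo-European","eng ",94,505,0.0094,0.083191,11,19
'''

# Answer categories for "Edu.day" as listed in the column description.
EDU_DAY_YEARS = {1: "0 thru 5 years", 2: "6 thru 10 years",
                 3: "11 thru 15 years", 4: "16 years or more"}


def load_rows(text):
    """Parse the CSV text and convert the fields used below to typed values."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["STEX_speaking_score"] = int(row["STEX_speaking_score"])
        row["Edu.day"] = int(row["Edu.day"])
        row["ISO639.3"] = row["ISO639.3"].strip()  # sample values end in a space
        rows.append(row)
    return rows


rows = load_rows(SAMPLE)
mean_score = mean(row["STEX_speaking_score"] for row in rows)
print(f"{len(rows)} learners, mean speaking score {mean_score}")
for row in rows:
    print(row["L2"], row["STEX_speaking_score"], EDU_DAY_YEARS[row["Edu.day"]])
```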
Political stability by country, around the world | TheGlobalEconomy.com
theglobaleconomy.com
csv, excel, xml
Updated Apr 7, 2016
Cite
Globalen LLC (2016). Political stability by country, around the world | TheGlobalEconomy.com [Dataset]. www.theglobaleconomy.com/rankings/wb_political_stability/
Available download formats: xml, excel, csv
Dataset updated
Apr 7, 2016
Dataset authored and provided by
Globalen LLC
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 31, 1996 - Dec 31, 2023
Area covered
World
Description
The average for 2023 based on 193 countries was -0.07 points. The highest value was in Liechtenstein: 1.61 points and the lowest value was in Syria: -2.75 points. The indicator is available from 1996 to 2023. Below is a chart for all countries where data are available.
Data for: Public preferences on policies for climate, local pollution, and...
researchdata.se
datacatalogue.cessda.eu
+1 more
Updated Feb 11, 2025
Cite
Richard T. Carson; Jiajun Lu; Emily A. Khossravi; Gunnar Köhlin; Erik Sterner; Thomas Sterner; Dale Whittington (2025). Data for: Public preferences on policies for climate, local pollution, and health - a survey in seven large Global South countries [Dataset]. http://doi.org/10.5878/jy7v-5k80
Available download formats: (214435), (3045), (274772)
Unique identifier
https://doi.org/10.5878/jy7v-5k80
Dataset updated
Feb 11, 2025
Dataset provided by
University of Gothenburg
Authors
Richard T. Carson; Jiajun Lu; Emily A. Khossravi; Gunnar Köhlin; Erik Sterner; Thomas Sterner; Dale Whittington
Time period covered
Feb 2, 2022 - May 7, 2023
Area covered
Colombia, Kenya, Tanzania, India, Chile, Nigeria, South Africa, Viet Nam
Description
The current dataset is a subset of a large data collection based on a purpose-built survey conducted in seven middle-income countries in the Global South: Chile, Colombia, India, Kenya, Nigeria, Tanzania, South Africa and Vietnam. The variables collected in the present dataset aim at understanding public preferences, which is critical to any effort to reduce greenhouse gas emissions. There are many studies of public preferences regarding climate change in the Global North. However, survey work in low and middle-income countries is limited. Survey work that facilitates cross-country comparisons without using the major omnibus surveys is relatively rare.
We designed the Environment for Development (EfD) Seven-country Global South Climate Survey (the EfD Survey) which collected information on respondents’ knowledge about climate change, the information sources that respondents rely on, and opinions on climate policy. The EfD survey contains a battery of well-known climate knowledge questions and questions concerning the attention to and degree of trust in various sources for climate information. Respondents faced several ranking tasks using a best-worst elicitation format. This approach offers greater robustness to cultural differences in how questions are answered than the Likert-scale questions commonly asked in omnibus surveys. We examine: (a) priorities for spending in thirteen policy areas including climate and COVID-19, (b) how respiratory diseases due to air pollution rank relative to six other health problems, (c) agreement with ten statements characterizing various aspects of climate policies, and (d) prioritization of uses for carbon tax revenue. The company YouGov collected data for the EfD Survey in 2023 from 8400 respondents, 1200 in each country. It supplements an earlier survey wave (administered a year earlier) that focused on COVID-19. Respondents were drawn from YouGov’s online panels. During the COVID-19 pandemic almost all surveys were conducted online. This has advantages and disadvantages. Online survey administration reduces costs and data collection times and allows for experimental designs assigning different survey stimuli. With substantial incentive payments, high response rates within the sampling frame are achievable and such incentivized respondents are hopefully motivated to carefully answer the questions posed. The main disadvantage is that the sampling frame is comprised of the internet-enabled portion of the population in each country (e.g., with computers, mobile phones, and tablets). This sample systematically underrepresents those with lower incomes and living in rural areas. 
This large segment of the population is, however, of considerable interest in its own right due to its exposure to online media and outsized influence on public opinion. The data includes respondents’ preferences for climate change mitigation policies and competing policy issues like health. The data also includes questions such as how respondents think revenues from carbon taxes should be used. The outcomes provide important information for policymakers to understand, evaluate, and shape national climate policies. It is worth noting that the data from Tanzania is only present in Wave 1 and that the data from Chile is only present in Wave 2.
Scimago Country Rankings
scimagojr.com
hgxjs.org
xlsx
Updated Jul 1, 2017
Cite
Scimago Lab (2017). Scimago Country Rankings [Dataset]. https://www.scimagojr.com/countryrank.php
Explore at:
Available download formats: xlsx
Dataset updated
Jul 1, 2017
Dataset authored and provided by
Scimago Lab
Description
Country scientific indicators developed from the information contained in the Scopus® database (Elsevier B.V.). These indicators can be used to assess and analyze scientific domains. Country rankings may be compared or analysed separately. Indicators offered for each country: H Index, Documents, Citations, Citation per Document and Citable Documents.
MGD: Music Genre Dataset
zenodo.org
data.niaid.nih.gov
zip
Updated May 28, 2021
Cite
Gabriel P. Oliveira; Mariana O. Silva; Danilo B. Seufitelli; Anisio Lacerda; Mirella M. Moro (2021). MGD: Music Genre Dataset [Dataset]. http://doi.org/10.5281/zenodo.4778563
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.4778563
Dataset updated
May 28, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Gabriel P. Oliveira; Mariana O. Silva; Danilo B. Seufitelli; Anisio Lacerda; Mirella M. Moro
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
MGD: Music Genre Dataset

Over recent years, the world has seen a dramatic change in the way people consume music, moving from physical records to streaming services. Since 2017, such services have become the main source of revenue within the global recorded music market.
Therefore, this dataset is built using data from Spotify. It provides a weekly chart of the 200 most streamed songs for each country and territory in which the service is present, as well as an aggregated global chart.

Considering that countries behave differently when it comes to musical tastes, we use chart data from global and regional markets from January 2017 to December 2019, considering eight of the top 10 music markets according to IFPI: United States (1st), Japan (2nd), United Kingdom (3rd), Germany (4th), France (5th), Canada (8th), Australia (9th), and Brazil (10th).

We also provide information about the hit songs and artists present in the charts, such as all collaborating artists within a song (since the charts only provide the main ones) and their respective genres, which is the core of this work. MGD also provides data about musical collaboration, as we build collaboration networks based on artist partnerships in hit songs. Therefore, this dataset contains:

Genre Networks: Success-based genre collaboration networks

Genre Mapping: Genre mapping from Spotify genres to super-genres

Artist Networks: Success-based artist collaboration networks

Artists: Some artist data

Hit Songs: Hit Song data and features

Charts: Enhanced data from Spotify Weekly Top 200 Charts

This dataset was originally built for a conference paper at ISMIR 2020. If you make use of the dataset, please also cite the following paper:

Gabriel P. Oliveira, Mariana O. Silva, Danilo B. Seufitelli, Anisio Lacerda, and Mirella M. Moro. Detecting Collaboration Profiles in Success-based Music Genre Networks. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR 2020), 2020.

@inproceedings{ismir/OliveiraSSLM20, title = {Detecting Collaboration Profiles in Success-based Music Genre Networks}, author = {Gabriel P. Oliveira and Mariana O. Silva and Danilo B. Seufitelli and Anisio Lacerda and Mirella M. Moro}, booktitle = {21st International Society for Music Information Retrieval Conference}, pages = {726--732}, year = {2020} }
Data from: A large synthetic dataset for machine learning applications in...
zenodo.org
csv, json, png, zip
Updated Mar 25, 2025
Cite
Marc Gillioz; Guillaume Dubuis; Philippe Jacquod (2025). A large synthetic dataset for machine learning applications in power transmission grids [Dataset]. http://doi.org/10.5281/zenodo.13378476
Explore at:
Available download formats: zip, png, csv, json
Unique identifier
https://doi.org/10.5281/zenodo.13378476
Dataset updated
Mar 25, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Marc Gillioz; Guillaume Dubuis; Philippe Jacquod
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
With the ongoing energy transition, power grids are evolving fast. They operate more and more often close to their technical limit, under more and more volatile conditions. Fast, essentially real-time computational approaches to evaluate their operational safety, stability and reliability are therefore highly desirable. Machine Learning methods have been advocated to solve this challenge; however, they are heavy consumers of training and testing data, while historical operational data for real-world power grids are hard if not impossible to access.

This dataset contains long time series for production, consumption, and line flows, amounting to 20 years of data with a time resolution of one hour, for several thousands of loads and several hundreds of generators of various types representing the ultra-high-voltage transmission grid of continental Europe. The synthetic time series have been statistically validated against real-world data.

Data generation algorithm

The algorithm is described in a Nature Scientific Data paper. It relies on the PanTaGruEl model of the European transmission network -- the admittance of its lines as well as the location, type and capacity of its power generators -- and aggregated data gathered from the ENTSO-E transparency platform, such as power consumption aggregated at the national level.

Network

The network information is encoded in the file europe_network.json. It is given in PowerModels format, which is itself derived from MatPower and compatible with PandaPower. The network features 7822 power lines and 553 transformers connecting 4097 buses, to which are attached 815 generators of various types.

Time series

The time series forming the core of this dataset are given in CSV format. Each CSV file is a table with 8736 rows, one for each hourly time step of a 364-day year. All years are truncated to exactly 52 weeks of 7 days, and start on a Monday (the load profiles are typically different during weekdays and weekends). The number of columns depends on the type of table: there are 4097 columns in load files, 815 for generators, and 8375 for lines (including transformers). Each column is described by a header corresponding to the element identifier in the network file. All values are given in per-unit, both in the model file and in the tables, i.e. they are multiples of a base unit taken to be 100 MW.
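Because all values are stored in per-unit with a base of 100 MW, a conversion step is often useful before plotting or reporting. The following is a minimal sketch of that conversion; the helper name per_unit_to_mw and the small stand-in table are illustrative assumptions, not part of the dataset or its code.

```python
import pandas as pd

BASE_MW = 100.0  # base unit stated in the dataset description


def per_unit_to_mw(table: pd.DataFrame, base_mw: float = BASE_MW) -> pd.DataFrame:
    """Convert a per-unit table (hypothetical helper) to megawatts."""
    return table * base_mw


# Tiny stand-in for a real table such as loads_2016_1.csv:
# two hourly rows, two columns named after made-up element identifiers.
sample = pd.DataFrame({"1": [0.5, 1.25], "2": [0.0, 3.0]})
sample_mw = per_unit_to_mw(sample)
print(sample_mw)
```

With a real file, the same function could be applied to the result of pd.read_csv.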

There are 20 tables of each type, labeled with a reference year (2016 to 2020) and an index (1 to 4), zipped into archive files arranged by year. This amounts to a total of 20 years of synthetic data. When using load, generator, and line profiles together, it is important to use the same label: for instance, the files loads_2020_1.csv, gens_2020_1.csv, and lines_2020_1.csv represent the same year of the dataset, whereas gens_2020_2.csv is unrelated (it actually shares some features, such as nuclear profiles, but it is based on a dispatch with distinct loads).
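One way to avoid mixing labels is to derive the three file names from a single (year, index) pair. The sketch below assumes only the naming pattern visible in the examples above (loads_YYYY_N.csv, gens_YYYY_N.csv, lines_YYYY_N.csv); the function name table_files is hypothetical.

```python
def table_files(year: int, index: int) -> dict:
    """Return matching load, generator and line file names for one label.

    Hypothetical helper: it only encodes the naming pattern shown in the
    description and checks the stated ranges (2016 to 2020, 1 to 4).
    """
    if not 2016 <= year <= 2020:
        raise ValueError("reference year must be between 2016 and 2020")
    if not 1 <= index <= 4:
        raise ValueError("index must be between 1 and 4")
    label = f"{year}_{index}"
    return {kind: f"{kind}_{label}.csv" for kind in ("loads", "gens", "lines")}


# All 20 labels of the dataset: 5 reference years times 4 indices.
all_labels = [(year, index) for year in range(2016, 2021) for index in range(1, 5)]
print(table_files(2020, 1))
```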

Usage

The time series can be used without a reference to the network file, simply using all or a selection of columns of the CSV files, depending on the needs. We show below how to select series from a particular country, or how to aggregate hourly time steps into days or weeks. These examples use Python and the data analysis library pandas, but other frameworks can be used as well (Matlab, Julia). Since all the yearly time series are periodic, it is always possible to define a coherent time window modulo the length of the series.
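The remark about periodicity can be illustrated with a window that wraps around the end of the year. This is a sketch under the assumption that rows are ordered by hourly time step; the function periodic_window and the synthetic demo table are hypothetical and stand in for a real table read with pd.read_csv.

```python
import numpy as np
import pandas as pd

HOURS_PER_YEAR = 24 * 364  # 8736 hourly rows per table


def periodic_window(table: pd.DataFrame, start: int, length: int) -> pd.DataFrame:
    """Select `length` consecutive hours starting at row `start`,
    wrapping around the end of the table (hypothetical helper)."""
    positions = (start + np.arange(length)) % len(table)
    return table.iloc[positions].reset_index(drop=True)


# Synthetic stand-in: one column whose value equals the hour of the year.
demo = pd.DataFrame({"load": np.arange(HOURS_PER_YEAR, dtype=float)})

# A 48-hour window that starts on the last day and continues into the first.
window = periodic_window(demo, start=HOURS_PER_YEAR - 24, length=48)
print(window.iloc[[0, 23, 24, 47]])
```

The modulo on the row positions is what makes the window coherent across the year boundary.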

Selecting a particular country

This example illustrates how to select generation data for Switzerland in Python. This can be done without parsing the network file, but using instead gens_by_country.csv, which contains a list of all generators for any country in the network. We start by importing the pandas library, and read the column of the file corresponding to Switzerland (country code CH):

import pandas as pd
CH_gens = pd.read_csv('gens_by_country.csv', usecols=['CH'], dtype=str)

The object created in this way is a DataFrame with some null values (not all countries have the same number of generators). It can be turned into a list with:

CH_gens_list = CH_gens.dropna().squeeze().to_list()

Finally, we can import all the time series of Swiss generators from a given data table with

pd.read_csv('gens_2016_1.csv', usecols=CH_gens_list)

The same procedure can be applied to loads using the list contained in the file loads_by_country.csv.

Averaging over time

This second example shows how to change the time resolution of the series. Suppose that we are interested in all the loads from a given table, which are given by default with a one-hour resolution:

hourly_loads = pd.read_csv('loads_2018_3.csv')

To get a daily average of the loads, we can use:

daily_loads = hourly_loads.groupby([t // 24 for t in range(24 * 364)]).mean()

This results in series of length 364. To average further over entire weeks and get series of length 52, we use:

weekly_loads = hourly_loads.groupby([t // (24 * 7) for t in range(24 * 364)]).mean()

Source code

The code used to generate the dataset is freely available at https://github.com/GeeeHesso/PowerData. It consists of two packages and several documentation notebooks. The first package, written in Python, provides functions to handle the data and to generate synthetic series based on historical data. The second package, written in Julia, is used to perform the optimal power flow. The documentation in the form of Jupyter notebooks contains numerous examples of how to use both packages. The entire workflow used to create this dataset is also provided, starting from raw ENTSO-E data files and ending with the synthetic dataset given in the repository.

Funding

This work was supported by the Cyber-Defence Campus of armasuisse and by an internal research grant of the Engineering and Architecture domain of HES-SO.
‘Netflix "Top 10" TV Shows and Films’ analyzed by Analyst-2
analyst-2.ai
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com), ‘Netflix "Top 10" TV Shows and Films’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-netflix-top-10-tv-shows-and-films-9146/f663e96b/?iid=011-677&v=presentation
Explore at:
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Netflix "Top 10" TV Shows and Films’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/dhruvildave/netflix-top-10-tv-shows-and-films on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Every Tuesday, Netflix publishes four global Top 10 lists for films and TV: Film (English), TV (English), Film (Non-English), and TV (Non-English). These lists rank titles based on weekly hours viewed: the total number of hours that members around the world watched each title from Monday to Sunday of the previous week.

Each season of a series and each film is considered on their own, so you might see both Stranger Things seasons 2 and 3 in the Top 10. Because titles sometimes move in and out of the Top 10, there is also the total number of weeks that a season of a series or film has spent on the list.

Netflix also publishes Top 10 lists for nearly 100 countries and territories (the same locations where there are Top 10 rows on Netflix). Country lists are also ranked based on hours viewed but don’t show country-level viewing directly.

Finally, Netflix provides a list of the Top 10 most popular Netflix films and TV (branded Netflix in any country) in each of the four categories based on the hours that each title was viewed during its first 28 days.

--- Original source retains full ownership of the source dataset ---
Ranking of happiest countries worldwide 2024, by score
statista.com
Updated Jun 10, 2025
Cite
Statista (2025). Ranking of happiest countries worldwide 2024, by score [Dataset]. https://www.statista.com/statistics/1225047/ranking-of-happiest-countries-worldwide-by-score/
Explore at:
Dataset updated
Jun 10, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
World
Description
Finland was ranked the happiest country in the world, according to the World Happiness Report from 2025. The Nordic country scored 7.74 on a scale from 0 to 10. Two other Nordic countries, Denmark and Iceland, followed in second and third place, respectively. The World Happiness Report is a landmark survey of the state of global happiness that ranks countries by how happy their citizens perceive themselves to be.

Criticism

The index has received criticism from different perspectives. Some argue that it is impossible to measure general happiness in a country. Others argue that the index places too much emphasis on material well-being as well as freedom from oppression. As a result, the Happy Planet Index was introduced, which takes life expectancy, experienced well-being, inequality of outcomes, and ecological footprint into account. Here, Costa Rica was ranked as the happiest country in the world.

Afghanistan is the least happy country

Nevertheless, most people agree that high levels of poverty, lack of access to food and water, as well as a prevalence of conflict are factors hindering public happiness. Hence, it comes as no surprise that Afghanistan was ranked as the least happy country in the world in 2024. The South Asian country is ridden by poverty and undernourishment, and topped the Global Terrorism Index in 2024.
Data from: Global Roadkill Data: a dataset on terrestrial vertebrate...
figshare.com
pdf
Updated Apr 3, 2025
+ more versions
Cite
Clara Grilo; Tomé Neves; Jennifer Bates; Aliza le Roux; Pablo Medrano‐Vizcaíno; Mattia Quaranta; Inês Silva; KYLIE SOANES; Yun Wang; Sergio Damián Abate; Fernanda Delborgo Abra; Stuart Aldaz Cedeño; Pedro Rodrigues de Alencar; Mariana Fernada Peres de Almeida; Mario Henrique Alves; Paloma Alves; André Ambrozio de Assis; Rob Ament; Richard Andrášik; Edison Araguillin; Danielle Rodrigues de Araújo; Alexis Araujo-Quintero; Jesús Arca-Rubio; Morteza Arianejad; Carlos Armas; Erin Arnold; Fernando Ascensão; Badrul Azhar; Seung-Yun Baek (2025). Global Roadkill Data: a dataset on terrestrial vertebrate mortality caused by collision with vehicles [Dataset]. http://doi.org/10.6084/m9.figshare.25714233.v5
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.6084/m9.figshare.25714233.v5
Dataset updated
Apr 3, 2025
Dataset provided by
Figshare (http://figshare.com/)
Authors
Clara Grilo; Tomé Neves; Jennifer Bates; Aliza le Roux; Pablo Medrano‐Vizcaíno; Mattia Quaranta; Inês Silva; KYLIE SOANES; Yun Wang; Sergio Damián Abate; Fernanda Delborgo Abra; Stuart Aldaz Cedeño; Pedro Rodrigues de Alencar; Mariana Fernada Peres de Almeida; Mario Henrique Alves; Paloma Alves; André Ambrozio de Assis; Rob Ament; Richard Andrášik; Edison Araguillin; Danielle Rodrigues de Araújo; Alexis Araujo-Quintero; Jesús Arca-Rubio; Morteza Arianejad; Carlos Armas; Erin Arnold; Fernando Ascensão; Badrul Azhar; Seung-Yun Baek
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We present the GLOBAL ROADKILL DATA, the largest worldwide compilation of roadkill data on terrestrial vertebrates. We outline the workflow (Fig. 1) to illustrate the sequential steps of the study, in which we merged local-scale survey datasets and opportunistic records into a unified large roadkill dataset comprising 208,570 roadkill records. These records include 2283 species and subspecies from 54 countries across six continents, ranging from 1971 to 2024.

Large roadkill datasets offer the advantage of preventing the collection of redundant data and are valuable resources for both local and macro-scale analyses regarding roadkill rates, road and landscape features associated with roadkill risk, species more vulnerable to road traffic, and populations at risk due to additional mortality. The standardization of data, such as scientific names, projection coordinates, and units, in a user-friendly format makes them readily accessible to a broader scientific and non-scientific community, including NGOs, consultants, public administration officials, and road managers. The open-access approach promotes collaboration among researchers and road practitioners, facilitating the replication of studies, validation of findings, and expansion of previous work. Moreover, researchers can utilize such datasets to develop new hypotheses, conduct meta-analyses, address pressing challenges more efficiently and strengthen the robustness of road ecology research. Ensuring widespread access to roadkill data fosters a more diverse and inclusive research community. This not only provides researchers in emerging economies with more data for analysis, but also cultivates a diverse array of perspectives and insights, promoting the advance of infrastructure ecology.

Methods

Information sources: A core team from different continents performed a systematic literature search in Web of Science and Google Scholar for published peer-reviewed papers and dissertations.
The search used the following terms: “roadkill*” OR “road-kill” OR “road mortality” AND (country), in English, Portuguese, Spanish, French and/or Mandarin. This initiative was also disseminated to the mailing lists associated with transport infrastructure: The CCSG Transport Working Group (WTG), Infrastructure & Ecology Network Europe (IENE) and Latin American & Caribbean Transport Working Group (LACTWG) (Fig. 1). The core team identified 750 scientific papers and dissertations with information on roadkill and contacted the first authors of the publications to request georeferenced locations of roadkill and offer co-authorship to this data paper. Of the 824 authors contacted, 145 agreed to share georeferenced roadkill locations, often involving additional colleagues who contributed to data collection. Since our main goal was to provide open access to data that had never been shared in this format before, data from citizen science projects (e.g., globalroakill.net) that are already available were not included.

Data compilation: A total of 423 co-authors compiled the following information: continent, country, latitude and longitude in WGS 84 decimal degrees of the roadkill, coordinates uncertainty, class, order, family, scientific name of the roadkill, vernacular name, IUCN status, number of roadkill, year, month, and day of the record, identification of the road, type of road, survey type, references, and observers that recorded the roadkill (Supplementary Information Table S1 - description of the fields and Table S2 - reference list). When roadkill data were derived from systematic surveys, the dataset included additional information on road length that was surveyed, latitude and longitude of the road (initial and final part of the road segment), survey period, start year of the survey, final year of the survey, 1st month of the year surveyed, last month of the year surveyed, and frequency of the survey. We consolidated 142 valid datasets into a single dataset.
We complemented this data with OccurenceID (a UUID generated using Java code), basisOfRecord, countryCode, locality using OpenStreetMap’s API (https://www.openstreetmap.org), geodeticDatum, verbatimScientificName, Kingdom, phylum, genus, specificEpithet, infraspecificEpithet, acceptedNameUsage, scientific name authorship, matchType, taxonRank using Darwin Core Reference Guide (https://dwc.tdwg.org/terms/#dwc:coordinateUncertaintyInMeters) and link of the associatedReference (URL).

Data standardization - We conducted a clustering analysis on all text fields to identify similar entries with minor variations, such as typos, and corrected them using OpenRefine (http://openrefine.org). We also standardized all date values using OpenRefine. Coordinate uncertainties listed as 0 m were adjusted to either 30 m or 100 m, depending on whether they were recorded after or before 2000, respectively, following the recommendation in the Darwin Core Reference Guide (https://dwc.tdwg.org/terms/#dwc:coordinateUncertaintyInMeters).

Taxonomy - We cross-referenced all species names with the Global Biodiversity Information Facility (GBIF) Backbone Taxonomy using Java and GBIF’s API (https://doi.org/10.15468/39omei). This process aimed to rectify classification errors, include additional fields such as Kingdom, Phylum, and scientific authorship, and gather comprehensive taxonomic information to address any gaps within the datasets. For species not automatically matched (matchType - Table S1), we manually searched for correct synonyms when available.

Species conservation status - Using the species names, we retrieved their conservation status and also vernacular names by cross-referencing with the database downloaded from the IUCN Red List of Threatened Species (https://www.iucnredlist.org). Species without a match were categorized as "Not Evaluated".

Data Records

GLOBAL ROADKILL DATA is available at Figshare [27] https://doi.org/10.6084/m9.figshare.25714233.
The dataset incorporates opportunistic (collected incidentally without data collection efforts) and systematic data (collected through planned, structured, and controlled methods designed to ensure consistency and reliability). In total, it comprises 208,570 roadkill records across 177,428 different locations (Fig. 2). Data were collected from the road network of 54 countries from 6 continents: Europe (n = 19), Asia (n = 16), South America (n = 7), North America (n = 4), Africa (n = 6) and Oceania (n = 2).

(Figure 2 goes here)

All data are georeferenced in WGS 84 decimal degrees with a maximum uncertainty of 5000 m. Approximately 92% of records have a location uncertainty of 30 m or less, with only 1138 records having location uncertainties ranging from 1000 to 5000 m. Mammals have the highest number of roadkill records (61%), followed by amphibians (21%), reptiles (10%) and birds (8%). The species with the highest number of records were roe deer (Capreolus capreolus, n = 44,268), pool frog (Pelophylax lessonae, n = 11,999) and European fallow deer (Dama dama, n = 7,426).

We collected information on 126 threatened species with a total of 4570 records. Among the threatened species, the giant anteater (Myrmecophaga tridactyla, VULNERABLE) has the highest number of records (n = 1199), followed by the common fire salamander (Salamandra salamandra, VULNERABLE, n = 1043), and the European rabbit (Oryctolagus cuniculus, ENDANGERED, n = 440). Records range from 1971 to 2024, with 72% of the roadkill recorded since 2013. Over 46% of the records were obtained from systematic surveys, with road length and survey period averaging, respectively, 66 km (min-max: 0.09-855 km) and 780 days (1-25,720 days).

Technical Validation

We employed the OpenStreetMap API through Java to detect location inaccuracies, and validate whether the geographic coordinates aligned with the specified country.
We calculated the distance of each occurrence to the nearest road using the GRIP global roads database [28], ensuring that all records were within the defined coordinate uncertainty. We verified if the survey duration matched the provided initial and final survey dates. We calculated the distance between the provided initial and final road coordinates and cross-checked it with the given road length. We identified and merged duplicate entries within the same dataset (same location, species, and date), aggregating the number of roadkills for each occurrence.

Usage Notes

The GLOBAL ROADKILL DATA is a compilation of roadkill records and was designed to serve as a valuable resource for a wide range of analyses. Nevertheless, to prevent the generation of meaningless results, users should be aware of the following limitations:

- Geographic representation - There is an evident bias in the distribution of records. Data originated predominantly from Europe (60% of records), South America (22%), and North America (12%). Conversely, there is a notable lack of records from Asia (5%), Oceania (1%) and Africa (0.3%). This dataset represents 36% of the initial contacts that provided geo-referenced records, which may not necessarily correspond to locations where high-impact roads are present.
- Location accuracy - Insufficient location accuracy was observed for 1% of the data (ranging from 1000 to 5000 m), which was associated with various factors, such as survey methods, recording practices, or timing of the survey.
- Sampling effort - This dataset comprised both opportunistic data and records from systematic surveys, with a high variability in survey duration and frequency.
As a result, the use of both opportunistic and systematic surveys may affect the relative abundance of roadkill, making it hard to make sound comparisons among species or areas.
- Detectability and carcass removal bias - Although several studies had a high frequency of road surveys, the duration of carcass persistence on roads may vary with species size and environmental conditions, affecting detectability. Accordingly, several approaches account for survey frequency and target species to estimate more
The global gender gap index 2025
statista.com
Updated Jun 24, 2025
Cite
Statista (2025). The global gender gap index 2025 [Dataset]. https://www.statista.com/statistics/244387/the-global-gender-gap-index/
Explore at:
Dataset updated
Jun 24, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2025
Area covered
Worldwide
Description
The global gender gap index benchmarks national gender gaps on economic, political, education, and health-based criteria. In 2025, the country offering the most gender-equal conditions was Iceland, with a score of 0.93. Overall, the Nordic countries make up 3 of the 5 most gender-equal countries in the world. The Nordic countries are known for their high levels of gender equality, including high female employment rates and evenly divided parental leave.

Sudan is the second-least gender equal country

Pakistan is found on the other end of the scale, ranked as the least gender-equal country in the world. Conditions for civilians in Sudan have worsened significantly since a civil war broke out in April 2023. Girls and women in particular are suffering and have become victims of sexual violence. Moreover, nearly 9 million people are estimated to be at acute risk of famine.

The Middle East and North Africa has the largest gender gap

Looking at the different world regions, the Middle East and North Africa has the largest gender gap as of 2023, just ahead of South Asia. Moreover, it is estimated that it will take another 152 years before the gender gap in the Middle East and North Africa is closed. On the other hand, Europe has the lowest gender gap in the world.
MANUFACTURING PMI by Country Dataset
tradingeconomics.com
csv, excel, json, xml
Updated Jan 2, 2014
+ more versions
Cite
TRADING ECONOMICS (2014). MANUFACTURING PMI by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/manufacturing-pmi
Explore at:
Available download formats: xml, json, csv, excel
Dataset updated
Jan 2, 2014
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
World
Description
This dataset provides values for MANUFACTURING PMI reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
GDP PER CAPITA PPP by Country in EUROPE
tradingeconomics.com
csv, excel, json, xml
Updated May 28, 2017
+ more versions
Cite
TRADING ECONOMICS (2017). GDP PER CAPITA PPP by Country in EUROPE [Dataset]. https://tradingeconomics.com/country-list/gdp-per-capita-ppp?continent=europe
Explore at:
Available download formats: excel, json, csv, xml
Dataset updated
May 28, 2017
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
Europe
Description
This dataset provides values for GDP PER CAPITA PPP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

Cite

Google BigQuery (2020). census-bureau-international [Dataset]. https://www.kaggle.com/bigquery/census-bureau-international

census-bureau-international

World population estimates 1950 through 2050

Explore at:

Available download formats: zip (0 bytes)

Dataset updated

May 6, 2020

Dataset provided by

BigQuery (https://cloud.google.com/bigquery)

Authors

Google BigQuery

Description

Context

The United States Census Bureau’s international dataset provides estimates of country populations since 1950 and projections through 2050. Specifically, the dataset includes midyear population figures broken down by age and gender assignment at birth. Additionally, time-series data is provided for attributes including fertility rates, birth rates, death rates, and migration rates.

Querying BigQuery tables

You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.census_bureau_international.

Sample Query 1

What countries have the longest life expectancy? In this query, 2016 census information is retrieved by joining the mortality_life_expectancy and country_names_area tables for countries larger than 25,000 km2. Without the size constraint, Monaco is the top result with an average life expectancy of over 89 years!

standardSQL

SELECT
  age.country_name,
  age.life_expectancy,
  size.country_area
FROM (
  SELECT country_name, life_expectancy
  FROM `bigquery-public-data.census_bureau_international.mortality_life_expectancy`
  WHERE year = 2016) age
INNER JOIN (
  SELECT country_name, country_area
  FROM `bigquery-public-data.census_bureau_international.country_names_area`
  WHERE country_area > 25000) size
ON age.country_name = size.country_name
ORDER BY 2 DESC
/* Limit removed for Data Studio Visualization */
LIMIT 10
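Since the description mentions the BigQuery Python client library, the following sketch shows one possible way to submit Sample Query 1 from Python. The function names build_life_expectancy_query and run_query and their parameters are hypothetical. Running the query additionally assumes that the google-cloud-bigquery package is installed and that credentials are configured, which is why the client is imported only inside run_query.

```python
def build_life_expectancy_query(year: int = 2016,
                                min_area_km2: int = 25000,
                                limit: int = 10) -> str:
    """Build the text of Sample Query 1 (hypothetical helper; parameters are illustrative)."""
    dataset = "bigquery-public-data.census_bureau_international"
    return f"""
SELECT age.country_name, age.life_expectancy, size.country_area
FROM (
  SELECT country_name, life_expectancy
  FROM `{dataset}.mortality_life_expectancy`
  WHERE year = {int(year)}) age
INNER JOIN (
  SELECT country_name, country_area
  FROM `{dataset}.country_names_area`
  WHERE country_area > {int(min_area_km2)}) size
ON age.country_name = size.country_name
ORDER BY 2 DESC
LIMIT {int(limit)}
""".strip()


def run_query(sql: str):
    """Submit the query and return a pandas DataFrame.

    Assumes google-cloud-bigquery is installed and credentials are set up;
    the import is deferred so the query builder works without the package.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    return client.query(sql).to_dataframe()


sql = build_life_expectancy_query()
print(sql)
```

Calling run_query(sql) would then return the ten rows of the query result, provided the environment has access to BigQuery.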

Sample Query 2

Which countries have the largest proportion of their population under 25? Over 40% of the world’s population is under 25 and more than 50% is under 30! This query retrieves the countries with the largest proportion of young people by joining the age-specific population table with the midyear (total) population table.

standardSQL

SELECT
  age.country_name,
  SUM(age.population) AS under_25,
  pop.midyear_population AS total,
  ROUND((SUM(age.population) / pop.midyear_population) * 100, 2) AS pct_under_25
FROM (
  SELECT country_name, population, country_code
  FROM `bigquery-public-data.census_bureau_international.midyear_population_agespecific`
  WHERE year = 2017 AND age < 25) age
INNER JOIN (
  SELECT midyear_population, country_code
  FROM `bigquery-public-data.census_bureau_international.midyear_population`
  WHERE year = 2017) pop
ON age.country_code = pop.country_code
GROUP BY 1, 3
ORDER BY 4 DESC
/* Remove LIMIT for visualization */
LIMIT 10
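The arithmetic behind Sample Query 2 can be sketched in plain Python on toy data. This is only an illustration of the sum, join, and percentage steps, not the dataset's actual values: the function name pct_under_25, the country codes "AA" and "BB", and all population figures are made up.

```python
from collections import defaultdict


def pct_under_25(age_rows, totals):
    """Compute the share of each country's population that is under 25.

    age_rows: iterable of (country_code, age, population) tuples.
    totals:   dict mapping country_code to total midyear population.
    Returns (country_code, percentage) pairs sorted from highest to lowest.
    """
    under_25 = defaultdict(int)
    for code, age, population in age_rows:
        if age < 25:  # mirrors the "age < 25" filter in the query
            under_25[code] += population

    # Keep only countries present on both sides, like the INNER JOIN.
    result = {
        code: round(under_25[code] / total * 100, 2)
        for code, total in totals.items()
        if code in under_25
    }
    return sorted(result.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data, not taken from the dataset.
rows = [("AA", 10, 300), ("AA", 20, 200), ("AA", 40, 500),
        ("BB", 5, 100), ("BB", 30, 300)]
totals = {"AA": 1000, "BB": 400}
ranking = pct_under_25(rows, totals)  # [("AA", 50.0), ("BB", 25.0)]
```

Python's round and BigQuery's ROUND can differ on exact halfway values, so the last digit may not always match the query output.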

Sample Query 3

The International Census dataset contains growth information in the form of birth rates, death rates, and migration rates. Net migration is the net number of migrants per 1,000 population, an important component of total population and one that often drives the work of the United Nations Refugee Agency. This query joins the growth rate table with the area table to retrieve 2017 data for countries larger than 500 km².

SELECT
  growth.country_name,
  growth.net_migration,
  CAST(area.country_area AS INT64) AS country_area
FROM (
  SELECT country_name, net_migration, country_code
  FROM `bigquery-public-data.census_bureau_international.birth_death_growth_rates`
  WHERE year = 2017) growth
INNER JOIN (
  SELECT country_area, country_code
  FROM `bigquery-public-data.census_bureau_international.country_names_area`
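Because net migration is expressed per 1,000 population, turning a rate into an approximate head count is a one-line calculation. The sketch below shows that conversion; the function name net_migrants and the example figures are hypothetical, and pairing the rate with a midyear population total is an assumption rather than something the sample query does.

```python
def net_migrants(net_migration_rate: float, population: int) -> float:
    """Convert a net migration rate (per 1,000 population) to a head count.

    A negative result means more people left than arrived.
    """
    return net_migration_rate * population / 1000


# Hypothetical example: a rate of -2.5 per 1,000 in a population of 4,000,000.
example = net_migrants(-2.5, 4_000_000)  # -10000.0
```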

Update frequency

Historic (none)

Dataset source

United States Census Bureau

Terms of use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

See the GCP Marketplace listing for more details and sample queries: https://console.cloud.google.com/marketplace/details/united-states-census-bureau/international-census-data
