69 datasets found
  1. Global Country Information 2023

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 15, 2024
    Cite
    Elgiriyewithana, Nidula (2024). Global Country Information 2023 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8165228
    Dataset authored and provided by
    Elgiriyewithana, Nidula
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This comprehensive dataset provides a wealth of information about all countries worldwide, covering a wide range of indicators and attributes. It encompasses demographic statistics, economic indicators, environmental factors, healthcare metrics, education statistics, and much more. With every country represented, this dataset offers a complete global perspective on various aspects of nations, enabling in-depth analyses and cross-country comparisons.

    Key Features

    Country: Name of the country.

    Density (P/Km2): Population density measured in persons per square kilometer.

    Abbreviation: Abbreviation or code representing the country.

    Agricultural Land (%): Percentage of land area used for agricultural purposes.

    Land Area (Km2): Total land area of the country in square kilometers.

    Armed Forces Size: Size of the armed forces in the country.

    Birth Rate: Number of births per 1,000 population per year.

    Calling Code: International calling code for the country.

    Capital/Major City: Name of the capital or major city.

    CO2 Emissions: Carbon dioxide emissions in tons.

    CPI: Consumer Price Index, a measure of inflation and purchasing power.

    CPI Change (%): Percentage change in the Consumer Price Index compared to the previous year.

    Currency_Code: Currency code used in the country.

    Fertility Rate: Average number of children born to a woman during her lifetime.

    Forested Area (%): Percentage of land area covered by forests.

    Gasoline_Price: Price of gasoline per liter in local currency.

    GDP: Gross Domestic Product, the total value of goods and services produced in the country.

    Gross Primary Education Enrollment (%): Gross enrollment ratio for primary education.

    Gross Tertiary Education Enrollment (%): Gross enrollment ratio for tertiary education.

    Infant Mortality: Number of deaths per 1,000 live births before reaching one year of age.

    Largest City: Name of the country's largest city.

    Life Expectancy: Average number of years a newborn is expected to live.

    Maternal Mortality Ratio: Number of maternal deaths per 100,000 live births.

    Minimum Wage: Minimum wage level in local currency.

    Official Language: Official language(s) spoken in the country.

    Out of Pocket Health Expenditure (%): Percentage of total health expenditure paid out-of-pocket by individuals.

    Physicians per Thousand: Number of physicians per thousand people.

    Population: Total population of the country.

    Population: Labor Force Participation (%): Percentage of the population that is part of the labor force.

    Tax Revenue (%): Tax revenue as a percentage of GDP.

    Total Tax Rate: Overall tax burden as a percentage of commercial profits.

    Unemployment Rate: Percentage of the labor force that is unemployed.

    Urban Population: Percentage of the population living in urban areas.

    Latitude: Latitude coordinate of the country's location.

    Longitude: Longitude coordinate of the country's location.

    Potential Use Cases

    Analyze population density and land area to study spatial distribution patterns.

    Investigate the relationship between agricultural land and food security.

    Examine carbon dioxide emissions and their impact on climate change.

    Explore correlations between economic indicators such as GDP and various socio-economic factors.

    Investigate educational enrollment rates and their implications for human capital development.

    Analyze healthcare metrics such as infant mortality and life expectancy to assess overall well-being.

    Study labor market dynamics through indicators such as labor force participation and unemployment rates.

    Investigate the role of taxation and its impact on economic development.

    Explore urbanization trends and their social and environmental consequences.
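
    As a minimal sketch of the first use case, the snippet below recomputes population density from the Population and Land Area (Km2) columns listed under Key Features. The two rows are invented placeholders rather than values from the dataset, and the assumption that numbers arrive as comma-grouped strings is ours, not something the description states.

```python
import csv
import io

# Hypothetical rows: the column names follow the Key Features list above,
# but the values are made-up placeholders, not real country data.
SAMPLE_CSV = '''Country,Population,Land Area (Km2)
Exampleland,"1,000,000","50,000"
Samplestan,"250,000","1,000"
'''


def to_number(text):
    """Parse a number that may use commas as thousands separators."""
    return float(text.replace(",", ""))


def density_by_country(csv_text):
    """Return {country: persons per square kilometre}."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        population = to_number(row["Population"])
        area = to_number(row["Land Area (Km2)"])
        result[row["Country"]] = population / area
    return result


densities = density_by_country(SAMPLE_CSV)
```

    Comparing the recomputed figure with the published Density (P/Km2) column is one quick consistency check before any cross-country analysis.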

  2. Country metadata

    • kaggle.com
    Updated May 26, 2020
    Cite
    Treich (2020). Country metadata [Dataset]. https://www.kaggle.com/datasets/treich/country-metadata/discussion
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Treich
    Description

    Context

    This dataset simply combines publicly available data to characterise a country based on healthcare factors, economy, government and demographics.

    Content

    Where appropriate, data are given per 100,000 inhabitants; scores, spending, and demographics are given as absolute values. Each row represents one country. The data cover the following topics:

    Healthcare:
    - Staff: nurses and physicians per 100,000 inhabitants
    - Infrastructure: beds, the change in beds between 2018 and 2019, the change in bed numbers since 2013, Intensive Care Unit (ICU) beds, ventilators, and Extracorporeal Membrane Oxygenation (ECMO) machines per 100,000 inhabitants
    - Total spending on healthcare in US dollars per capita

    Demographics:
    - The median age for the entire population and for each gender
    - The percentage of the population within age brackets
    - Total population
    - Population per km2
    - Population change between 2018 and 2019

    Government: The scores come from the Economist Intelligence Unit and describe how democratic a country is and how its government functions. They can be used to compare countries by government type.
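
    The per-100,000 normalisation mentioned under Content can be sketched as follows. The function name and the example figures are hypothetical and only illustrate the arithmetic; they are not taken from the dataset.

```python
def per_100k(count, population):
    """Convert an absolute count to a rate per 100,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep the result exact for round inputs.
    return count * 100_000 / population


# Hypothetical country: 12,000 physicians in a population of 4,000,000.
physicians_rate = per_100k(12_000, 4_000_000)
```

    Rates on this common scale are what make staff and infrastructure figures comparable between countries of very different size.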

    Acknowledgements

    All data are publicly available and have simply been brought together in one place. The sources are:

    Inspiration

    These data are meant as metadata for deciding which countries are comparable. I am working on healthcare data, so the inspiration is to compare health statistics between countries and make an informed decision about how comparable they are. The data could also be used for tasks unrelated to healthcare.

  3. census-bureau-international

    • kaggle.com
    zip
    Updated May 6, 2020
    Cite
    Google BigQuery (2020). census-bureau-international [Dataset]. https://www.kaggle.com/bigquery/census-bureau-international
    Available download formats: zip (0 bytes)
    Dataset provided by
    BigQuery (https://cloud.google.com/bigquery)
    Authors
    Google BigQuery
    Description

    Context

    The United States Census Bureau’s international dataset provides estimates of country populations since 1950 and projections through 2050. Specifically, the dataset includes midyear population figures broken down by age and gender assignment at birth. Additionally, time-series data is provided for attributes including fertility rates, birth rates, death rates, and migration rates.

    Querying BigQuery tables

    You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.census_bureau_international.
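
    A minimal sketch of querying these tables from Python is shown below. It assumes the google-cloud-bigquery package (plus pandas for `to_dataframe`) is installed and that credentials are already configured, as they are in a Kaggle Kernel; the helper names `table_ref`, `build_life_expectancy_sql`, and `run_query` are our own, not part of any library.

```python
DATASET = "bigquery-public-data.census_bureau_international"


def table_ref(table_name):
    """Return a backtick-quoted, fully qualified table reference."""
    return f"`{DATASET}.{table_name}`"


def build_life_expectancy_sql(year, limit=10):
    """Build a query for the highest life expectancies in a given year."""
    return (
        "SELECT country_name, life_expectancy "
        f"FROM {table_ref('mortality_life_expectancy')} "
        f"WHERE year = {int(year)} "
        "ORDER BY life_expectancy DESC "
        f"LIMIT {int(limit)}"
    )


def run_query(sql):
    """Run the query; needs google-cloud-bigquery and valid credentials."""
    # Imported lazily so the SQL builder above also works offline.
    from google.cloud import bigquery

    client = bigquery.Client()
    return client.query(sql).to_dataframe()


sql = build_life_expectancy_sql(2016)
```

    The backticks matter: the project name contains hyphens, so standard SQL needs the table reference quoted. Calling `run_query(sql)` is left to the reader because it contacts the BigQuery service.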

    Sample Query 1

    What countries have the longest life expectancy? In this query, 2016 census information is retrieved by joining the mortality_life_expectancy and country_names_area tables for countries larger than 25,000 km2. Without the size constraint, Monaco is the top result with an average life expectancy of over 89 years!

    #standardSQL
    SELECT
      age.country_name,
      age.life_expectancy,
      size.country_area
    FROM (
      SELECT country_name, life_expectancy
      FROM `bigquery-public-data.census_bureau_international.mortality_life_expectancy`
      WHERE year = 2016) age
    INNER JOIN (
      SELECT country_name, country_area
      FROM `bigquery-public-data.census_bureau_international.country_names_area`
      WHERE country_area > 25000) size
    ON age.country_name = size.country_name
    ORDER BY 2 DESC
    /* Limit removed for Data Studio Visualization */
    LIMIT 10

    Sample Query 2

    Which countries have the largest proportion of their population under 25? Over 40% of the world’s population is under 25 and greater than 50% of the world’s population is under 30! This query retrieves the countries with the largest proportion of young people by joining the age-specific population table with the midyear (total) population table.

    #standardSQL
    SELECT
      age.country_name,
      SUM(age.population) AS under_25,
      pop.midyear_population AS total,
      ROUND((SUM(age.population) / pop.midyear_population) * 100, 2) AS pct_under_25
    FROM (
      SELECT country_name, population, country_code
      FROM `bigquery-public-data.census_bureau_international.midyear_population_agespecific`
      WHERE year = 2017 AND age < 25) age
    INNER JOIN (
      SELECT midyear_population, country_code
      FROM `bigquery-public-data.census_bureau_international.midyear_population`
      WHERE year = 2017) pop
    ON age.country_code = pop.country_code
    GROUP BY 1, 3
    ORDER BY 4 DESC
    /* Remove limit for visualization */
    LIMIT 10
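
    The aggregation in Sample Query 2 can be traced on a few rows in plain Python. This is only an illustrative sketch: the country codes and population figures are invented placeholders, and the function `pct_under` is our own name for the sum, divide, round, and sort steps the query performs.

```python
from collections import defaultdict

# Hypothetical rows shaped like midyear_population_agespecific:
# (country_code, age, population). Values are placeholders.
age_rows = [
    ("AA", 10, 300),
    ("AA", 24, 200),
    ("AA", 40, 500),
    ("BB", 5, 100),
    ("BB", 60, 900),
]
# Hypothetical totals shaped like midyear_population.
totals = {"AA": 1000, "BB": 1000}


def pct_under(age_rows, totals, cutoff=25):
    """Return (country_code, percent under cutoff) pairs, largest first."""
    under = defaultdict(int)
    for code, age, population in age_rows:
        if age < cutoff:
            under[code] += population
    # Only countries present in both inputs are kept, like the INNER JOIN.
    shares = {
        code: round(under[code] / totals[code] * 100, 2)
        for code in under
        if code in totals
    }
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


ranking = pct_under(age_rows, totals)
```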

    Sample Query 3

    The International Census dataset contains growth information in the form of birth rates, death rates, and migration rates. Net migration is the net number of migrants per 1,000 population, an important component of total population and one that often drives the work of the United Nations Refugee Agency. This query joins the growth rate table with the area table to retrieve 2017 data for countries greater than 500 km2.

    SELECT
      growth.country_name,
      growth.net_migration,
      CAST(area.country_area AS INT64) AS country_area
    FROM (
      SELECT country_name, net_migration, country_code
      FROM `bigquery-public-data.census_bureau_international.birth_death_growth_rates`
      WHERE year = 2017) growth
    INNER JOIN (
      SELECT country_area, country_code
      FROM `bigquery-public-data.census_bureau_international.country_names_area`

    Update frequency

    Historic (none)

    Dataset source

    United States Census Bureau

    Terms of use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    See the GCP Marketplace listing for more details and sample queries: https://console.cloud.google.com/marketplace/details/united-states-census-bureau/international-census-data

  4. Geonames - All Cities with a population > 1000

    • public.opendatasoft.com
    • data.smartidf.services
    • +2more
    csv, excel, geojson +1
    Updated Mar 10, 2024
    + more versions
    Cite
    (2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/
    Available download formats: csv, json, geojson, excel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All cities with a population > 1000 or seats of adm div (ca 80.000)

    Sources and Contributions

    Sources: GeoNames aggregates over a hundred different data sources.
    Ambassadors: GeoNames Ambassadors help in many countries.
    Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
    Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.

    Enrichment: add country name

  5. Large Scale International Boundaries

    • catalog.data.gov
    • geodata.state.gov
    • +1more
    Updated Jul 22, 2025
    Cite
    U.S. Department of State (Point of Contact) (2025). Large Scale International Boundaries [Dataset]. https://catalog.data.gov/dataset/large-scale-international-boundaries
    Dataset provided by
    United States Department of State (http://state.gov/)
    Description

    Overview

    The Office of the Geographer and Global Issues at the U.S. Department of State produces the Large Scale International Boundaries (LSIB) dataset. The current edition is version 11.4 (published 24 February 2025). The 11.4 release contains updated boundary lines and data refinements designed to extend the functionality of the dataset. These data and generalized derivatives are the only international boundary lines approved for U.S. Government use. The contents of this dataset reflect U.S. Government policy on international boundary alignment, political recognition, and dispute status. They do not necessarily reflect de facto limits of control.

    National Geospatial Data Asset

    This dataset is a National Geospatial Data Asset (NGDAID 194) managed by the Department of State. It is a part of the International Boundaries Theme created by the Federal Geographic Data Committee.

    Dataset Source Details

    Sources for these data include treaties, relevant maps, and data from boundary commissions, as well as national mapping agencies. Where available and applicable, the dataset incorporates information from courts, tribunals, and international arbitrations. The research and recovery process includes analysis of satellite imagery and elevation data. Due to the limitations of source materials and processing techniques, most lines are within 100 meters of their true position on the ground.

    Cartographic Visualization

    The LSIB is a geospatial dataset that, when used for cartographic purposes, requires additional styling. The LSIB download package contains example style files for commonly used software applications. The attribute table also contains embedded information to guide the cartographic representation. Additional discussion of these considerations can be found in the Use of Core Attributes in Cartographic Visualization section below.
    Additional cartographic information pertaining to the depiction and description of international boundaries or areas of special sovereignty can be found in Guidance Bulletins published by the Office of the Geographer and Global Issues: https://data.geodata.state.gov/guidance/index.html

    Contact

    Direct inquiries to internationalboundaries@state.gov. Direct download: https://data.geodata.state.gov/LSIB.zip

    Attribute Structure

    The dataset uses the following attributes divided into two categories:

    ATTRIBUTE NAME | ATTRIBUTE STATUS
    CC1 | Core
    CC1_GENC3 | Extension
    CC1_WPID | Extension
    COUNTRY1 | Core
    CC2 | Core
    CC2_GENC3 | Extension
    CC2_WPID | Extension
    COUNTRY2 | Core
    RANK | Core
    LABEL | Core
    STATUS | Core
    NOTES | Core
    LSIB_ID | Extension
    ANTECIDS | Extension
    PREVIDS | Extension
    PARENTID | Extension
    PARENTSEG | Extension

    These attributes have external data sources that update separately from the LSIB:

    ATTRIBUTE NAME | ATTRIBUTE STATUS
    CC1 | GENC
    CC1_GENC3 | GENC
    CC1_WPID | World Polygons
    COUNTRY1 | DoS Lists
    CC2 | GENC
    CC2_GENC3 | GENC
    CC2_WPID | World Polygons
    COUNTRY2 | DoS Lists
    LSIB_ID | BASE
    ANTECIDS | BASE
    PREVIDS | BASE
    PARENTID | BASE
    PARENTSEG | BASE

    The core attributes listed above describe the boundary lines contained within the LSIB dataset. Removal of core attributes from the dataset will change the meaning of the lines. An attribute status of “Extension” represents a field containing data interoperability information. Other attributes not listed above include “FID”, “Shape_length” and “Shape.” These are components of the shapefile format and do not form an intrinsic part of the LSIB.

    Core Attributes

    The eight core attributes listed above contain unique information which, when combined with the line geometry, comprise the LSIB dataset. These Core Attributes are further divided into Country Code and Name Fields and Descriptive Fields.

    Country Code and Country Name Fields

    “CC1” and “CC2” fields are machine readable fields that contain political entity codes.
    These are two-character codes derived from the Geopolitical Entities, Names, and Codes Standard (GENC), Edition 3 Update 18. “CC1_GENC3” and “CC2_GENC3” fields contain the corresponding three-character GENC codes and are extension attributes discussed below. The codes “Q2” or “QX2” denote a line in the LSIB representing a boundary associated with areas not contained within the GENC standard.

    The “COUNTRY1” and “COUNTRY2” fields contain the names of corresponding political entities. These fields contain names approved by the U.S. Board on Geographic Names (BGN) as incorporated in the "Independent States in the World" and "Dependencies and Areas of Special Sovereignty" lists maintained by the Department of State. To ensure maximum compatibility, names are presented without diacritics and certain names are rendered using common cartographic abbreviations. Names for lines associated with the code "Q2" are descriptive and not necessarily BGN-approved. Names rendered in all CAPITAL LETTERS denote independent states. Names rendered in normal text represent dependencies, areas of special sovereignty, or are otherwise presented for the convenience of the user.

    Descriptive Fields

    The following text fields are a part of the core attributes of the LSIB dataset and do not update from external sources. They provide additional information about each of the lines and are as follows:

    ATTRIBUTE NAME | CONTAINS NULLS
    RANK | No
    STATUS | No
    LABEL | Yes
    NOTES | Yes

    Neither the "RANK" nor "STATUS" fields contain null values; the "LABEL" and "NOTES" fields do. The "RANK" field is a numeric expression of the "STATUS" field. Combined with the line geometry, these fields encode the views of the United States Government on the political status of the boundary line.

    ATTRIBUTE NAME | VALUE
    RANK | 1 | 2 | 3
    STATUS | International Boundary | Other Line of International Separation | Special Line

    A value of “1” in the “RANK” field corresponds to an "International Boundary" value in the “STATUS” field. Values of “2” and “3” correspond to “Other Line of International Separation” and “Special Line,” respectively. The “LABEL” field contains required text to describe the line segment on all finished cartographic products, including but not limited to print and interactive maps. The “NOTES” field contains an explanation of special circumstances modifying the lines. This information can pertain to the origins of the boundary lines, limitations regarding the purpose of the lines, or the original source of the line.

    Use of Core Attributes in Cartographic Visualization

    Several of the Core Attributes provide information required for the proper cartographic representation of the LSIB dataset. The cartographic usage of the LSIB requires a visual differentiation between the three categories of boundary lines. Specifically, this differentiation must be between: International Boundaries (Rank 1); Other Lines of International Separation (Rank 2); and Special Lines (Rank 3). Rank 1 lines must be the most visually prominent. Rank 2 lines must be less visually prominent than Rank 1 lines. Rank 3 lines must be shown in a manner visually subordinate to Ranks 1 and 2. Where scale permits, Rank 2 and 3 lines must be labeled in accordance with the “Label” field. Data marked with a Rank 2 or 3 designation does not necessarily correspond to a disputed boundary. Please consult the style files in the download package for examples of this depiction. The requirement to incorporate the contents of the "LABEL" field on cartographic products is scale dependent. If a label is legible at the scale of a given static product, a proper use of this dataset would encourage the application of that label.
    Using the contents of the "COUNTRY1" and "COUNTRY2" fields in the generation of a line segment label is not required. The "STATUS" field contains the preferred description for the three LSIB line types when they are incorporated into a map legend but is otherwise not to be used for labeling. Use of the “CC1,” “CC1_GENC3,” “CC2,” “CC2_GENC3,” “RANK,” or “NOTES” fields for cartographic labeling purposes is prohibited.

    Extension Attributes

    Certain elements of the attributes within the LSIB dataset extend data functionality to make the data more interoperable or to provide clearer linkages to other datasets. The fields “CC1_GENC3” and “CC2_GENC3” contain the corresponding three-character GENC code to the “CC1” and “CC2” attributes. The code “QX2” is the three-character counterpart of the code “Q2,” which denotes a line in the LSIB representing a boundary associated with a geographic area not contained within the GENC standard.

    To allow for linkage between individual lines in the LSIB and World Polygons dataset, the “CC1_WPID” and “CC2_WPID” fields contain a Universally Unique Identifier (UUID), version 4, which provides a stable description of each geographic entity in a boundary pair relationship. Each UUID corresponds to a geographic entity listed in the World Polygons dataset.

    Five additional fields in the LSIB expand on the UUID concept and either describe features that have changed across space and time or indicate relationships between previous versions of the feature. The “LSIB_ID” attribute is a UUID value that defines a specific instance of a feature. Any change to the feature in a lineset requires a new “LSIB_ID.” The “ANTECIDS,” or antecedent ID, is a UUID that references line geometries from which a given line is descended in time. It is used when there is a feature that is entirely new, not when there is a new version of a previous feature.
    This is generally used to reference countries that have dissolved. The “PREVIDS,” or Previous ID, is a UUID field that contains old versions of a line. This is an additive field that houses all Previous IDs. A new version of a feature is defined by any change to the
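
    The RANK and STATUS correspondence and the prominence rule described in the LSIB entry above can be sketched as a small lookup. The line widths are hypothetical values chosen only so that prominence decreases from Rank 1 to Rank 3; the actual styling ships in the style files of the LSIB download package, which this sketch does not reproduce.

```python
# RANK -> STATUS values as listed in the LSIB description.
RANK_TO_STATUS = {
    1: "International Boundary",
    2: "Other Line of International Separation",
    3: "Special Line",
}

# Hypothetical line widths in points; the only requirement taken from the
# description is that prominence decreases from Rank 1 to Rank 3.
RANK_TO_WIDTH = {1: 2.0, 2: 1.2, 3: 0.6}


def style_for(rank, label=None):
    """Return a styling dict for one boundary line."""
    if rank not in RANK_TO_STATUS:
        raise ValueError(f"unknown RANK value: {rank!r}")
    # STATUS is the preferred legend text; it is not used as a line label.
    style = {"legend": RANK_TO_STATUS[rank], "width": RANK_TO_WIDTH[rank]}
    # Rank 2 and 3 lines are labeled from the LABEL field where scale permits.
    if rank in (2, 3) and label:
        style["label"] = label
    return style


example = style_for(2, label="hypothetical label text")
```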

  6. European Business Performance Database

    • openicpsr.org
    Updated Sep 15, 2018
    Cite
    Youssef Cassis; Harm Schroeter; Andrea Colli (2018). European Business Performance Database [Dataset]. http://doi.org/10.3886/E106060V2
    Dataset provided by
    Bocconi University
    EUI, Florence
    Bergen University
    Authors
    Youssef Cassis; Harm Schroeter; Andrea Colli
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The European Business Performance database describes the performance of the largest enterprises in the twentieth century. It covers eight countries that together consistently account for above 80 per cent of western European GDP: Great Britain, Germany, France, Belgium, Italy, Spain, Sweden, and Finland. Data have been collected for five benchmark years, namely on the eve of WWI (1913), before the Great Depression (1927), at the extremes of the golden age (1954 and 1972), and in 2000.

    The database is comprised of two distinct datasets. The Small Sample (625 firms) includes the largest enterprises in each country across all industries (economy-wide). To avoid over-representation of certain countries and sectors, countries contribute a number of firms that is roughly proportionate to the size of the economy: 30 firms from Great Britain, 25 from Germany, 20 from France, 15 from Italy, 10 from Belgium, Spain, and Sweden, and 5 from Finland. By the same token, a cap has been set on the number of financial firms entering the sample, so that they range between up to 6 for Britain and 1 for Finland.

    The second dataset, or Large Sample (1,167 firms), is made up of the largest firms per industry. Here industries are so selected as to take into account long-term technological developments and the rise of entirely new products and services. Firms have been individually classified using the two-digit ISIC Rev. 3.1 codes, then grouped under a manageable number of industries. To some extent and broadly speaking, the two samples have a rather distinct focus: the Small Sample is biased in favour of sheer bigness, whereas the Large Sample emphasizes industries.

    As far as size and performance indicators are concerned, total assets has been picked as the main size measure in the first three benchmarks, turnover in 1972 and 2000 (financial intermediaries, though, are ranked by total assets throughout the database). Performance is gauged by means of two financial ratios, namely return on equity and shareholders’ return, i.e. the percentage year-on-year change in share price based on year-end values. In order to smooth out volatility, at each benchmark performance figures have been averaged over three consecutive years (for instance, performance in 1913 reflects average performance in 1911, 1912, and 1913).

    All figures were collected in national currency and converted to US dollars at current year-average exchange rates.
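
    The shareholders' return and the three-year smoothing described above can be sketched as follows. The share prices are invented placeholders, and the function names are ours; the sketch only follows the definitions given in the description (year-on-year percentage change in year-end share price, averaged over three consecutive years).

```python
def shareholders_return(previous_price, current_price):
    """Percentage year-on-year change in year-end share price."""
    return (current_price - previous_price) / previous_price * 100


def benchmark_average(yearly_values):
    """Average a performance figure over three consecutive years."""
    if len(yearly_values) != 3:
        raise ValueError("expected exactly three yearly values")
    return sum(yearly_values) / 3


# Hypothetical year-end share prices for 1910 to 1913 (placeholders).
prices = [100.0, 110.0, 99.0, 108.9]

# Returns for 1911, 1912, and 1913, each relative to the previous year end.
returns = [shareholders_return(a, b) for a, b in zip(prices, prices[1:])]

# Benchmark performance for 1913 is the mean of the three yearly returns.
performance_1913 = benchmark_average(returns)
```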

  7. GDP by Country Dataset

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jun 29, 2011
    + more versions
    Cite
    TRADING ECONOMICS (2011). GDP by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/gdp
    Available download formats: csv, json, xml, excel
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    World
    Description

    This dataset provides values for GDP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  8. CORONAVIRUS DEATHS by Country Dataset

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Mar 4, 2020
    + more versions
    Cite
    TRADING ECONOMICS (2020). CORONAVIRUS DEATHS by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/coronavirus-deaths
    Available download formats: csv, excel, xml, json
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    World
    Description

    This dataset provides values for CORONAVIRUS DEATHS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  9. All Postal Code - All countries (Geonames)

    • public.opendatasoft.com
    • data.smartidf.services
    • +1more
    csv, excel, geojson +1
    Updated Jul 9, 2025
    + more versions
    Cite
    (2025). All Postal Code - All countries (Geonames) [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-postal-code/
    Available download formats: excel, json, geojson, csv
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    For many countries lat/lng are determined with an algorithm that searches the place names in the main geonames database using administrative divisions and numerical vicinity of the postal codes as factors in the disambiguation of place names. For postal codes and place name for which no corresponding toponym in the main geonames database could be found an average lat/lng of 'neighbouring' postal codes is calculated. Please let us know if you find any errors in the data set. Thanks

    - For Canada we have only the first letters of the full postal codes (for copyright reasons)
    - For Ireland we have only the first letters of the full postal codes (for copyright reasons)
    - For Malta we have only the first letters of the full postal codes (for copyright reasons)
    - The Argentina data file contains 4-digit postal codes which were replaced with a new system in 1999.
    - For Brazil only major postal codes are available (only the codes ending with -000 and the major code per municipality).
    - For India the lat/lng accuracy is not yet comparable to other countries.

    Update frequency: 1 month
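
    The fallback described above, averaging the lat/lng of 'neighbouring' postal codes, might look roughly like the sketch below. This is not the GeoNames algorithm: treating neighbours as the k numerically closest codes, the choice k=2, and the sample coordinates are all assumptions made for illustration.

```python
def estimate_lat_lng(target_code, known, k=2):
    """Average the coordinates of the k numerically closest postal codes.

    known maps numeric postal codes to (lat, lng) tuples.
    """
    if not known:
        raise ValueError("no reference postal codes available")
    # Rank known codes by numeric distance to the target and keep the closest k.
    nearest = sorted(known, key=lambda code: abs(code - target_code))[:k]
    lat = sum(known[code][0] for code in nearest) / len(nearest)
    lng = sum(known[code][1] for code in nearest) / len(nearest)
    return lat, lng


# Hypothetical postal codes and coordinates (placeholders, not GeoNames data).
known_codes = {1000: (10.0, 20.0), 1002: (12.0, 22.0), 9000: (50.0, 60.0)}
estimate = estimate_lat_lng(1001, known_codes)
```

    A plain arithmetic mean of coordinates is only reasonable for nearby points; it would misbehave across the 180° meridian, which this sketch ignores.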

  10. Dataset of development of business during the COVID-19 crisis

    • data.mendeley.com
    • narcis.nl
    Updated Nov 9, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Tatiana N. Litvinova (2020). Dataset of development of business during the COVID-19 crisis [Dataset]. http://doi.org/10.17632/9vvrd34f8t.1
    Authors
    Tatiana N. Litvinova
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    To create the dataset, the top 10 countries leading in the incidence of COVID-19 in the world were selected as of October 22, 2020 (on the eve of the second wave of the pandemic), which are presented in the Global 500 ranking for 2020: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 rating for 2020 and 2019 were selected separately. The arithmetic averages and the change (increase) were calculated for indicators such as profitability of enterprises, their ranking position (competitiveness), asset value and number of employees. The arithmetic mean values of these indicators for all countries of the sample were found, characterizing the situation in international entrepreneurship as a whole in the context of the COVID-19 crisis in 2020 on the eve of the second wave of the pandemic. The data is collected in a general Microsoft Excel table. The dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. It is flexible and can be supplemented with data from other countries and newer statistics on the COVID-19 pandemic. Because the data in the dataset are formulas rather than ready-made numbers, adding or changing values in the original table at the beginning of the dataset automatically recalculates most of the subsequent tables and updates the graphs. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating scientific research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data, but also charts that provide data visualization. The dataset contains not only actual, but also forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020.
The forecasts are presented in the form of a normal distribution of predicted values and the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship, substituting various predicted morbidity and mortality rates in risk assessment tables and obtaining automatically calculated consequences (changes) on the characteristics of international entrepreneurship. It is also possible to substitute the actual values identified in the process and following the results of the second wave of the pandemic to check the reliability of pre-made forecasts and conduct a plan-fact analysis. The dataset contains not only the numerical values of the initial and predicted values of the set of studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks of a pandemic and COVID-19 crisis for international entrepreneurship.

  11. Afrobarometer Survey 1 1999-2000, Merged 7 Country - Botswana, Lesotho,...

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +1more
    Updated Apr 27, 2021
    + more versions
    Cite
    Institute for Democracy in South Africa (IDASA) (2021). Afrobarometer Survey 1 1999-2000, Merged 7 Country - Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia, Zimbabwe [Dataset]. https://microdata.worldbank.org/index.php/catalog/889
    Explore at:
    Dataset updated
    Apr 27, 2021
    Dataset provided by
    Ghana Centre for Democratic Development (CDD-Ghana)
    Institute for Democracy in South Africa (IDASA)
    Michigan State University (MSU)
    Time period covered
    1999 - 2000
    Area covered
    Africa, Namibia, South Africa, Botswana, Zambia, Lesotho, Malawi, Zimbabwe
    Description

    Abstract

    Round 1 of the Afrobarometer survey was conducted from July 1999 through June 2001 in 12 African countries, to solicit public opinion on democracy, governance, markets, and national identity. The full 12-country dataset was pieced together from different projects: Round 1 of the Afrobarometer survey, the old Southern African Democracy Barometer, and similar surveys done in West and East Africa.

    The 7-country dataset is a subset of the Round 1 survey dataset and consists of a combined dataset for the 7 Southern African countries surveyed with other African countries in Round 1, 1999-2000 (Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia and Zimbabwe). It is a useful dataset because, in contrast to the full 12-country Round 1 dataset, all countries in this dataset were surveyed with the identical questionnaire.

    Geographic coverage

    Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia, Zimbabwe

    Analysis unit

    Basic units of analysis that the study investigates include: individuals and groups

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A new sample has to be drawn for each round of Afrobarometer surveys. Whereas the standard sample size for Round 3 surveys will be 1200 cases, a larger sample size will be required in societies that are extremely heterogeneous (such as South Africa and Nigeria), where the sample size will be increased to 2400. Other adaptations may be necessary within some countries to account for the varying quality of the census data or the availability of census maps.

    The sample is designed as a representative cross-section of all citizens of voting age in a given country. The goal is to give every adult citizen an equal and known chance of selection for interview. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible. A randomly selected sample of 1200 cases allows inferences to national adult populations with a margin of sampling error of no more than plus or minus 2.5 percent with a confidence level of 95 percent. If the sample size is increased to 2400, the confidence interval shrinks to plus or minus 2 percent.
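    As a rough check on the figures above, the textbook margin of error for a proportion under simple random sampling can be computed directly. This is only a sketch: the function name and the worst-case assumption p = 0.5 are ours, and the formula ignores the clustering and stratification of the actual design. Under these assumptions the result is about 2.8 percent for 1200 cases and 2.0 percent for 2400, so the 2.5 percent quoted above presumably rests on rounding or on assumptions not stated here.

```python
import math

def srs_margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion.

    Assumes simple random sampling; p=0.5 is the worst case (largest variance),
    and z=1.96 is the normal quantile for 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (1200, 2400):
    # Print the margin of error in percentage points, rounded to one decimal.
    print(n, round(100 * srs_margin_of_error(n), 1))
```

    Doubling the sample size shrinks the margin only by a factor of about 1.41 (the square root of 2), which is why the larger sample is reserved for very heterogeneous societies.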

    Sample Universe

    The sample universe for Afrobarometer surveys includes all citizens of voting age within the country. In other words, we exclude anyone who is not a citizen and anyone who has not attained this age (usually 18 years) on the day of the survey. Also excluded are areas determined to be either inaccessible or not relevant to the study, such as those experiencing armed conflict or natural disasters, as well as national parks and game reserves. As a matter of practice, we have also excluded people living in institutionalized settings, such as students in dormitories and persons in prisons or nursing homes.

    What to do about areas experiencing political unrest? On the one hand we want to include them because they are politically important. On the other hand, we want to avoid stretching out the fieldwork over many months while we wait for the situation to settle down. It was agreed at the 2002 Cape Town Planning Workshop that it is difficult to come up with a general rule that will fit all imaginable circumstances. We will therefore make judgments on a case-by-case basis on whether or not to proceed with fieldwork or to exclude or substitute areas of conflict. National Partners are requested to consult Core Partners on any major delays, exclusions or substitutions of this sort.

    Sample Design

    The sample design is a clustered, stratified, multi-stage, area probability sample.

    To repeat the main sampling principle, the objective of the design is to give every sample element (i.e. adult citizen) an equal and known chance of being chosen for inclusion in the sample. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible.

    In a series of stages, geographically defined sampling units of decreasing size are selected. To ensure that the sample is representative, the probability of selection at various stages is adjusted as follows:

    The sample is stratified by key social characteristics in the population such as sub-national area (e.g. region/province) and residential locality (urban or rural). The area stratification reduces the likelihood that distinctive ethnic or language groups are left out of the sample. And the urban/rural stratification is a means to make sure that these localities are represented in their correct proportions. Wherever possible, and always in the first stage of sampling, random sampling is conducted with probability proportionate to population size (PPPS). The purpose is to guarantee that larger (i.e., more populated) geographical units have a proportionally greater probability of being chosen into the sample. The sampling design has four stages

    A first-stage to stratify and randomly select primary sampling units;

    A second-stage to randomly select sampling start-points;

    A third stage to randomly choose households;

    A final-stage involving the random selection of individual respondents

    We shall deal with each of these stages in turn.
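    Before turning to the individual stages, the idea of sampling with probability proportionate to population size (PPPS) can be illustrated with a short sketch. It uses systematic selection along the cumulative population totals, which is one common way to implement PPPS; the text above does not say which procedure Afrobarometer teams actually use, so the method, the function name and the example sizes are all assumptions.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def pps_systematic(sizes, n, rng):
    """Select n unit indices with probability proportional to size.

    sizes: population of each unit (e.g. each enumeration area).
    Walks along the cumulative totals with a fixed interval and a random start.
    A unit larger than the interval can be selected more than once.
    """
    cumulative = list(accumulate(sizes))
    interval = cumulative[-1] / n
    start = rng.random() * interval
    points = [start + k * interval for k in range(n)]
    # Each point falls inside the cumulative range of exactly one unit.
    return [min(bisect_right(cumulative, pt), len(sizes) - 1) for pt in points]

rng = random.Random(42)
ea_sizes = [rng.randint(300, 1500) for _ in range(240)]  # hypothetical EA populations
selected = pps_systematic(ea_sizes, 6, rng)
print(selected)
```

    Larger units occupy a longer stretch of the cumulative scale, so they are proportionally more likely to contain one of the selection points.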

    STAGE ONE: Selection of Primary Sampling Units (PSUs)

    The primary sampling units (PSU's) are the smallest, well-defined geographic units for which reliable population data are available. In most countries, these will be Census Enumeration Areas (or EAs). Most national census data and maps are broken down to the EA level. In the text that follows we will use the acronyms PSU and EA interchangeably because, when census data are employed, they refer to the same unit.

    We strongly recommend that NIs use official national census data as the sampling frame for Afrobarometer surveys. Where recent or reliable census data are not available, NIs are asked to inform the relevant Core Partner before they substitute any other demographic data. Where the census is out of date, NIs should consult a demographer to obtain the best possible estimates of population growth rates. These should be applied to the outdated census data in order to make projections of population figures for the year of the survey. It is important to bear in mind that population growth rates vary by area (region) and (especially) between rural and urban localities. Therefore, any projected census data should include adjustments to take such variations into account.

    Indeed, we urge NIs to establish collegial working relationships with professionals in the national census bureau, not only to obtain the most recent census data, projections, and maps, but also to gain access to sampling expertise. NIs may even commission a census statistician to draw the sample to Afrobarometer specifications, provided that provision for this service has been made in the survey budget.

    Regardless of who draws the sample, the NIs should thoroughly acquaint themselves with the strengths and weaknesses of the available census data and the availability and quality of EA maps. The country and methodology reports should cite the exact census data used, its known shortcomings, if any, and any projections made from the data. At minimum, the NI must know the size of the population and the urban/rural population divide in each region in order to specify how to distribute population and PSU's in the first stage of sampling. National investigators should obtain this written data before they attempt to stratify the sample.

    Once this data is obtained, the sample population (either 1200 or 2400) should be stratified, first by area (region/province) and then by residential locality (urban or rural). In each case, the proportion of the sample in each locality in each region should be the same as its proportion in the national population as indicated by the updated census figures.

    Having stratified the sample, it is then possible to determine how many PSU's should be selected for the country as a whole, for each region, and for each urban or rural locality.

    The total number of PSU's to be selected for the whole country is determined by calculating the maximum degree of clustering of interviews one can accept in any PSU. Because PSUs (which are usually geographically small EAs) tend to be socially homogenous we do not want to select too many people in any one place. Thus, the Afrobarometer has established a standard of no more than 8 interviews per PSU. For a sample size of 1200, the sample must therefore contain 150 PSUs/EAs (1200 divided by 8). For a sample size of 2400, there must be 300 PSUs/EAs.

    These PSUs should then be allocated proportionally to the urban and rural localities within each regional stratum of the sample. Let's take a couple of examples from a country with a sample size of 1200. If the urban locality of Region X in this country constitutes 10 percent of the current national population, then the sample for this stratum should be 15 PSUs (calculated as 10 percent of 150 PSUs). If the rural population of Region Y constitutes 4 percent of the current national population, then the sample for this stratum should be 6 PSU's.
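    The arithmetic in the two preceding paragraphs can be reproduced with a few lines of Python. This is a minimal sketch: the function names and stratum labels are illustrative, and the simple rounding used here is an assumption (with many strata, rounded allocations may not add up exactly to the total and would need a manual adjustment).

```python
MAX_INTERVIEWS_PER_PSU = 8  # Afrobarometer standard quoted above

def total_psus(sample_size, per_psu=MAX_INTERVIEWS_PER_PSU):
    """Number of PSUs needed so that no PSU holds more than per_psu interviews."""
    return sample_size // per_psu

def allocate_psus(sample_size, stratum_shares):
    """Allocate PSUs to strata in proportion to their share of the national population."""
    n_psus = total_psus(sample_size)
    return {name: round(share * n_psus) for name, share in stratum_shares.items()}

print(total_psus(1200), total_psus(2400))
print(allocate_psus(1200, {"Region X urban": 0.10, "Region Y rural": 0.04}))
```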

    The next step is to select particular PSUs/EAs using random methods. Using the above example of the rural localities in Region Y, let us say that you need to pick 6 sample EAs out of a census list that contains a total of 240 rural EAs in Region Y. But which 6? If the EAs created by the national census bureau are of equal or roughly equal population size, then selection is relatively straightforward. Just number all EAs consecutively, then make six selections using a table of random numbers. This procedure, known as simple random sampling (SRS), will

  12. Global - Roads Open Access Data Set - Dataset - ENERGYDATA.INFO

    • energydata.info
    Updated Jul 25, 2018
    + more versions
    Cite
    (2018). Global - Roads Open Access Data Set - Dataset - ENERGYDATA.INFO [Dataset]. https://energydata.info/dataset/global-roads-open-access-data-set-2010
    Explore at:
    Dataset updated
    Jul 25, 2018
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The Global Roads Open Access Data Set, Version 1 (gROADSv1) was developed under the auspices of the CODATA Global Roads Data Development Task Group. The data set combines the best available roads data by country into a global roads coverage, using the UN Spatial Data Infrastructure Transport (UNSDI-T) version 2 as a common data model. All country road networks have been joined topologically at the borders, and many countries have been edited for internal topology. Source data for each country are provided in the documentation, and users are encouraged to refer to the readme file for use constraints that apply to a small number of countries. Because the data are compiled from multiple sources, the date range for road network representations ranges from the 1980s to 2010 depending on the country (most countries have no confirmed date), and spatial accuracy varies. The baseline global data set was compiled by the Information Technology Outreach Services (ITOS) of the University of Georgia. Updated data for 27 countries and 6 smaller geographic entities were assembled by Columbia University's Center for International Earth Science Information Network (CIESIN), with a focus largely on developing countries with the poorest data coverage.

  13. Data from: GYPSUM LICHENS: a global data set of lichen species from gypsum...

    • gbif.org
    • demo.gbif.org
    Updated Jan 22, 2024
    Cite
    Sergio Muriel; Gregorio Aragón; Isabel Martínez; María Prieto (2024). GYPSUM LICHENS: a global data set of lichen species from gypsum ecosystems [Dataset]. http://doi.org/10.15470/6yne3u
    Explore at:
    Dataset updated
    Jan 22, 2024
    Dataset provided by
    Global Biodiversity Information Facility (https://www.gbif.org/)
    Universidad Rey Juan Carlos
    Authors
    Sergio Muriel; Gregorio Aragón; Isabel Martínez; María Prieto
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Time period covered
    Jan 1, 1881 - Dec 12, 2018
    Area covered
    Description

    Lichens are significant components of the Biological Soil Crust (BSC) communities in gypsum ecosystems and are involved in several processes related to ecosystem functioning. Although numerous studies centered on lichen taxonomy and ecology have been performed in these habitats, global information about the lichen species from gypsum ecosystems or their distributional ranges at a global scale is missing. Thus, a global data set of lichen species growing on gypsum has been compiled.

    A total of 321 studies were finally retained for the review. This data set is composed of 6114 specimen records, belonging to 336 lichen species from 26 countries throughout the world. Spain and Germany hosted the highest number of species (160 and 114 species, respectively). Outside the European continent, only a few countries had a significant number of species: Morocco (46), United States (42), and Iran (37). Remarkably, countries from the southern hemisphere (i.e. Australia, Chile, Namibia, and South Africa) showed a low number of studies from gypsum lands. The number of records per country showed a similar pattern to the species number, with Spain and Germany having the highest number of records (3863 and 1075, respectively).

    Thirty-one families are present in the data set. Teloschistaceae (56 species), Verrucariaceae (38 species), and Cladoniaceae (37 species) are the families with the highest number of species. Regarding the number of records, Cladoniaceae (1267 records), Teloschistaceae (972 records), and Psoraceae (539 records) are the most abundant families. Psora decipiens (Psoraceae) and Squamarina lentigera (Cladoniaceae) are the most widespread species in terms of the number of countries in which they were present, and the most abundant in terms of the total number of records.

  14. Research Database on Infrastructure Economic Performance 1980-2004 - Aruba,...

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +3more
    Updated Oct 26, 2023
    + more versions
    Cite
    Antonio Estache and Ana Goicoechea (2023). Research Database on Infrastructure Economic Performance 1980-2004 - Aruba, Afghanistan, Angola...and 190 more [Dataset]. https://microdata.worldbank.org/index.php/catalog/1780
    Explore at:
    Dataset updated
    Oct 26, 2023
    Dataset authored and provided by
    Antonio Estache and Ana Goicoechea
    Time period covered
    1980 - 2004
    Area covered
    Angola
    Description

    Abstract

    Estache and Goicoechea present an infrastructure database that was assembled from multiple sources. Its main purposes are: (i) to provide a snapshot of the sector as of the end of 2004; and (ii) to facilitate quantitative analytical research on infrastructure sectors. The related working paper includes definitions, source information and the data available for 37 performance indicators that proxy access, affordability and quality of service (most recent data as of June 2005). Additionally, the database includes a snapshot of 15 reform indicators across infrastructure sectors.

    This is a first attempt, since the effort made in the World Development Report 1994, at generating a database on infrastructure sectors and it needs to be recognized as such. This database is not a state of the art output—this is being worked on by sector experts on a different time table. The effort has however generated a significant amount of new information. The database already provides enough information to launch a much more quantitative debate on the state of infrastructure. But much more is needed and by circulating this information at this stage, we hope to be able to generate feedback and fill the major knowledge gaps and inconsistencies we have identified.

    Geographic coverage

    The database covers the following countries: - Afghanistan - Albania - Algeria - American Samoa - Andorra - Angola - Antigua and Barbuda - Argentina - Armenia - Aruba - Australia - Austria - Azerbaijan - Bahamas, The - Bahrain - Bangladesh - Barbados - Belarus - Belgium - Belize - Benin - Bermuda - Bhutan - Bolivia - Bosnia and Herzegovina - Botswana - Brazil - Brunei - Bulgaria - Burkina Faso - Burundi - Cambodia - Cameroon - Canada - Cape Verde - Cayman Islands - Central African Republic - Chad - Channel Islands - Chile - China - Colombia - Comoros - Congo, Dem. Rep. - Congo, Rep. - Costa Rica - Cote d'Ivoire - Croatia - Cuba - Cyprus - Czech Republic - Denmark - Djibouti - Dominica - Dominican Republic - Ecuador - Egypt, Arab Rep. - El Salvador - Equatorial Guinea - Eritrea - Estonia - Ethiopia - Faeroe Islands - Fiji - Finland - France - French Polynesia - Gabon - Gambia, The - Georgia - Germany - Ghana - Greece - Greenland - Grenada - Guam - Guatemala - Guinea - Guinea-Bissau - Guyana - Haiti - Honduras - Hong Kong, China - Hungary - Iceland - India - Indonesia - Iran, Islamic Rep. - Iraq - Ireland - Isle of Man - Israel - Italy - Jamaica - Japan - Jordan - Kazakhstan - Kenya - Kiribati - Korea, Dem. Rep. - Korea, Rep. - Kuwait - Kyrgyz Republic - Lao PDR - Latvia - Lebanon - Lesotho - Liberia - Libya - Liechtenstein - Lithuania - Luxembourg - Macao, China - Macedonia, FYR - Madagascar - Malawi - Malaysia - Maldives - Mali - Malta - Marshall Islands - Mauritania - Mauritius - Mayotte - Mexico - Micronesia, Fed. Sts. 
- Moldova - Monaco - Mongolia - Morocco - Mozambique - Myanmar - Namibia - Nepal - Netherlands - Netherlands Antilles - New Caledonia - New Zealand - Nicaragua - Niger - Nigeria - Northern Mariana Islands - Norway - Oman - Pakistan - Palau - Panama - Papua New Guinea - Paraguay - Peru - Philippines - Poland - Portugal - Puerto Rico - Qatar - Romania - Russian Federation - Rwanda - Samoa - San Marino - Sao Tome and Principe - Saudi Arabia - Senegal - Seychelles - Sierra Leone - Singapore - Slovak Republic - Slovenia - Solomon Islands - Somalia - South Africa - Spain - Sri Lanka - St. Kitts and Nevis - St. Lucia - St. Vincent and the Grenadines - Sudan - Suriname - Swaziland - Sweden - Switzerland - Syrian Arab Republic - Tajikistan - Tanzania - Thailand - Togo - Tonga - Trinidad and Tobago - Tunisia - Turkey - Turkmenistan - Uganda - Ukraine - United Arab Emirates - United Kingdom - United States - Uruguay - Uzbekistan - Vanuatu - Venezuela, RB - Vietnam - Virgin Islands (U.S.) - West Bank and Gaza - Yemen, Rep. - Yugoslavia, FR (Serbia/Montenegro) - Zambia - Zimbabwe

    Kind of data

    Aggregate data [agg]

    Mode of data collection

    Face-to-face [f2f]

    Response rate

    Sector Performance Indicators

    Energy The energy sector is relatively well covered by the database, at least in terms of providing a relatively recent snapshot for the main policy areas. The best covered area is access where data are available for 2000 for about 61% of the 207 countries included in the database. The technical quality indicator is available for 60% of the countries, and at least one of the perceived quality indicators is available for 40% of the countries. Price information is available for about 41% of the countries, distinguishing between residential and non residential.

    Water & Sanitation Because the sector is part of the Millennium Development Goals (MDGs), it enjoys a lot of effort on data generation in terms of the access rates. The WHO is the main engine behind this effort in collaboration with the multilateral and bilateral aid agencies. The coverage is actually quite high (some national, urban and rural information is available for 75 to 85% of the countries), but there are significant concerns among the research community about the fact that access rates have been measured without much consideration of the quality of access. The data on technical quality are only available for 27% of the countries. There are data on perceived quality for roughly 39% of the countries, but they cannot be used to qualify the information provided by the raw access rates (i.e. access 3 hours a day is not equivalent to access 24 hours a day).

    Information and Communication Technology The ICT sector is probably the best covered among the infrastructure sub-sectors to a large extent thanks to the fact that the International Telecommunications Union (ITU) has taken on the responsibility to collect the data. ITU covers a wide spectrum of activity under the communications heading and its coverage ranges from 85 to 99% for all national access indicators. The information on prices needed to make assessments of affordability is also quite extensive since it covers roughly 85 to 95% of the 207 countries. With respect to quality, the coverage of technical indicators is over 88% while the information on perceived quality is only available for roughly 40% of the countries.

    Transport The transport sector is possibly the least well covered in terms of the service orientation of infrastructure indicators. Regarding access, network density is the closest approximation to access to the service and is covered at a rate close to 90% for roads but only at a rate of 50% for rail. The relevant data on prices only cover about 30% of the sample for railways. Some type of technical quality information is available for 86% of the countries. Quality perception is only available for about 40% of the countries.

    Institutional Reform Indicators

    Electricity The data on electricity policy reform were collected from the following sources: ABS Electricity Deregulation Report (2004), AEI-Brookings telecommunications and electricity regulation database (2003), Bacon (1999), Estache and Gassner (2004), Estache, Trujillo, and Tovar de la Fe (2004), Global Regulatory Network Program (2004), Henisz et al. (2003), International Power Finance Review (2003-04), International Power and Utilities Finance Review (2004-05), Kikukawa (2004), Wallsten et al. (2004), World Bank Caribbean Infrastructure Assessment (2004), World Bank Global Energy Sector Reform in Developing Countries (1999), World Bank staff, and country regulators. The coverage for the three types of institutional indicators is quite good for the electricity sector. For regulatory institutions and private participation in generation and distribution, the coverage is about 80% of the 207 countries. It is somewhat lower for market structure, at only 58%.

    Water & Sanitation The data on water policy reform were collected from the following sources: ABS Water and Waste Utilities of the World (2004), Asian Development Bank (2000), Bayliss (2002), Benoit (2004), Budds and McGranahan (2003), Hall, Bayliss, and Lobina (2002), Hall and Lobina (2002), Hall, Lobina, and De La Mote (2002), Halpern (2002), Lobina (2001), World Bank Caribbean Infrastructure Assessment (2004), World Bank Sector Note on Water Supply and Sanitation for Infrastructure in EAP (2004), and World Bank staff. The coverage for institutional reforms in W&S is not as exhaustive as for the other utilities. Information on the regulatory institutions responsible for large utilities is available for about 67% of the countries. Ownership data are available for about 70% of the countries. There is no information on the market structure good enough to be reported here at this stage. In most countries small-scale operators are important private actors, but there is no systematic record of their existence. Most of the information available on their role and importance is only anecdotal.

    Information and Communication Technology The report Trends in Telecommunications Reform from ITU (revised by World Bank staff) is the main source of information for this sector. The information on institutional reforms in the sector is, however, not as exhaustive as it is for its sector performance indicators. While the coverage of the regulatory institutions is 100%, it varies between 76 and 90% of the countries for most of the other indicators. Quite surprisingly, in contrast to what is available for other sectors, it also proved difficult to obtain data on the timing of reforms and of the creation of the regulatory agencies.

    Transport Information on transport institutions and reforms is not systematically generated by any agency. Even though more data are needed to have a more comprehensive picture of the transport sector, it was possible to collect data on railways policy reform from Jane's World Railways (2003-04) and complement it with

  15. Data from: A large synthetic dataset for machine learning applications in...

    • zenodo.org
    csv, json, png, zip
    Updated Mar 25, 2025
    Cite
    Marc Gillioz; Guillaume Dubuis; Philippe Jacquod (2025). A large synthetic dataset for machine learning applications in power transmission grids [Dataset]. http://doi.org/10.5281/zenodo.13378476
    Explore at:
    Available download formats: zip, png, csv, json
    Dataset updated
    Mar 25, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Marc Gillioz; Guillaume Dubuis; Philippe Jacquod
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    With the ongoing energy transition, power grids are evolving fast. They operate more and more often close to their technical limit, under more and more volatile conditions. Fast, essentially real-time computational approaches to evaluate their operational safety, stability and reliability are therefore highly desirable. Machine Learning methods have been advocated to solve this challenge, however they are heavy consumers of training and testing data, while historical operational data for real-world power grids are hard if not impossible to access.

    This dataset contains long time series for production, consumption, and line flows, amounting to 20 years of data with a time resolution of one hour, for several thousand loads and several hundred generators of various types representing the ultra-high-voltage transmission grid of continental Europe. The synthetic time series have been statistically validated against real-world data.

    Data generation algorithm

    The algorithm is described in a Nature Scientific Data paper. It relies on the PanTaGruEl model of the European transmission network -- the admittance of its lines as well as the location, type and capacity of its power generators -- and aggregated data gathered from the ENTSO-E transparency platform, such as power consumption aggregated at the national level.

    Network

    The network information is encoded in the file europe_network.json. It is given in PowerModels format, which is itself derived from MatPower and compatible with PandaPower. The network features 7822 power lines and 553 transformers connecting 4097 buses, to which are attached 815 generators of various types.
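    The component counts quoted above could be checked with a few lines of Python. The sketch below runs on a tiny in-memory stand-in rather than on europe_network.json itself, and the key names ("bus", "gen", "branch", and the boolean "transformer" flag on branches) are assumptions based on the usual PowerModels conventions; they have not been verified against the actual file.

```python
import json

# Tiny hypothetical network in a PowerModels-like layout (not the real file).
network = json.loads("""
{
  "bus": {"1": {}, "2": {}, "3": {}},
  "gen": {"1": {"gen_bus": 1}},
  "branch": {
    "1": {"f_bus": 1, "t_bus": 2, "transformer": false},
    "2": {"f_bus": 2, "t_bus": 3, "transformer": true}
  }
}
""")

def count_components(net):
    """Count buses, generators, lines and transformers in a network dictionary."""
    n_transformers = sum(
        1 for branch in net["branch"].values() if branch.get("transformer", False)
    )
    return {
        "buses": len(net["bus"]),
        "generators": len(net["gen"]),
        "lines": len(net["branch"]) - n_transformers,
        "transformers": n_transformers,
    }

print(count_components(network))
```

    For the real file, one would replace the inline string with json.load on an opened europe_network.json and compare the result with the figures given above.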

    Time series

    The time series forming the core of this dataset are given in CSV format. Each CSV file is a table with 8736 rows, one for each hourly time step of a 364-day year. All years are truncated to exactly 52 weeks of 7 days, and start on a Monday (the load profiles are typically different during weekdays and weekends). The number of columns depends on the type of table: there are 4097 columns in load files, 815 for generators, and 8375 for lines (including transformers). Each column is described by a header corresponding to the element identifier in the network file. All values are given in per-unit, both in the model file and in the tables, i.e. they are multiples of a base unit taken to be 100 MW.
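    To make the table layout concrete, the sketch below converts per-unit values to MW and tags each hourly row with its day of the week. It uses a small random table in place of a real load file; the column names are placeholders, and the assumption that row 0 is the first hour of the Monday is ours (the description only says that each year starts on a Monday).

```python
import numpy as np
import pandas as pd

BASE_MW = 100               # per-unit base stated in the description
HOURS_PER_YEAR = 24 * 364   # 8736 hourly rows per table

# Synthetic stand-in for a load table: 3 hypothetical columns instead of 4097.
rng = np.random.default_rng(0)
loads_pu = pd.DataFrame(rng.random((HOURS_PER_YEAR, 3)), columns=["1", "2", "3"])

# Per-unit values are multiples of the 100 MW base.
loads_mw = loads_pu * BASE_MW

# Day of week for every row, with 0 = Monday (assumes row 0 is Monday, hour 0).
hour_of_year = np.arange(HOURS_PER_YEAR)
weekday = (hour_of_year // 24) % 7
is_weekend = weekday >= 5

weekday_mean_mw = loads_mw[~is_weekend].mean()
weekend_mean_mw = loads_mw[is_weekend].mean()
print(weekday_mean_mw, weekend_mean_mw, sep="\n")
```

    Because every year is truncated to exactly 52 weeks, the weekday pattern repeats without any offset from one table to the next.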

    There are 20 tables of each type, labeled with a reference year (2016 to 2020) and an index (1 to 4), zipped into archive files arranged by year. This amounts to a total of 20 years of synthetic data. When using load, generator, and line profiles together, it is important to use the same label: for instance, the files loads_2020_1.csv, gens_2020_1.csv, and lines_2020_1.csv represent the same year of the dataset, whereas gens_2020_2.csv is unrelated (it actually shares some features, such as nuclear profiles, but it is based on a dispatch with distinct loads).
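    One simple way to avoid mixing labels is to build all three file names from a single (year, index) pair. The helper below is a hypothetical convenience, not part of the dataset's own code; it only relies on the file naming pattern shown above.

```python
from itertools import product

YEARS = range(2016, 2021)   # reference years 2016 to 2020
INDICES = range(1, 5)       # indices 1 to 4

def matching_files(year, index):
    """Return the loads, gens and lines file names that share one label."""
    label = f"{year}_{index}"
    return {kind: f"{kind}_{label}.csv" for kind in ("loads", "gens", "lines")}

# All 20 labels, one per synthetic year.
all_labels = [f"{year}_{index}" for year, index in product(YEARS, INDICES)]

print(len(all_labels))
print(matching_files(2020, 1))
```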

    Usage

    The time series can be used without reference to the network file, simply by using all or a selection of the columns of the CSV files, depending on the needs. We show below how to select series from a particular country, or how to aggregate hourly time steps into days or weeks. These examples use Python and the data analysis library pandas, but other frameworks can be used as well (Matlab, Julia). Since all the yearly time series are periodic, it is always possible to define a coherent time window modulo the length of the series.
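    The remark about periodicity can be illustrated with a window that wraps around the end of the year. This is a minimal sketch on a synthetic single-column table; the function name periodic_window and the column name are ours, not part of the dataset's code.

```python
import numpy as np
import pandas as pd

def periodic_window(df, start, length):
    """Return `length` consecutive rows starting at `start`, wrapping around
    the end of the table (row positions are taken modulo the table length)."""
    positions = (start + np.arange(length)) % len(df)
    return df.iloc[positions].reset_index(drop=True)

# Synthetic stand-in: one column whose value equals the hourly row number.
hours_per_year = 24 * 364
series = pd.DataFrame({"load": np.arange(hours_per_year)})

# A 48-hour window that begins on the last day and continues into the first day.
window = periodic_window(series, start=hours_per_year - 24, length=48)
print(window["load"].iloc[[0, 23, 24, 47]].to_list())
```

    The same function works for any start position, so a window can begin in any week of the year without special handling at the boundary.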

    Selecting a particular country

    This example illustrates how to select generation data for Switzerland in Python. This can be done without parsing the network file, but using instead gens_by_country.csv, which contains a list of all generators for any country in the network. We start by importing the pandas library, and read the column of the file corresponding to Switzerland (country code CH):

    import pandas as pd
    CH_gens = pd.read_csv('gens_by_country.csv', usecols=['CH'], dtype=str)

    The object created in this way is a DataFrame with some null values (not all countries have the same number of generators). It can be turned into a list with:

    CH_gens_list = CH_gens.dropna().squeeze().to_list()

    Finally, we can import all the time series of Swiss generators from a given data table with

    pd.read_csv('gens_2016_1.csv', usecols=CH_gens_list)

    The same procedure can be applied to loads using the list contained in the file loads_by_country.csv.

    Averaging over time

    This second example shows how to change the time resolution of the series. Suppose that we are interested in all the loads from a given table, which are given by default with a one-hour resolution:

    hourly_loads = pd.read_csv('loads_2018_3.csv')

    To get a daily average of the loads, we can use:

    daily_loads = hourly_loads.groupby([t // 24 for t in range(24 * 364)]).mean()

    This results in series of length 364. To average further over entire weeks and get series of length 52, we use:

    weekly_loads = hourly_loads.groupby([t // (24 * 7) for t in range(24 * 364)]).mean()
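
    Because every table starts on a Monday, the same grouping idea can also separate weekdays from weekends. The sketch below is one possible way to do this, tested here on synthetic data rather than on a real table; the function name and the demo column are hypothetical.

```python
import numpy as np
import pandas as pd

HOURS = 24 * 364  # 8736 hourly steps per table

def weekday_weekend_mean(hourly: pd.DataFrame) -> pd.DataFrame:
    """Average each column separately over weekdays and weekends."""
    day_of_week = (np.arange(len(hourly)) // 24) % 7  # 0 = Monday
    labels = np.where(day_of_week >= 5, "weekend", "weekday")
    return hourly.groupby(labels).mean()

# Synthetic series: 1.0 on weekdays, 0.0 on weekends.
day = (np.arange(HOURS) // 24) % 7
synthetic = pd.DataFrame({"demo": np.where(day >= 5, 0.0, 1.0)})
split = weekday_weekend_mean(synthetic)
```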

    Source code

    The code used to generate the dataset is freely available at https://github.com/GeeeHesso/PowerData. It consists of two packages and several documentation notebooks. The first package, written in Python, provides functions to handle the data and to generate synthetic series based on historical data. The second package, written in Julia, is used to perform the optimal power flow. The documentation, in the form of Jupyter notebooks, contains numerous examples of how to use both packages. The entire workflow used to create this dataset is also provided, starting from raw ENTSO-E data files and ending with the synthetic dataset given in the repository.

    Funding

    This work was supported by the Cyber-Defence Campus of armasuisse and by an internal research grant of the Engineering and Architecture domain of HES-SO.

  16. g

    RED – The Relational Export Dataset

    • search.gesis.org
    Updated Apr 1, 2022
    Cite
    Lischka, Michael; Besche-Truthe, Fabian (2022). RED – The Relational Export Dataset [Dataset]. http://doi.org/10.7802/2578
    Dataset updated
    Apr 1, 2022
    Dataset provided by
    GESIS search
    Globale Entwicklungsdynamiken von Sozialpolitik (SFB 1342)
    Authors
    Lischka, Michael; Besche-Truthe, Fabian
    License

    https://www.gesis.org/en/institute/data-usage-terms

    Description

    The Relational Export Dataset “RED” provides comparable dyadic trade data between nation-states for the period 1870 - present. This dataset is built in accordance with the analytical focus of the DFG-funded "Collaborative Research Centre 1342 - Global Dynamics of Social Policy" (CRC 1342). In principle, this large-scale project follows an interdependence-centered approach to explain the diffusion of governmental social policies from 1880 to the present. Trade linkages are an explanatory variable in this respect (Windzio et al., 2022). This requires temporally consistent data on interstate linkages for the largest possible sample of countries. So far, there has been no data set that meets these requirements. We, therefore, introduce a dataset which combines trade data from UN Comtrade (Comtrade, 2022), UNCTAD (UNCTAD, 2021), and the Correlates of War (COW) Project (Barbieri and Keshk, 2016). Unlike most databases, the data here does not represent absolute monetary trade volumes in a given currency. Rather, the data depicts the ratio of trade flows between two countries and the total exports of the specific exporting country. Hence, we measure trade in relational terms weighted by the respective importance of trading partners for one another. These relations are estimated from both an export and an import-oriented point of view; in this technical description, however, we focus on the ratios estimated solely with export values.
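
    The relational measure described above, each bilateral flow divided by the exporter's total exports, can be sketched as below. This is only an illustration of the ratio, not the authors' estimation procedure; the country labels, column names, and values are hypothetical.

```python
import pandas as pd

# Hypothetical bilateral export flows in arbitrary monetary units.
flows = pd.DataFrame({
    "exporter": ["A", "A", "B"],
    "importer": ["B", "C", "A"],
    "value": [30.0, 10.0, 5.0],
})

# Total exports of each exporting country, aligned row by row.
total_exports = flows.groupby("exporter")["value"].transform("sum")

# Share of each partner in the exporter's total exports.
flows["export_share"] = flows["value"] / total_exports
```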

  17. S

    A dataset of China’s overseas power projects (2000–2019)

    • scidb.cn
    Updated Sep 23, 2019
    Cite
    蒋瑜; 邬明权; 黄长军; 牛铮 (2019). A dataset of China’s overseas power projects (2000–2019) [Dataset]. http://doi.org/10.11922/sciencedb.893
    Dataset updated
    Sep 23, 2019
    Dataset provided by
    Science Data Bank
    Authors
    蒋瑜; 邬明权; 黄长军; 牛铮
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    China
    Description

    Developing countries in the Belt and Road region face power shortages. Since the Belt and Road initiative was put forward, Chinese companies have invested in and built a large number of electrical power projects in countries and areas with power shortages in this region. Because the projects are numerous and widely distributed, and new power projects keep being added, a large amount of project information has been generated. Accordingly, it is urgent to collect and summarize information on Belt and Road overseas power projects. In this study, web crawler technology was used to obtain overseas power project information. A Belt and Road dataset of overseas power projects was formed by further supplementing and improving the project information using documents from ministries, embassies, counselors of the ministry of economy and commerce, local news reports in Chinese and English, and case studies and field studies conducted by scholars and non-governmental organizations. The dataset includes information on 376 power projects from 80 countries in Asia, Africa, Europe, America, and Oceania. Each project’s information includes the project number, project name, construction status, enterprise name, installed capacity, continent, country, project category, and bid information. The collection and improvement of this dataset will help with understanding the distribution of Belt and Road overseas power projects, as well as development trends in overseas power investment and construction in recent years. This can provide a basis for China’s power companies to “go global” along the Belt and Road. It also provides a reference for power project development planning and government department decision-making.

  18. r

    QoG Basic Dataset - Time-Series Data

    • researchdata.se
    • demo.researchdata.se
    Updated Aug 6, 2024
    + more versions
    Cite
    Stefan Dahlberg; Aksel Sundström; Sören Holmberg; Bo Rothstein; Natalia Alvarado Pachon; Cem Mert Dalli (2024). QoG Basic Dataset - Time-Series Data [Dataset]. http://doi.org/10.18157/qogbasjan22
    Dataset updated
    Aug 6, 2024
    Dataset provided by
    University of Gothenburg
    Authors
    Stefan Dahlberg; Aksel Sundström; Sören Holmberg; Bo Rothstein; Natalia Alvarado Pachon; Cem Mert Dalli
    Time period covered
    1946 - 2021
    Description

    The QoG Institute is an independent research institute within the Department of Political Science at the University of Gothenburg. Overall 30 researchers conduct and promote research on the causes, consequences and nature of Good Governance and the Quality of Government - that is, trustworthy, reliable, impartial, uncorrupted and competent government institutions.

    The main objective of our research is to address the theoretical and empirical problem of how political institutions of high quality can be created and maintained. A second objective is to study the effects of Quality of Government on a number of policy areas, such as health, the environment, social policy, and poverty.

    QoG Basic Dataset, which consists of approximately the 300 most used variables from QoG Standard Dataset, is a selection of variables that cover the most important concepts related to Quality of Government.

    In the QoG Basic CS dataset, data from and around 2018 is included. Data from 2018 is prioritized; however, if no data is available for a country for 2018, data for 2019 is included. If no data exists for 2019, data for 2017 is included, and so on up to a maximum of +/- 3 years.
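
    The year-selection rule above can be sketched as a simple fallback search. The order 2018, 2019, 2017 is taken from the description; continuing the alternation as 2020, 2016, 2021, 2015 is an assumption about what "and so on" means. The function names and sample values are hypothetical.

```python
def candidate_years(target=2018, max_offset=3):
    """List years in fallback order: target, then +1, -1, +2, -2, ..."""
    years = [target]
    for offset in range(1, max_offset + 1):
        # Assumed alternation: later year first, then earlier year.
        years += [target + offset, target - offset]
    return years

def pick_value(values_by_year, target=2018, max_offset=3):
    """Return (year, value) for the first available year, or None."""
    for year in candidate_years(target, max_offset):
        value = values_by_year.get(year)
        if value is not None:
            return year, value
    return None

# Hypothetical indicator with no value for 2018 or 2019.
example = pick_value({2017: 0.4, 2020: 0.9})
```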

    In the QoG Basic TS dataset, data from 1946 to 2021 is included and the unit of analysis is country-year (e.g., Sweden-1946, Sweden-1947, etc.).

    Purpose:

    The primary aim of QoG is to conduct and promote research on corruption. One aim of the QoG Institute is to make publicly available cross-national comparative data on QoG and its correlates.

    Historical countries are in most cases denoted with a to-date (e.g. Ethiopia (-1992)) and a from-date (Ethiopia (1993-)).

  19. G

    LSIB 2017: Large Scale International Boundary Polygons, Detailed

    • developers.google.com
    Updated Dec 29, 2017
    + more versions
    Cite
    United States Department of State, Office of the Geographer (2017). LSIB 2017: Large Scale International Boundary Polygons, Detailed [Dataset]. https://developers.google.com/earth-engine/datasets/catalog/USDOS_LSIB_2017
    Dataset updated
    Dec 29, 2017
    Dataset provided by
    United States Department of State, Office of the Geographer
    Time period covered
    Dec 29, 2017
    Area covered
    Earth
    Description

    The United States Office of the Geographer provides the Large Scale International Boundary (LSIB) dataset. It is derived from two other datasets: an LSIB line vector file and the World Vector Shorelines (WVS) from the National Geospatial-Intelligence Agency (NGA). The interior boundaries reflect U.S. government policies on boundaries, boundary disputes, and sovereignty. The exterior boundaries are derived from the WVS; however, the WVS coastline data is outdated and generally shifted by anywhere from several hundred meters to over a kilometer. Each feature is the polygonal area enclosed by interior boundaries and exterior coastlines where applicable, and many countries consist of multiple features, one per disjoint region. Each of the 180,741 features is a part of the geometry of one of the 284 countries described in this dataset.

  20. e

    Social assistance in low and middle income countries 2000-2015 - Dataset -...

    • b2find.eudat.eu
    Updated Mar 1, 2016
    Cite
    (2016). Social assistance in low and middle income countries 2000-2015 - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/74f2dfa6-c2a5-5d99-a157-e32ec11591af
    Dataset updated
    Mar 1, 2016
    Description

    The social assistance explorer contains a harmonised panel dataset of social assistance indicators spanning 2000-2015. It has been developed to support comparative research on emerging welfare institutions. Comparative analysis of social protection institutions in low and middle income countries is scarce. Yet social assistance accounts for most of the recent expansion of welfare institutions. The project collected data on programme design and objectives, institutionalisation, reach, and financial resources. Key indicators can be aggregated at country and region levels.

    Since the turn of the century low and middle income countries have introduced or expanded programmes providing direct transfers to families in poverty or extreme poverty as a means of strengthening their capacity to exit poverty. The rationale underpinning these programmes is that stabilising and enhancing family income through transfers in cash and in kind will enable programme participants to improve their nutrition, ensure investment in children's schooling and health, and help overcome economic and social exclusion. The expansion of antipoverty transfer programmes has accelerated. Estimates suggest that around 1 billion people in developing countries reside with someone in receipt of a transfer. As would be expected, the spread of social assistance has been slower and more tentative in low income countries due to implementation and finance constraints and limited elite political support. Antipoverty transfer programmes in developing countries show large variation in design, effectiveness, scale, and objectives. In most countries, there are several interventions running alongside one another with diverse priorities and designs, and often targeting different groups. In many countries social public assistance programmes work alongside social insurance programmes for formal sector workers and humanitarian or emergency assistance.
Social assistance focuses on groups in poverty, provides medium term support, and is budget-financed. The spread of social assistance in developing countries has revealed significant gaps in the knowledge, for example as regards their effectiveness, reach, and sustainability. Comparative analysis is essential to fill in these gaps and improve national, regional and global policy. For example, achieving a zero target for extreme poverty, as has been suggested in the context of the post-2015 international development agenda, would require effective and permanent institutions ensuring the benefits from economic growth reach the poorest. Social assistance is essential to achieving this goal. This research project focuses on improving research infrastructure on social assistance, in terms of concepts, indicators and data. This is urgently needed to support comparative analysis of emerging social assistance institutions. The project will identify indicators to assess social assistance programmes and will collect information on these for 2000 to 2015 for all developing countries. The database will be made available online to researchers and policy makers globally. As part of the project, the database will be analysed to examine patterns or configurations in social assistance programmes and institutions. Our interest is in identifying ideal types, broad features of social assistance programmes or institutions which enable reducing the large diversity of programmes and interventions to their core characteristics. These ideal types are social assistance regimes. Further analysis will test for potential combinations of political, demographic, economic and social factors linked to specific social assistance regimes. 
This analysis will allow us to examine what conditions can help explain the expansion of social assistance in developing countries; what factors influence the specific configuration of social assistance institutions in different countries and regions; and what conditions are needed for their effectiveness and sustainability. This research will throw light on the contribution of social assistance to the reduction of poverty and vulnerability and to economic and social development. The data collection included all countries defined as low and middle income in the 2016 version of the World Bank Country Classification. An inventory of potential social assistance programmes was developed for each country. The definition described above was then applied to identify social assistance programmes. For some countries with a large number of small or localised programmes, the data collection focused on nationwide, large-scale, and/or leading programmes. For example, some states in India have localised programmes. These were excluded from the data collection. In sub-Saharan Africa some programmes are very small in scale but they are significant in leading the expansion of social assistance. They were included. Where programmes consolidate pre-existing programmes, for example Brazil's Bolsa Família, the dataset includes Bolsa Família as well as its component programmes. Data were collected from a variety of sources: global and regional datasets (ASPIRE, ODI, CEPAL, ADB's SPI, IPC-PG); national government websites; programme agency reports; research papers; evaluation reports; policy documents; IFIs project documentation and reports; personal communication with programme agencies. The collection of the data was organised around a codebook, describing each of the variables and the specific coding of the information. The codebook was constructed after extensive consultation with specialist researchers. The codebook is available from the data webpage in the website. 
Specialist consultants supported data collection in hard-to-reach areas. The data collected were checked against alternative sources of information where available.
