91 datasets found
  1. List of all countries with their 2 digit codes (ISO 3166-1)

    • datahub.io
    Updated Aug 29, 2017
    Cite
    (2017). List of all countries with their 2 digit codes (ISO 3166-1) [Dataset]. https://datahub.io/core/country-list
    Dataset updated
    Aug 29, 2017
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    ISO 3166-1-alpha-2 English country names and code elements. This list states the country names (official short names in English) in alphabetical order as given in ISO 3166-1 and the corresponding ISO 3166-1-alpha-2 code elements.
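
As a rough illustration of how a name-to-code list like this might be consumed, the sketch below parses a few rows and checks that every code has the alpha-2 shape (two uppercase letters). The `Name,Code` header, the inline sample and the function name `load_country_codes` are assumptions made for the example; the actual file layout is not described on this page.

```python
import csv
import io
import re

# Hypothetical sample; the real file's header and contents may differ.
SAMPLE_CSV = """Name,Code
Afghanistan,AF
Albania,AL
Algeria,DZ
"""

# An alpha-2 code element is exactly two uppercase ASCII letters.
ALPHA2 = re.compile(r"[A-Z]{2}")


def load_country_codes(text):
    """Return a {country name: alpha-2 code} dict, rejecting malformed codes."""
    codes = {}
    for row in csv.DictReader(io.StringIO(text)):
        code = row["Code"].strip()
        if not ALPHA2.fullmatch(code):
            raise ValueError(f"not an alpha-2 code: {code!r}")
        codes[row["Name"].strip()] = code
    return codes


codes = load_country_codes(SAMPLE_CSV)
```

The shape check only confirms that a value looks like an alpha-2 code; it does not confirm that the code is actually assigned in ISO 3166-1.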

  2. List_of_countries_by_population_in_1800

    • kaggle.com
    zip
    Updated Jul 17, 2020
    + more versions
    Cite
    Mathurin Aché (2020). List_of_countries_by_population_in_1800 [Dataset]. https://www.kaggle.com/datasets/mathurinache/list-of-countries-by-population-in-1800
    Available download formats: zip (355 bytes)
    Dataset updated
    Jul 17, 2020
    Authors
    Mathurin Aché
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset is extracted from https://en.wikipedia.org/wiki/List_of_countries_by_population_in_1800. Context: There's a story behind every dataset and here's your opportunity to share yours. Content: What's inside is more than just rows and columns. Make it easy for others to get started by describing how you acquired the data and what time period it represents, too. Acknowledgements: We wouldn't be here without the help of others. If you owe any attributions or thanks, include them here along with any citations of past research. Inspiration: Your data will be in front of the world's largest data science community. What questions do you want to see answered?

  3. Countries and territories Named Authority List

    • data.europa.eu
    rdf xml, xml, zip
    Updated Dec 3, 2021
    + more versions
    Cite
    Publications Office of the European Union (2021). Countries and territories Named Authority List [Dataset]. https://data.europa.eu/data/datasets/country?locale=en
    Available download formats: xml, rdf xml, zip
    Dataset updated
    Dec 3, 2021
    Dataset provided by
    Publications Office of the European Union (http://op.europa.eu/)
    European Union
    Authors
    Publications Office of the European Union
    License

    http://data.europa.eu/eli/dec/2011/833/oj

    Description

    Countries and territories is a controlled vocabulary that lists concepts associated with names of countries and territories. It is a corporate reference data asset covered by the Corporate Reference Data Management policy of the European Commission. It provides codes and names of geospatial and geopolitical entities in all official EU languages and is the result of a combination of multiple relevant standards, created to serve the requirements and use cases of the EU institutions services. Its main scope is to support documentary metadata activities. The codes of the concepts included are correlated with the ISO 3166 international standard. The authority code relies where possible on the ISO 3166-1 alpha-3 code. Additional user-assigned alpha-3 codes have been used to cover entities that are not included in the ISO 3166-1 standard. The corporate list contains mappings with the ISO 3166-1 two-letter codes, the Interinstitutional Style Guide codes and with other internal and external identifiers including ISO 3166-1 numeric, ISO 3166-3, UNSD M49, UNSD Geoscheme, IBAN, TIR, IANA domain. For the names of countries and territories, the corporate list synchronises with the Interinstitutional Style Guide (ISG, Section 7.1 and Annexes A5 and A6) and with the IATE terminology database. Membership and classification properties provide possibilities to group concepts, e.g., UN, EU, EEA, EFTA, Schengen area, Euro area, NATO, OECD, UCPM, ENP-EAST, ENP-SOUTH, EU candidate countries and potential candidates. Countries and territories is maintained by the Publications Office of the European Union and disseminated on the EU Vocabularies website. Regular updates are foreseen based on its stakeholders’ needs. Downloads in human-readable formats (.csv, .html) are also available.

  4. Country Codes

    • public.opendatasoft.com
    • data.smartidf.services
    • +6 more
    csv, excel, geojson +1
    Updated Aug 25, 2015
    Cite
    (2015). Country Codes [Dataset]. https://public.opendatasoft.com/explore/dataset/countries-codes/
    Available download formats: geojson, json, excel, csv
    Dataset updated
    Aug 25, 2015
    License

    https://en.wikipedia.org/wiki/Public_domain

    Description

    Country codes: ISO 2, ISO 3, UN, LANG, LABEL (EN, FR, SP)

  5. CY-Bench: A comprehensive benchmark dataset for subnational crop yield...

    • zenodo.org
    • explore.openaire.eu
    zip
    Updated Sep 25, 2024
    Cite
    Dilli Paudel; Hilmy Baja; Ron van Bree; Michiel Kallenberg; Stella Ofori-Ampofo; Aike Potze; Pratishtha Poudel; Abdelrahman Saleh; Weston Anderson; Malte von Bloh; Andres Castellano; Oumnia Ennaji; Raed Hamed; Rahel Laudien; Donghoon Lee; Inti Luna; Dainius Masiliūnas; Michele Meroni; Janet Mumo Mutuku; Siyabusa Mkuhlani; Jonathan Richetti; Alex C. Ruane; Ritvik Sahajpal; Guanyuan Shuai; Vasileios Sitokonstantinou; Rogerio de Souza Noia Junior; Amit Kumar Srivastava; Robert Strong; Lily-belle Sweet; Petar Vojnović; Allard de Wit; Maximilian Zachow; Ioannis N. Athanasiadis (2024). CY-Bench: A comprehensive benchmark dataset for subnational crop yield forecasting [Dataset]. http://doi.org/10.5281/zenodo.13798797
    Available download formats: zip
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    AgML (https://www.agml.org/)
    Authors
    Dilli Paudel; Hilmy Baja; Ron van Bree; Michiel Kallenberg; Stella Ofori-Ampofo; Aike Potze; Pratishtha Poudel; Abdelrahman Saleh; Weston Anderson; Malte von Bloh; Andres Castellano; Oumnia Ennaji; Raed Hamed; Rahel Laudien; Donghoon Lee; Inti Luna; Dainius Masiliūnas; Michele Meroni; Janet Mumo Mutuku; Siyabusa Mkuhlani; Jonathan Richetti; Alex C. Ruane; Ritvik Sahajpal; Guanyuan Shuai; Vasileios Sitokonstantinou; Rogerio de Souza Noia Junior; Amit Kumar Srivastava; Robert Strong; Lily-belle Sweet; Petar Vojnović; Allard de Wit; Maximilian Zachow; Ioannis N. Athanasiadis
    License

    https://joinup.ec.europa.eu/page/eupl-text-11-12

    Description

    CY-Bench: A comprehensive benchmark dataset for sub-national crop yield forecasting


    Overview

    CY-Bench is a dataset and benchmark for subnational crop yield forecasting, with coverage of major crop growing countries of the world for maize and wheat. By subnational, we mean the administrative level where yield statistics are published. When statistics are available for multiple levels, we pick the highest resolution. The dataset combines sub-national yield statistics with relevant predictors, such as growing-season weather indicators, remote sensing indicators, evapotranspiration, soil moisture indicators, and static soil properties. CY-Bench has been designed and curated by agricultural experts, climate scientists, and machine learning researchers from the AgML Community, with the aim of facilitating model intercomparison across the diverse agricultural systems around the globe in conditions as close as possible to real-world operationalization. Ultimately, by lowering the barrier to entry for ML researchers in this crucial application area, CY-Bench will facilitate the development of improved crop forecasting tools that can be used to support decision-makers in food security planning worldwide.

    * Crops : Wheat & Maize
    * Spatial Coverage : Wheat (29 countries), Maize (38).
    See CY-Bench paper appendix for the list of countries.
    * Temporal Coverage : Varies. See country-specific data

    Data

    Data format


    The benchmark data is organized as a collection of CSV files, with each file representing a specific category of variable for a particular country. Each CSV file is named according to the category and the country it pertains to, facilitating easy identification and retrieval. The data within each CSV file is structured in tabular format, where rows represent observations and columns represent different predictors related to a category of variable.

    Data content

    All data files are provided as .csv.

    | Data | Description | Variables (units) | Temporal Resolution | Data Source (Reference) |
    |---|---|---|---|---|
    | crop_calendar | Start and end of growing season | sos (day of the year), eos (day of the year) | Static | World Cereal (Franch et al, 2022) |
    | fpar | fraction of absorbed photosynthetically active radiation | fpar (%) | Dekadal (3 times a month; 1-10, 11-20, 21-31) | European Commission's Joint Research Centre (EC-JRC, 2024) |
    | ndvi | normalized difference vegetation index | - | approximately weekly | MOD09CMG (Vermote, 2015) |
    | meteo | temperature, precipitation (prec), radiation, potential evapotranspiration (et0), climatic water balance (= prec - et0) | tmin (C), tmax (C), tavg (C), prec (mm), et0 (mm), cwb (mm), rad (J m-2 day-1) | daily | AgERA5 (Boogaard et al, 2022), FAO-AQUASTAT for et0 (FAO-AQUASTAT, 2024) |
    | soil_moisture | surface soil moisture, rootzone soil moisture | ssm (kg m-2), rsm (kg m-2) | daily | GLDAS (Rodell et al, 2004) |
    | soil | available water capacity, bulk density, drainage class | awc (c m-1), bulk_density (kg dm-3), drainage class (category) | static | WISE Soil database (Batjes, 2016) |
    | yield | end-of-season yield | yield (t ha-1) | yearly | Various country or region specific sources (see crop_statistics_... in https://github.com/BigDataWUR/AgML-CY-Bench/tree/main/data_preparation) |
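
Two of the definitions in the data content listing are simple enough to sketch in code: the dekadal split of a month (days 1-10, 11-20, 21-31) and the climatic water balance (prec - et0). The function names below are hypothetical and this is only an illustration of the stated definitions, not the code used to build CY-Bench.

```python
from datetime import date


def dekad_of_month(d):
    """Return 1, 2 or 3 for days 1-10, 11-20 and 21 to end of month."""
    # Integer division puts day 31 in a fourth bucket, so cap at 3.
    return min((d.day - 1) // 10 + 1, 3)


def climatic_water_balance(prec_mm, et0_mm):
    """cwb = prec - et0, with both inputs and the result in mm."""
    return prec_mm - et0_mm


first = dekad_of_month(date(2020, 1, 10))
second = dekad_of_month(date(2020, 1, 11))
third = dekad_of_month(date(2020, 1, 31))
cwb = climatic_water_balance(5.0, 2.0)
```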

    Folder structure


    The CY-Bench dataset is structured first by crop type and then by country. For each country, the folder name is the ISO 3166-1 alpha-2 two-character code. A separate .csv file is available for each predictor dataset and for the crop calendar, as shown below. The csv files are named to reflect the variable, crop type and country, e.g. **variable_croptype_country.csv**.
    ```
    CY-Bench

    └─── maize
    │ │
    │ └─── AO
    │ │ -- crop_calendar_maize_AO.csv
    │ │ -- fpar_maize_AO.csv
    │ │ -- meteo_maize_AO.csv
    │ │ -- ndvi_maize_AO.csv
    │ │ -- soil_maize_AO.csv
    │ │ -- soil_moisture_maize_AO.csv
    │ │ -- yield_maize_AO.csv
    │ │
    │ └─── AR
    │ -- crop_calendar_maize_AR.csv
    │ -- fpar_maize_AR.csv
    │ -- ...

    └─── wheat
    │ │
    │ └─── AR
    │ │ -- crop_calendar_wheat_AR.csv
    │ │ -- fpar_wheat_AR.csv
    │ │ ...
    ```
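
The naming convention and folder layout above can be turned into a small path helper. This is a minimal sketch: the function name `cybench_csv_path`, the root folder argument and the validation sets are assumptions for illustration, derived only from the variable and crop names listed in this description.

```python
from pathlib import PurePosixPath

# Variable and crop names taken from the description above.
VARIABLES = {"crop_calendar", "fpar", "meteo", "ndvi",
             "soil", "soil_moisture", "yield"}
CROPS = {"maize", "wheat"}


def cybench_csv_path(root, variable, crop, country):
    """Build <root>/<crop>/<COUNTRY>/<variable>_<crop>_<COUNTRY>.csv."""
    if variable not in VARIABLES:
        raise ValueError(f"unknown variable: {variable!r}")
    if crop not in CROPS:
        raise ValueError(f"unknown crop: {crop!r}")
    country = country.upper()  # folders use uppercase alpha-2 codes
    return PurePosixPath(root) / crop / country / f"{variable}_{crop}_{country}.csv"


path = cybench_csv_path("CY-Bench", "fpar", "maize", "ao")
```

`PurePosixPath` is used so the result is the same on every operating system; swap in `pathlib.Path` when the path should be opened on the local file system.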

    Example : CSV data content for maize in country X

    ```
    X
    └─── crop_calendar_maize_X.csv
    │ -- crop_name (name of the crop)
    │ -- adm_id (unique identifier for a subnational unit)
    │ -- sos (start of crop season)
    │ -- eos (end of crop season)

    └─── fpar_maize_X.csv
    │ -- crop_name
    │ -- adm_id
    │ -- date (in the format YYYYMMdd)
    │ -- fpar

    └─── meteo_maize_X.csv
    │ -- crop_name
    │ -- adm_id
    │ -- date (in the format YYYYMMdd)

    │ -- tmin (minimum temperature)
    │ -- tmax (maximum temperature)
    │ -- prec (precipitation)
    │ -- rad (radiation)
    │ -- tavg (average temperature)
    │ -- et0 (potential evapotranspiration)
    │ -- cwb (climatic water balance)

    └─── ndvi_maize_X.csv
    │ -- crop_name
    │ -- adm_id
    │ -- date (in the format YYYYMMdd)
    │ -- ndvi

    └─── soil_maize_X.csv
    │ -- crop_name
    │ -- adm_id
    │ -- awc (available water capacity)
    │ -- bulk_density
    │ -- drainage_class

    └─── soil_moisture_maize_X.csv
    │ -- crop_name
    │ -- adm_id
    │ -- date (in the format YYYYMMdd)
    │ -- ssm (surface soil moisture)
    │ -- rsm (rootzone soil moisture)

    └─── yield_maize_X.csv
    │ -- crop_name
    │ -- country_code
    │ -- adm_id
    │ -- harvest_year
    │ -- yield
    │ -- harvest_area
    │ -- production
    ```

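
The time-series files listed above store dates as YYYYMMdd strings. The sketch below shows one way such a file might be read with the standard library; the sample rows, the `adm_id` value and the function name `read_dated_rows` are invented for illustration and do not come from the dataset.

```python
import csv
import io
from datetime import date, datetime

# Hypothetical rows following the fpar column layout described above.
SAMPLE = """crop_name,adm_id,date,fpar
maize,X-01,20200101,12.5
maize,X-01,20200111,14.0
"""


def read_dated_rows(text, value_column):
    """Parse CSV text, converting 'date' to datetime.date and one column to float."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rec["date"] = datetime.strptime(rec["date"], "%Y%m%d").date()
        rec[value_column] = float(rec[value_column])
        rows.append(rec)
    return rows


rows = read_dated_rows(SAMPLE, "fpar")
```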
    Data access

    The full dataset can be downloaded directly from Zenodo or using the `zenodo_get` library.


    License and citation


    We kindly ask all users of CY-Bench to properly respect licensing and citation conditions of the datasets included.

  6. GDP by Country Dataset

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jun 29, 2011
    + more versions
    Cite
    TRADING ECONOMICS (2011). GDP by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/gdp
    Available download formats: csv, json, xml, excel
    Dataset updated
    Jun 29, 2011
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    World
    Description

    This dataset provides values for GDP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  7. COVID-19 useful features by country

    • kaggle.com
    Updated May 3, 2020
    Cite
    Ouassim Adnane (2020). COVID-19 useful features by country [Dataset]. https://www.kaggle.com/ishivinal/covid19-useful-features-by-country/activity
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    May 3, 2020
    Dataset provided by
    Kaggle
    Authors
    Ouassim Adnane
    License

    http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html

    Description

    Context

    This dataset uses country names created to match the COVID19 Global Forecasting (Week 4) challenge. If you found this helpful, an upvote would be very much appreciated. Let me know if you find any mistakes, so I can correct them.

    Content

    The dataset consists of one main CSV file, Countries_usefulFeatures.csv, which contains 12 columns; see the descriptions below for more detailed information.

    Column Description

    1. Country_Region: Name of the country
    2. Population_Size: the population size 2018 stats
    3. Tourism: International tourism, number of arrivals 2018
    4. Date_FirstFatality: Date of the first Fatality of the COVID-19
    5. Date_FirstConfirmedCase: Date of the first confirmed case of the COVID-19
    6. Latitude
    7. Longitude
    8. Mean_Age: mean age of the population 2018 stats
    9. Lockdown_Date: date of the lockdown
    10. Lockdown_Type: type of the lockdown
    11. Country_Code: 3 digit country code

    Acknowledgements

    Data is collected from :

  8. Luxembourgish Country Border - 5k Coordinates

    • data.public.lu
    csv
    Updated May 8, 2023
    Cite
    Pit Schneider (2023). Luxembourgish Country Border - 5k Coordinates [Dataset]. https://data.public.lu/en/datasets/luxembourgish-country-border-5k-coordinates/
    Available download formats: csv (110022)
    Dataset updated
    May 8, 2023
    Dataset authored and provided by
    Pit Schneider
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Luxembourgish country border expressed as a CSV list of 5000 coordinates: First list entry contains northmost coordinates. Last list entry (row 5001) is identical to first entry. List sequence follows border in a clockwise way. All coordinates have a precision of seven decimal digits. Data was manually derived from Apple Maps, thus not representing legal/official border data.
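
The properties stated in this description (closed ring, clockwise order, northernmost starting point) can be checked with the shoelace formula. The sketch below assumes each row is a `(latitude, longitude)` pair, which the description does not confirm, and it treats the coordinates as planar, which is only an approximation but is enough to determine orientation for an area this small. The toy square stands in for the real 5001-row file.

```python
def is_closed(ring):
    """A ring is closed when its last point repeats its first."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def signed_area(ring):
    """Shoelace formula with x = longitude, y = latitude.

    Positive means counter-clockwise, negative means clockwise.
    """
    total = 0.0
    for (lat1, lon1), (lat2, lon2) in zip(ring, ring[1:]):
        total += lon1 * lat2 - lon2 * lat1
    return total / 2.0


def is_clockwise(ring):
    return signed_area(ring) < 0


def starts_at_northmost(ring):
    """True when no point lies further north than the first one."""
    return ring[0][0] == max(lat for lat, _ in ring)


# Toy square as (lat, lon), walked clockwise from its north-west corner.
square = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)]
checks = (is_closed(square), is_clockwise(square), starts_at_northmost(square))
```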

  9. country_list.csv and country_period_validation.csv files used in the...

    • figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Guy Abel (2023). country_list.csv and country_period_validation.csv files used in the bilateral international migration flow estimates by sex [Dataset]. http://doi.org/10.6084/m9.figshare.18737768.v4
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    figshare
    Authors
    Guy Abel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supplementary data files for Abel & Cohen (2022), including:
    1. country_list.csv with country codes, country names, first and last period covered and availability of reported data used for validation exercise.
    2. country_period_validation.csv with types and sources of reported migration statistics for each country and period in each of the collections used for the validation exercise.

  10. Population figures for countries, regions (e.g. Asia) and the world

    • datahub.io
    Updated Aug 29, 2017
    Cite
    (2017). Population figures for countries, regions (e.g. Asia) and the world [Dataset]. https://datahub.io/core/population
    Dataset updated
    Aug 29, 2017
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Area covered
    Asia, World
    Description

    Population figures for countries, regions (e.g. Asia) and the world. Data comes originally from World Bank and has been converted into standard CSV.

  11. Geographical names index

    • gov.uk
    • s3.amazonaws.com
    Updated Mar 25, 2024
    + more versions
    Cite
    Foreign, Commonwealth & Development Office (2024). Geographical names index [Dataset]. https://www.gov.uk/government/publications/geographical-names-and-information
    Dataset updated
    Mar 25, 2024
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Foreign, Commonwealth & Development Office
    Description

    These are the British English-language names and descriptive terms for sovereign countries, UK Crown Dependencies and UK Overseas Territories, as well as their citizens. ‘Sovereign’ means that they are independent states, recognised under international law.

    The Foreign, Commonwealth & Development Office (FCDO) approved these names. The FCDO leads on geographical names for the UK government, working closely with the Permanent Committee on Geographical Names.

    All UK government departments and other public bodies must use the approved country and territory names in these datasets. Using these names ensures consistency and clarity across public and internal communications, guidance and services.

    In these lists:

    • the full ‘official name’ is also provided for use when the formal version of a country’s name is needed

    • citizen names in the lists are not the legal names for the citizen, they do not relate to the citizen’s ethnicity

    You can also view the Welsh language version of the geographical names index on GOV.WALES: international place-names (https://www.gov.wales/bydtermcymru/international-place-names).

  12. Data from: The Tropical Andes Biodiversity Hotspot: A Comprehensive Dataset...

    • knb.ecoinformatics.org
    • dataone.org
    • +3 more
    Updated May 30, 2024
    Cite
    Pablo Jarrín-V.; Mario H Yánez-Muñoz (2024). The Tropical Andes Biodiversity Hotspot: A Comprehensive Dataset for the Mira-Mataje Binational Basins [Dataset]. http://doi.org/10.5063/F14F1P6H
    Dataset updated
    May 30, 2024
    Dataset provided by
    Knowledge Network for Biocomplexity
    Authors
    Pablo Jarrín-V.; Mario H Yánez-Muñoz
    Time period covered
    Jun 11, 2022 - Jun 11, 2023
    Description

    We present a flora and fauna dataset for the Mira-Mataje binational basins. This is an area shared between southwestern Colombia and northwestern Ecuador, where both the Chocó and Tropical Andes biodiversity hotspots converge. Information from 120 sources was systematized in the Darwin Core Archive (DwC-A) standard and geospatial vector data format for geographic information systems (GIS) (shapefiles). Sources included natural history museums, published literature, and citizen science repositories across 18 countries. The resulting database has 33,460 records from 5,281 species, of which 1,083 are endemic and 680 threatened. The diversity represented in the dataset is equivalent to 10% of the total plant species and 26% of the total terrestrial vertebrate species in the hotspots. It corresponds to 0.07% of their total area. The dataset can be used to estimate and compare biodiversity patterns with environmental parameters and provide value to ecosystems, ecoregions, and protected areas. The dataset is a baseline for future assessments of biodiversity in the face of environmental degradation, climate change, and accelerated extinction processes. The data has been formally presented in the manuscript entitled "The Tropical Andes Biodiversity Hotspot: A Comprehensive Dataset for the Mira-Mataje Binational Basins" in the journal "Scientific Data". To maintain DOI integrity, this version will not change after publication of the manuscript and therefore we cannot provide further references on volume, issue, and DOI of manuscript publication.

    - Data format 1: The .rds file extension saves a single object to be read in R and provides better compression, serialization, and integration within the R environment than simple .csv files. The description of file names is in the original manuscript.
    -- m_m_flora_2021_voucher_ecuador.rds
    -- m_m_flora_2021_observation_ecuador.rds
    -- m_m_flora_2021_total_ecuador.rds
    -- m_m_fauna_2021_ecuador.rds
    - Data format 2: The .csv file has been encoded in UTF-8, and is an ASCII file with text separated by commas. The description of file names is in the original manuscript.
    -- m_m_flora_fauna_2021_all.zip. This file includes all biodiversity datasets.
    -- m_m_flora_2021_voucher_ecuador.csv
    -- m_m_flora_2021_observation_ecuador.csv
    -- m_m_flora_2021_total_ecuador.csv
    -- m_m_fauna_2021_ecuador.csv
    - Data format 3: We consolidated a shapefile for the basin containing layers for vegetation ecosystems and the total number of occurrences, species, and endemic and threatened species for each ecosystem.
    -- biodiversity_measures_mira_mataje.zip. This file includes the .shp file and accessory geomatic files.
    - A set of 3D shaded-relief map representations of the data in the shapefile can be found at https://doi.org/10.6084/m9.figshare.23499180.v4

    Three taxonomic data tables were used in our technical validation of the presented dataset. These three files are:
    1) the_catalog_of_life.tsv (Source: Bánki, O. et al. Catalogue of life checklist (version 2024-03-26). https://doi.org/10.48580/dfz8d (2024))
    2) world_checklist_of_vascular_plants_names.csv (we are also including ancillary tables "world_checklist_of_vascular_plants_distribution.csv", and "README_world_checklist_of_vascular_plants_.xlsx") (Source: Govaerts, R., Lughadha, E. N., Black, N., Turner, R. & Paton, A. The World Checklist of Vascular Plants is a continuously updated resource for exploring global plant diversity. Sci. Data 8, 215, 10.1038/s41597-021-00997-6 (2021).)
    3) world_flora_online.csv (Source: The World Flora Online Consortium et al. World flora online plant list December 2023, 10.5281/zenodo.10425161 (2023).)

  13. Country, Regional and World GDP (Gross Domestic Product)

    • datahub.io
    Updated Aug 29, 2017
    Cite
    (2017). Country, Regional and World GDP (Gross Domestic Product) [Dataset]. https://datahub.io/core/gdp
    Dataset updated
    Aug 29, 2017
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Area covered
    World
    Description

    Country, regional and world GDP in current US Dollars ($). Regional means collections of countries e.g. Europe & Central Asia. Data is sourced from the World Bank and turned into a standard normalized CSV.

  14. List_of_countries_by_wheat_exports

    • kaggle.com
    Updated Jul 17, 2020
    Cite
    Mathurin Aché (2020). List_of_countries_by_wheat_exports [Dataset]. https://www.kaggle.com/mathurinache/list-of-countries-by-wheat-exports/code
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jul 17, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mathurin Aché
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset is extracted from https://en.wikipedia.org/wiki/List_of_countries_by_wheat_exports. Context: There's a story behind every dataset and here's your opportunity to share yours. Content: What's inside is more than just rows and columns. Make it easy for others to get started by describing how you acquired the data and what time period it represents, too. Acknowledgements: We wouldn't be here without the help of others. If you owe any attributions or thanks, include them here along with any citations of past research. Inspiration: Your data will be in front of the world's largest data science community. What questions do you want to see answered?

  15. Addresses RÚIAN data distributed by the country in the CSV format

    • data.gov.cz
    Updated Feb 23, 2024
    + more versions
    Cite
    Český úřad zeměměřický a katastrální (2024). Addresses RÚIAN data distributed by the country in the CSV format [Dataset]. https://data.gov.cz/dataset?iri=https%3A%2F%2Fdata.gov.cz%2Fzdroj%2Fdatov%C3%A9-sady%2F00025712%2F3eac0278ad025b9a9015465571fdb907
    Dataset updated
    Feb 23, 2024
    Dataset authored and provided by
    Český úřad zeměměřický a katastrální
    Description

    The dataset contains a list of address points for the whole Czech Republic in CSV format. For each address point the following attributes are specified: address point code, municipality code and name, code and name of town district (for territorially structured statutory cities only), code and name of Prague city district (for Prague only), municipality part code and name, street code and name (in case it is specified), type of building object (with description/registration house number), house number, orientation number (if it is specified), character of orientation number (if it is specified), postal code, Y and X coordinates of pointer of address point (in JTSK coordinate system) and the date of validity. The dataset is provided as Open Data (licence CC-BY 4.0). Data is based on RÚIAN (Register of Territorial Identification, Addresses and Real Estates). Data covers the whole territory of the Czech Republic. Data is provided in a compressed form (ZIP archive). The file is created during the first day of each month with data valid to the last day of the previous month. More in the Act No. 111/2009 Coll., on the Basic Registers, in Decree No. 359/2011 Coll., on the Basic Register of Territorial Identification, Addresses and Real Estates.

  16. Film Circulation dataset

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv, png
    Updated Jul 12, 2024
    Cite
    Skadi Loist; Evgenia (Zhenya) Samoilova (2024). Film Circulation dataset [Dataset]. http://doi.org/10.5281/zenodo.7887672
    Available download formats: csv, png, bin
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Skadi Loist; Evgenia (Zhenya) Samoilova
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Complete dataset of “Film Circulation on the International Film Festival Network and the Impact on Global Film Culture”

    A peer-reviewed data paper for this dataset is in review to be published in NECSUS_European Journal of Media Studies - an open access journal aiming at enhancing data transparency and reusability, and will be available from https://necsus-ejms.org/ and https://mediarep.org

    Please cite this when using the dataset.


    Detailed description of the dataset:

    1 Film Dataset: Festival Programs

    The Film Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

    The codebook (csv file “1_codebook_film-dataset_festival-program”) offers a detailed description of all variables within the Film Dataset. Along with the definition of variables it lists explanations for the units of measurement, data sources, coding and information on missing data.

    The csv file “1_film-dataset_festival-program_long” comprises a dataset of all films and the festivals, festival sections, and the year of the festival edition that they were sampled from. The dataset is structured in the long format, i.e. the same film can appear in several rows when it appeared in more than one sample festival. However, films are identifiable via their unique ID.

    The csv file “1_film-dataset_festival-program_wide” consists of the dataset listing only unique films (n=9,348). The dataset is in the wide format, i.e. each row corresponds to a unique film, identifiable via its unique ID. For easy analysis, and since the overlap is only six percent, in this dataset the variable sample festival (fest) corresponds to the first sample festival where the film appeared. For instance, if a film was first shown at Berlinale (in February) and then at Frameline (in June of the same year), the sample festival will list “Berlinale”. This file includes information on unique and IMDb IDs, the film title, production year, length, categorization in length, production countries, regional attribution, director names, genre attribution, the festival, festival section and festival edition the film was sampled from, and information whether there is festival run information available through the IMDb data.
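    The relation between the long and the wide table described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: only the column name `fest` comes from the description, while `film_id`, `title`, `fest_date` and the sample rows are hypothetical.

```python
from datetime import date

# Hypothetical long-format rows: one row per (film, sample festival) appearance.
long_rows = [
    {"film_id": "f1", "title": "Film A", "fest": "Frameline", "fest_date": date(2017, 6, 15)},
    {"film_id": "f1", "title": "Film A", "fest": "Berlinale", "fest_date": date(2017, 2, 9)},
    {"film_id": "f2", "title": "Film B", "fest": "Frameline", "fest_date": date(2017, 6, 16)},
]


def long_to_wide(rows):
    """Keep one row per film: the row for its earliest sample festival."""
    first = {}
    for row in rows:
        kept = first.get(row["film_id"])
        if kept is None or row["fest_date"] < kept["fest_date"]:
            first[row["film_id"]] = row
    return list(first.values())


wide_rows = long_to_wide(long_rows)
for row in wide_rows:
    print(row["film_id"], row["fest"])
```

    In this sketch the film shown at both festivals keeps "Berlinale" as its sample festival, which mirrors the example given in the description.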


    2 Survey Dataset

    The Survey Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

    The codebook “2_codebook_survey-dataset” includes coding information for both survey datasets. It lists the definition of the variables or survey questions (corresponding to Samoilova/Loist 2019), units of measurement, data source, variable type, range and coding, and information on missing data.

    The csv file “2_survey-dataset_long-festivals_shared-consent” consists of a subset (n=161) of the original survey dataset (n=454), where respondents provided festival run data for films (n=206) and gave consent to share their data for research purposes. This dataset consists of the festival data in a long format, so that each row corresponds to the festival appearance of a film.

    The csv file “2_survey-dataset_wide-no-festivals_shared-consent” consists of a subset (n=372) of the original dataset (n=454) of survey responses corresponding to sample films. It includes data only for those films for which respondents provided consent to share their data for research purposes. This dataset is shown in wide format of the survey data, i.e. information for each response corresponding to a film is listed in one row. This includes data on film IDs, film title, survey questions regarding completeness and availability of provided information, information on number of festival screenings, screening fees, budgets, marketing costs, market screenings, and distribution. As the file name suggests, no data on festival screenings is included in the wide format dataset.


    3 IMDb & Scripts

    The IMDb dataset consists of a data scheme image file, one codebook and eight datasets, all in csv format. It also includes the R scripts that we used for scraping and matching.

    The codebook “3_codebook_imdb-dataset” includes information for all IMDb datasets. This includes ID information and their data source, coding and value ranges, and information on missing data.

    The csv file “3_imdb-dataset_aka-titles_long” contains film title data in different languages scraped from IMDb in a long format, i.e. each row corresponds to a title in a given language.

    The csv file “3_imdb-dataset_awards_long” contains film award data in a long format, i.e. each row corresponds to an award of a given film.

    The csv file “3_imdb-dataset_companies_long” contains data on production and distribution companies of films. The dataset is in a long format, so that each row corresponds to a particular company of a particular film.

    The csv file “3_imdb-dataset_crew_long” contains data on names and roles of crew members in a long format, i.e. each row corresponds to each crew member. The file also contains binary gender assigned to directors based on their first names using the GenderizeR application.

    The csv file “3_imdb-dataset_festival-runs_long” contains festival run data scraped from IMDb in a long format, i.e. each row corresponds to the festival appearance of a given film. The dataset does not include each film screening, but the first screening of a film at a festival within a given year. The data includes festival runs up to 2019.

    The csv file “3_imdb-dataset_general-info_wide” contains general information about films such as genre as defined by IMDb, languages in which a film was shown, ratings, and budget. The dataset is in wide format, so that each row corresponds to a unique film.

    The csv file “3_imdb-dataset_release-info_long” contains data about non-festival releases (e.g., theatrical, digital, TV, DVD/Blu-ray). The dataset is in a long format, so that each row corresponds to a particular release of a particular film.

    The csv file “3_imdb-dataset_websites_long” contains data on available websites (official websites, miscellaneous, photos, video clips). The dataset is in a long format, so that each row corresponds to a website of a particular film.

    The dataset includes 8 text files containing the scripts for web scraping. They were written using R version 3.6.3 for Windows.

    The R script “r_1_unite_data” demonstrates the structure of the dataset that we use in the following steps to identify, scrape, and match the film data.

    The R script “r_2_scrape_matches” reads in the dataset with the film characteristics described in “r_1_unite_data” and uses various R packages to create a search URL for each film from the core dataset on the IMDb website. The script attempts to match each film from the core dataset to IMDb records by first conducting an advanced search based on the movie title and year, and then potentially using an alternative title and a basic search if no matches are found in the advanced search. The script scrapes the title, release year, directors, running time, genre, and IMDb film URL from the first page of the suggested records from the IMDb website. The script then defines a loop that matches (including matching scores) each film in the core dataset with suggested films on the IMDb search page. Matching was done using data on directors, production year (+/- one year), and title. Titles were compared with a fuzzy matching approach using two methods, “cosine” and “osa”: the cosine similarity is used to match titles with a high degree of similarity, and the OSA algorithm is used to match titles that may have typos or minor variations.
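    The two title-matching methods named above can be illustrated with a short Python sketch. It is not the authors' R code: the q-gram size, the thresholds `max_osa` and `min_cosine`, and the record layout are assumptions chosen for illustration, and the director comparison is left out for brevity.

```python
from collections import Counter
from math import sqrt


def osa_distance(a, b):
    """Optimal string alignment distance: insertions, deletions,
    substitutions and transpositions of adjacent characters."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def cosine_similarity(a, b, q=2):
    """Cosine similarity between character q-gram count vectors."""
    def profile(s):
        return Counter(s[i:i + q] for i in range(len(s) - q + 1))

    pa, pb = profile(a), profile(b)
    dot = sum(pa[g] * pb[g] for g in pa)
    norm = sqrt(sum(v * v for v in pa.values())) * sqrt(sum(v * v for v in pb.values()))
    return dot / norm if norm else 0.0


def title_match(core_title, imdb_title, max_osa=2, min_cosine=0.8):
    """Hypothetical thresholds: accept if either method finds the titles close."""
    a, b = core_title.lower(), imdb_title.lower()
    return osa_distance(a, b) <= max_osa or cosine_similarity(a, b) >= min_cosine


def is_candidate(core, imdb):
    """Production year within +/- one year and a fuzzy title match."""
    return abs(core["year"] - imdb["year"]) <= 1 and title_match(core["title"], imdb["title"])


print(is_candidate({"title": "Amelie", "year": 2001}, {"title": "Amélie", "year": 2002}))
```

    OSA tolerates single typos and swapped letters, while the q-gram cosine measure is less sensitive to word order and longer shared fragments, which is one reason to combine the two.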

    The script “r_3_matching” creates a dataset with the matches for a manual check. Each pair of films (the original film from the core dataset and the suggested match from the IMDb website) was categorized into one of the following five categories: a) 100% match: perfect match on title, year, and director; b) likely good match; c) maybe match; d) unlikely match; and e) no match. The script also checks for possible doubles in the dataset and identifies them for a manual check.

    The script “r_4_scraping_functions” creates a function for scraping the data from the identified matches (based on the scripts described above and manually checked). These functions are used for scraping the data in the next script.

    The script “r_5a_extracting_info_sample” uses the function defined in “r_4_scraping_functions” to scrape the IMDb data for the identified matches. This script does that for the first 100 films to check whether everything works. Scraping for the entire dataset took a few hours, so a test with a subsample of 100 films is advisable.

    The script “r_5b_extracting_info_all” extracts the data for the entire dataset of the identified matches.

    The script “r_5c_extracting_info_skipped” checks the films with missing data (where data was not scraped) and tries to extract the data one more time to make sure that the errors were not caused by disruptions in the internet connection or other technical issues.
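    The idea of a second pass over skipped films can be sketched in Python as follows. The function names, the `None` marker for missing data, and the stand-in scraper are all hypothetical; the original R script may work differently.

```python
def retry_skipped(results, scrape):
    """Retry once for every film whose earlier scrape left no data.

    `results` maps a film ID to scraped data, or to None if the first
    attempt failed. Returns the IDs that are still missing afterwards.
    """
    still_missing = []
    for film_id, data in results.items():
        if data is not None:
            continue
        try:
            results[film_id] = scrape(film_id)
        except OSError:
            # Network-level failure: leave the entry empty for a manual check.
            pass
        if results[film_id] is None:
            still_missing.append(film_id)
    return still_missing


def fake_scrape(film_id):
    """Stand-in for a real scraper; fails for one hypothetical ID."""
    if film_id == "tt002":
        raise OSError("connection reset")
    return {"id": film_id}


results = {"tt001": None, "tt002": None, "tt003": {"id": "tt003"}}
missing = retry_skipped(results, fake_scrape)
print(missing)
```

    Films that fail in both passes are reported rather than silently dropped, so they can be inspected by hand.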

    The script “r_check_logs” is used for troubleshooting and tracking the progress of all of the R scripts used. It gives information on the amount of missing values and errors.


    4 Festival Library Dataset

    The Festival Library Dataset consists of a data scheme image file, one codebook and one dataset, all in csv format.

    The codebook (csv file “4_codebook_festival-library_dataset”) offers a detailed description of all variables within the Library Dataset. It lists the definition of variables, such as location and festival name, and festival categories,

  17. [Eco-Movement] EV Charging Station DC Hardware Data - CSV updated daily

    • datarade.ai
    .csv
    Updated Feb 26, 2021
    + more versions
    Cite
    Eco-Movement (2021). [Eco-Movement] EV Charging Station DC Hardware Data - CSV updated daily [Dataset]. https://datarade.ai/data-products/eco-movement-ev-charge-point-data-complete-coverage-of-euro-eco-movement
    Explore at:
    Available download formats: .csv
    Dataset updated
    Feb 26, 2021
    Dataset authored and provided by
    Eco-Movement
    Area covered
    Liechtenstein, Réunion, Slovenia, Netherlands, Turkey, Chile, Isle of Man, Lithuania, Guadeloupe, Monaco
    Description

    Eco-Movement is the leading source for EV charging station data. We offer full coverage of all (semi)public EV chargers across Europe, North & Latin America, Oceania, and ever more additional countries. Our real-time database now contains about 1,000,000 unique plugs. Eco-Movement is a specialised B2B data provider focusing 100% on EV charging station data quality and enrichment. Hundreds of quality checks are performed through our proprietary quality dashboard, IT architecture and AI. With the highest quality on the market, we are the trusted choice of mobility industry leaders such as Google, Tesla, Bloomberg, and the European Commission’s EAFO portal.

    Eco-Movement integrates data from 3000+ direct connections with EV Charge Point Operators into a uniform, accurate and complete database. We have an unparalleled set of charge point related attributes, all available on individual charging plug level: from Geolocation to Max Power and from Operator to Hardware and Pricing details. Simple, reliable, and up-to-date: The Eco-Movement database is refreshed every day.

    Whether you are in need of insights, building new products or conducting research, high quality data is more important than ever. Our online Data Retrieval Platform is the easy solution to all your EV Charging Station related data needs. Our DC Hardware Data is a unique dataset developed by Eco-Movement, providing hardware information at the level of individual DC charging stations. This report is for your organisation if you want to gain access to accurate data on the manufacturer and model of charging stations, for example as an essential input for your R&D strategy or competitive analysis.

    The hardware report includes full geolocation, operator/brand, and technical information for each individual station, as well as two specific hardware attributes: DC Hardware Manufacturer and DC Hardware Model. This report is available for all countries in our database (see full list of territories below). The price of the data is dependent on the geographies chosen, the length of the subscription, and the intended use.

    Check out our other Data Offerings available, and gain more valuable market insights on EV charging directly from the experts.

    ALSO AVAILABLE We also offer EV Charging Station Location & Tariffs Data via API (JSON) or online download (CSV). Get detailed insights on Charging Station Locations as well as the prices paid at individual chargers, whether payment is done directly to the CPO or with one of the 200+ eMSP products in our database.

    ABOUT US Eco-Movement's mission is providing the EV ecosystem with the best and most relevant Charging Station information. Based in Utrecht, the Netherlands, Eco-Movement is completely independent from other industry players. We are an active and trusted player in the EV ecosystem and the exclusive source for European Commission charging infrastructure data (EAFO).

  18. Replication data for "High life satisfaction reported among small-scale...

    • dataverse.csuc.cat
    csv, txt
    Updated Feb 7, 2024
    Cite
    Eric Galbraith; Victoria Reyes Garcia (2024). Replication data for "High life satisfaction reported among small-scale societies with low incomes" [Dataset]. http://doi.org/10.34810/data904
    Explore at:
    Available download formats: csv(1620), csv(7829), txt(7017), csv(227502)
    Dataset updated
    Feb 7, 2024
    Dataset provided by
    CORA.Repositori de Dades de Recerca
    Authors
    Eric Galbraith; Victoria Reyes Garcia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2021 - Oct 24, 2023
    Area covered
    Darjeeling, India, Ba, Fiji, Bulgan soum, Mongolia, Bassari country, Senegal, Laprak, Nepal, Puna, Argentina, United Republic of, Tanzania, Mafia Island, China, Shangri-la, Western highlands, Guatemala, Ghana, Kumbungu
    Dataset funded by
    European Commission
    Description

    This dataset was created in order to document self-reported life evaluations among small-scale societies that exist on the fringes of mainstream industrialized societies. The data were produced as part of the LICCI project, through fieldwork carried out by LICCI partners. The data include individual responses to a life satisfaction question, and household asset values. Data from Gallup World Poll and the World Values Survey are also included, as used for comparison.

    TABULAR DATA-SPECIFIC INFORMATION

    1. File name: LICCI_individual.csv
    Number of rows and columns: 2814, 7
    Variable list:
    • User, Site, village: identification of investigator and location
    • Well.being.general: numerical score for life satisfaction question
    • HH_Assets_US, HH_Assets_USD_capita: estimated value of representative assets in the household of respondent, total and per capita (accounting for number of household inhabitants)

    2. File name: LICCI_bySite.csv
    Number of rows and columns: 19, 8
    Variable list:
    • Site, N: site name and number of respondents at the site
    • SWB_mean, SWB_SD: mean and standard deviation of life satisfaction score
    • HHAssets_USD_mean, HHAssets_USD_sd: site mean and standard deviation of household asset value
    • PerCapAssets_USD_mean, PerCapAssets_USD_sd: site mean and standard deviation of per capita asset value

    3. File name: gallup_WVS_GDP_pk.csv
    Number of rows and columns: 146, 8
    Variable list:
    • Happiness Score, Whisker-high, Whisker-low: from Gallup World Poll as documented in World Happiness Report 2022
    • GDP-PPP2017: Gross Domestic Product per capita for year 2020 at PPP (constant 2017 international $). Accessed May 2022.
    • pk: produced capital per capita for year 2018 (in 2018 US$) for available countries, as estimated by the World Bank (accessed February 2022)
    • WVS7_mean, WVS7_std: results of Question 49 in the World Values Survey, Wave 7
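    How the per-site variables relate to the individual responses can be sketched as below. This is an illustration under assumptions: it presumes that the per-site table aggregates the individual file by `Site`, it uses the sample standard deviation (the description does not say which form was used), and the rows in `sample` are invented, not taken from the dataset.

```python
import csv
import io
from collections import defaultdict
from statistics import mean, stdev

# Invented rows that reuse two column names from LICCI_individual.csv.
sample = """Site,Well.being.general
SiteA,7
SiteA,9
SiteB,5
SiteB,6
SiteB,7
"""


def summarize_by_site(handle):
    """Group life satisfaction scores by site and compute N, mean and SD."""
    scores = defaultdict(list)
    for row in csv.DictReader(handle):
        scores[row["Site"]].append(float(row["Well.being.general"]))
    return {
        site: {
            "N": len(values),
            "SWB_mean": mean(values),
            # Sample SD is an assumption; a single response yields 0.0 here.
            "SWB_SD": stdev(values) if len(values) > 1 else 0.0,
        }
        for site, values in scores.items()
    }


summary = summarize_by_site(io.StringIO(sample))
for site, stats in summary.items():
    print(site, stats["N"], stats["SWB_mean"], round(stats["SWB_SD"], 3))
```

    To run the same summary on the real file, the `io.StringIO(sample)` argument would be replaced by an opened `LICCI_individual.csv`, assuming its header matches the variable list.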

  19. List_of_countries_by_traffic-related_death_rate

    • kaggle.com
    Updated Jul 17, 2020
    Cite
    Mathurin Aché (2020). List_of_countries_by_traffic-related_death_rate [Dataset]. https://www.kaggle.com/mathurinache/list-of-countries-by-traffic-related-death-rate/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jul 17, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mathurin Aché
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset is extracted from https://en.wikipedia.org/wiki/List_of_countries_by_traffic-related_death_rate. Context: There's a story behind every dataset and here's your opportunity to share yours. Content: What's inside is more than just rows and columns. Make it easy for others to get started by describing how you acquired the data and what time period it represents, too. Acknowledgements: We wouldn't be here without the help of others. If you owe any attributions or thanks, include them here along with any citations of past research. Inspiration: Your data will be in front of the world's largest data science community. What questions do you want to see answered?

  20. Key generic technology prediction in patent citation using graph neural...

    • dataone.org
    • data.niaid.nih.gov
    • +1more
    Updated Jun 5, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    M. L. Ding (2024). Key generic technology prediction in patent citation using graph neural networks [Dataset]. http://doi.org/10.5061/dryad.nk98sf803
    Explore at:
    Dataset updated
    Jun 5, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    M. L. Ding
    Time period covered
    Jan 11, 2024
    Description

    With the rapid advancement of the Fourth Industrial Revolution, international competition in technology and industry is intensifying. However, in the era of big data and large-scale science, making accurate judgments about the key areas of technology and innovative trends has become exceptionally difficult. This paper constructs a patent indicator evaluation system based on the dimensions of key and generic patent citation, integrates graph neural network modeling to predict key common technologies, and confirms the effectiveness of the method using the field of genetic engineering as an example. According to the LDA topic model, the main technical R&D directions in genetic engineering are genetic analysis and detection technologies, the application of microorganisms in industrial production, virology research involving vaccine development and immune responses, high-throughput sequencing and analysis technologies in genomics, targeted drug design and molecular therapeutic strategies...

    These datasets were obtained from the Incopat patent database for cited patents (2013-2022) in the field of genetic engineering. Details for the datasets are provided in the README file. This directory contains the selection of the patent datasets.

    1) Table of key generic indicators for nodes (partial 1).csv: This file consists of 10 indicators of patents: technical coverage, patent families, patent family citation, patent cooperation, enterprise-enterprise cooperation, industry-university-research cooperation, claims, citation frequency, layout countries, and layout countries.

    2) Table of key generic indicators for nodes (partial 2).csv: This file consists of 10 indicators of patents: technical convergence, cited countries, inventors, citations, homologous countries/areas, degree centrality, closeness centrality, betweenness centrality, eigenvector centrality, and PageRank.

    3) patent.content: The content file contains descriptions of the patents in the following format:

    This README file was generated on 2023-11-25 by Mingli Ding.

    GENERAL INFORMATION

    1. Author Information
       Investigators Contact Information
       Name: Mingli Ding; Wangke Yu; Shuhua Wang
       Institution: Jingdezhen Ceramic University
       Address: Jingdezhen, Jiangxi, China
       Email: mlding1@163.com
    2. Date of data collection: 2013-2022

    DATA & FILE OVERVIEW

    1. File List:

    A) Table of key generic indicators for nodes (partial 1).csv

    B) Table of key generic indicators for nodes (partial 2).csv

    C) patent.content

    D) patent.cites

    E) Graph neural network modeling highest accuracy for different dimensions.csv

    F) Prediction effects of key generic technologies.csv

    DATA-SPECIFIC INFORMATION FOR: Table of key generic indicators for nodes (partial 1).csv

    1. Number of variables: 10
    2. Number of cases/rows: 72489
    3. Variable List:
    • technical coverage: number ...
(2017). List of all countries with their 2 digit codes (ISO 3166-1) [Dataset]. https://datahub.io/core/country-list

List of all countries with their 2 digit codes (ISO 3166-1)

Explore at:
3 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Aug 29, 2017
License

ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically

Description

ISO 3166-1-alpha-2 English country names and code elements. This list states the country names (official short names in English) in alphabetical order as given in ISO 3166-1 and the corresponding ISO 3166-1-alpha-2 code elements.
