While the presence of foreign-born footballers in national teams has a long history, it is often believed that the World Cup has become more migratory over time. The presumed increases in the volume and diversity of foreign-born footballers have, however, remained empirically untested. In this article, we empirically test whether the presence of foreign-born footballers at the World Cup has changed over time with respect to these two dimensions of migration. We conducted an analysis of 4,761 footballers, drawn from the fifteen national teams that competed in at least ten editions of the World Cup between 1930 and 2018, of whom 301 were foreign-born. We argue that countries’ different histories of migration, in combination with historically used citizenship regimes, largely influence the migratory dimensions of their representative football teams. Our outcomes show that the (absolute) volume of foreign-born footballers in World Cups is indeed increasing over time. Moreover, foreign-born footballers seem to come from an increasingly diverse range of countries. We, therefore, conclude that the World Cup has become more migratory in terms of volume and diversity from an immigration perspective.
The ICC Men's T20 World Cup first took place in 2007 and has been held on a two or four-year basis ever since. The West Indies, England, and India are the most successful teams in the history of the tournament, having all lifted the trophy on two occasions. India won the most recent T20 World Cup in 2024, beating South Africa in the final.
Although there is a common belief that more footballers are representing countries other than their native ones in recent World Cup editions, a historical overview of migrant footballers representing national teams is lacking. To fill this gap, a database consisting of 10,137 football players who participated in the FIFA World Cup (1930-2018) was created. In order to count the number of migrant footballers in national teams over time, we critically reflect on the term migrant and the commonly used foreign-born proxies in mainstream migration research. A foreign-born approach to migrants overlooks historical-geopolitical changes like the redrawing of international boundaries and colonial relationships, and tends to shy away from citizenship complexities, leading to an overestimation of the number of migrant footballers in a database. Therefore, we offer an alternative, the contextual-nationality approach, which through historical contextualization with an emphasis on citizenship results in more accurate data on migrant footballers. By comparing outcomes, a foreign-born approach seems to indicate an increase in the volume of migrant footballers since the mid-1990s, while the contextual-nationality approach illustrates that the presence of migrant footballers is primarily a reflection of trends in international migration.
As of 2022, the FIFA World Cup in Qatar was the most expensive World Cup of all time, costing the hosts an estimated 220 billion U.S. dollars. This was nearly 19 times more expensive than the previous World Cup, in part due to high infrastructure costs.
The absolute economic contribution of tourism in Qatar was forecast to increase continuously between 2024 and 2029 by a total of 6.6 billion U.S. dollars (+35.76 percent). After the ninth consecutive year of increase, the economic contribution is estimated to reach 24.9 billion U.S. dollars, a new peak, in 2029. Depicted is the economic contribution of the tourism sector in the country or region at hand. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
Dataset for growth of Olympic Games (Summer and Winter) and World Cup, 1964-2020. Eleven variables are included for each event:
- number of athletes
- number of events
- number of participating countries
- accredited media
- countries broadcast
- tickets sold
- ticketing revenue
- broadcast revenue
- sponsorship revenue
- cost of venues
- cost of organisation
Monetary values are all deflated and converted to USD2018.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This paper contributes to the current literature investigating whether hosting sports mega-events brings tangible economic benefits to the host country. Specifically, we examine whether staging the Olympic Games and the FIFA World Cup leads to observable economic growth. The research was conducted as a quasi-experimental study in the spirit of the difference-in-differences method. The countries studied are those in which the Olympic Games and the FIFA World Cup were held between 2010 and 2016: Canada, South Africa, Great Britain, and Brazil. We found that there is no significant effect of hosting sports mega-events on economic growth.
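As a rough illustration of the difference-in-differences idea mentioned above, the following minimal sketch computes the canonical two-group, two-period estimate. It is not the paper's actual specification, the function names are hypothetical, and the growth figures are made up purely for illustration.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Canonical 2x2 difference-in-differences estimate.

    Change in the treated group minus change in the control group.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up annual GDP growth rates (percent); NOT taken from the paper.
host_before, host_after = [2.0, 3.0], [3.0, 4.0]
others_before, others_after = [1.0, 2.0], [2.0, 3.0]

effect = did_estimate(host_before, host_after, others_before, others_after)
print(effect)
```

In this made-up case growth rises by the same amount in both groups, so the estimate is zero, which is the kind of outcome a "no significant effect" finding corresponds to.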
Midyear population estimates and projections for all countries and areas of the world with a population of 5,000 or more // Source: U.S. Census Bureau, Population Division, International Programs Center// Note: Total population available from 1950 to 2100 for 227 countries and areas. Other demographic variables available from base year to 2100. Base year varies by country and therefore data are not available for all years for all countries. For the United States, total population available from 1950-2060, and other demographic variables available from 1980-2060. See methodology at https://www.census.gov/programs-surveys/international-programs/about/idb.html
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.
Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.
Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
The Global Population Count Grid Time Series Estimates provide a back-cast time series of population grids based on the year 2000 population grid from SEDAC's Global Rural-Urban Mapping Project, Version 1 (GRUMPv1) data set. The grids were created by using rates of population change between decades from the coarser resolution History Database of the Global Environment (HYDE) database to back-cast the GRUMPv1 population count grids. Mismatches between the spatial extent of the HYDE calculated rates and GRUMPv1 population data were resolved via infilling rate cells based on a focal mean of values. Finally, the grids were adjusted so that the population totals for each country equaled the UN World Population Prospects (2008 Revision) estimates for that country for the respective year (1970, 1980, 1990, and 2000). These data do not represent census observations for the years prior to 2000, and therefore can at best be thought of as estimations of the populations in given locations. The population grids are consistent internally within the time series, but are not recommended for use in creating longer time series with any other population grids, including GRUMPv1, Gridded Population of the World, Version 4 (GPWv4), or non-SEDAC developed population grids. These population grids served as an input to SEDAC's Global Estimated Net Migration Grids by Decade: 1970-2000 data set.
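The two steps described above (back-casting with decadal rates of change, then adjusting so country totals match the UN estimate) can be sketched as follows. This is a minimal illustration under assumptions: the exact back-casting formula is not given in the description, so dividing by (1 + rate) is an assumed form, and the function names and cell values are hypothetical.

```python
def back_cast(cells_later, decadal_rates):
    """Estimate counts one decade earlier from later counts.

    ASSUMPTION: rate r means later = earlier * (1 + r), so earlier = later / (1 + r).
    """
    return [count / (1.0 + rate) for count, rate in zip(cells_later, decadal_rates)]


def rescale_to_total(cell_counts, target_total):
    """Scale all cells of one country proportionally so they sum to target_total."""
    current_total = sum(cell_counts)
    if current_total == 0:
        return list(cell_counts)  # nothing to distribute proportionally
    factor = target_total / current_total
    return [count * factor for count in cell_counts]


# Hypothetical year-2000 cell counts and 1990-2000 rates for one small country.
cells_2000 = [100.0, 200.0, 700.0]
rates_1990_2000 = [0.25, 0.0, 0.4]

cells_1990 = back_cast(cells_2000, rates_1990_2000)
adjusted_1990 = rescale_to_total(cells_1990, target_total=390.0)
print(adjusted_1990)
```

Proportional rescaling keeps the relative spatial distribution produced by the back-cast while forcing the country total to agree with the external estimate.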
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This paper presents a benchmark dataset called EO4WildFires: a multi-sensor time-series dataset (multispectral data from Sentinel-2, Synthetic-Aperture Radar (SAR) data from Sentinel-1, and meteorological parameters from NASA Power) that spans 45 countries and can be used for developing machine learning and deep learning methods for estimating the area that a forest wildfire might cover.
The EO4WildFires dataset is annotated using EFFIS (European Forest Fire Information System) as the data source for forest fire detection and size estimation. A total of 31,742 wildfire events are gathered from 2018 to 2022. For each event, Sentinel-2 (multispectral), Sentinel-1 (SAR) and meteorological data are assembled into a single data cube. The meteorological parameters that are included in the data cube are: ratio of actual partial pressure of water vapor to the partial pressure at saturation, average temperature, bias corrected average total precipitation, average wind speed, fraction of land covered by snowfall, percent of root zone soil wetness, snow depth, snow precipitation, as well as percent of soil moisture.
The main problem that this dataset is designed to address is forecasting severity before wildfires occur. The dataset is not used to predict wildfire events, but rather to predict the severity (size of area damaged by fire) of a wildfire event, should one occur in a specific place under the current and historical forest status, as recorded from multispectral and SAR images, and meteorological data.
Using the data cube for the collected wildfire events, the EO4WildFires dataset is used to realize three (3) different preliminary experiments, in order to evaluate the contributing factors for wildfire severity prediction. The first experiment evaluates wildfire size using only the meteorological parameters, the second one utilizes both the multispectral and SAR parts of the dataset, while the third exploits all dataset parts. In each experiment, machine learning models are developed, and their accuracy is evaluated.
Published by Collins Bartholomew in partnership with Global System for Mobile Communications (GSMA), the Mobile Coverage Explorer is a raster data representation of the area covered by mobile cellular networks around the world. The dataset series is supplied as raster Data_MCE (operators) and Data_OCI (OpenCellID database). The OCI dataset series has been created using OpenCellID tower locations. These derived locations have been used as the centre points of a radius of coverage: 12 kilometres for GSM networks, and 4 kilometres for 3G and 4G networks. No 5G data yet exists in the OpenCellID database. These circles of coverage from each tower have then been merged to create an overall representation of network coverage. The OCI dataset series is available at Global and National level. Global dataset series - sub hierarchy levels - contain three datasets representing cellular mobile radio technologies ‘2G’, ‘3G’ and ‘4G’. The file naming convention is as follows: OCI_Global
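The circle-of-coverage construction described above can be illustrated with a small point-in-coverage check. This is only a sketch: the actual product is a merged raster, whereas this example tests individual points against tower circles using great-circle distance. The tower tuples, function names, and the mean Earth radius constant are assumptions; only the 12 km and 4 km radii come from the description.

```python
import math

# Coverage radii from the description: 12 km for GSM, 4 km for 3G and 4G.
RADIUS_KM = {"GSM": 12.0, "3G": 4.0, "4G": 4.0}
EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_covered(lat, lon, towers):
    """True if the point lies inside the coverage circle of at least one tower.

    `towers` is a hypothetical list of (lat, lon, technology) tuples.
    """
    return any(
        haversine_km(lat, lon, t_lat, t_lon) <= RADIUS_KM[tech]
        for t_lat, t_lon, tech in towers
    )


# Hypothetical tower locations, for illustration only.
towers = [(0.0, 0.0, "GSM"), (1.0, 1.0, "4G")]
print(is_covered(0.0, 0.1, towers))  # about 11 km from the GSM tower
print(is_covered(1.0, 1.1, towers))  # about 11 km from the 4G tower
```

Taking the union over towers with `any` mirrors the merging of individual circles into one overall coverage area.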
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Although the reliability of the information on Wikipedia pages can be questioned, we used this source because the data we needed was fairly straightforward and not readily accessible at other, perhaps more trustworthy, online football databases like Transfermarkt.co.uk or Footballdatabase.eu. In case a footballer was foreign-born or (possibly) a migrant, we verified the Wikipedia data with information from (inter)national newspapers and football magazines. Reliable data on the genealogy of players was often harder to find, as the majority of (grand-) parents are, or were, not internationally famous themselves. The depositor provided the data file in XLSX format. DANS added the ODS format of this file. On April 16th 2018, a small correction was made in the rows related to football player Tony Cascarino.
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.
Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.
Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
- Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
- Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
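The two recommended processing steps could look roughly like the sketch below. Only the attribute name "PartOfCumulativeCountSeries" comes from the description; the other field names (`start`, `end`, `count`), the 0/1 encoding of the flag, the weekly interval length, and the sample rows are assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical rows; real files may use different field names and encodings.
rows = [
    {"start": date(2000, 1, 1), "end": date(2000, 1, 7), "count": 3, "PartOfCumulativeCountSeries": 0},
    {"start": date(2000, 1, 8), "end": date(2000, 1, 14), "count": 0, "PartOfCumulativeCountSeries": 0},
    {"start": date(2000, 1, 22), "end": date(2000, 1, 28), "count": 5, "PartOfCumulativeCountSeries": 0},
    {"start": date(2000, 1, 1), "end": date(2000, 1, 14), "count": 3, "PartOfCumulativeCountSeries": 1},
]


def split_series(rows):
    """Separate cumulative rows from fixed-interval rows using the flag attribute."""
    cumulative = [r for r in rows if r["PartOfCumulativeCountSeries"] == 1]
    fixed = [r for r in rows if r["PartOfCumulativeCountSeries"] != 1]
    return cumulative, fixed


def fill_missing_weeks(fixed):
    """Return (start_date, count) for every week in range; None marks an unreported week."""
    by_start = {r["start"]: r["count"] for r in fixed}
    current, last = min(by_start), max(by_start)
    filled = []
    while current <= last:
        filled.append((current, by_start.get(current)))
        current += timedelta(days=7)
    return filled


cumulative, fixed = split_series(rows)
weekly = fill_missing_weeks(fixed)
for start, count in weekly:
    print(start, count)
```

Keeping unreported weeks as `None` rather than zero preserves the distinction the description draws between a reported count of zero and an interval with no report at all.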
https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-catalogue/licences/insitu-gridded-observations-global-and-regional/insitu-gridded-observations-global-and-regional_15437b363f02bf5e6f41fc2995e3d19a590eb4daff5a7ce67d1ef6c269d81d68.pdf
This dataset provides high-resolution gridded temperature and precipitation observations from a selection of sources. Additionally the dataset contains daily global average near-surface temperature anomalies. All fields are defined on either daily or monthly frequency. The datasets are regularly updated to incorporate recent observations. The included data sources are commonly known as GISTEMP, Berkeley Earth, CPC and CPC-CONUS, CHIRPS, IMERG, CMORPH, GPCC and CRU, where the abbreviations are explained below. These data have been constructed from high-quality analyses of meteorological station series and rain gauges around the world, and as such provide a reliable source for the analysis of weather extremes and climate trends. The regular update cycle makes these data suitable for a rapid study of recently occurred phenomena or events. The NASA Goddard Institute for Space Studies temperature analysis dataset (GISTEMP-v4) combines station data of the Global Historical Climatology Network (GHCN) with the Extended Reconstructed Sea Surface Temperature (ERSST) to construct a global temperature change estimate. The Berkeley Earth Foundation dataset (BERKEARTH) merges temperature records from 16 archives into a single coherent dataset. The NOAA Climate Prediction Center datasets (CPC and CPC-CONUS) define a suite of unified precipitation products with consistent quantity and improved quality by combining all information sources available at CPC and by taking advantage of the optimal interpolation (OI) objective analysis technique. The Climate Hazards Group InfraRed Precipitation with Station dataset (CHIRPS-v2) incorporates 0.05° resolution satellite imagery and in-situ station data to create gridded rainfall time series over the African continent, suitable for trend analysis and seasonal drought monitoring. 
The Integrated Multi-satellitE Retrievals dataset (IMERG) by NASA uses an algorithm to intercalibrate, merge, and interpolate "all" satellite microwave precipitation estimates, together with microwave-calibrated infrared (IR) satellite estimates, precipitation gauge analyses, and potentially other precipitation estimators over the entire globe at fine time and space scales for the Tropical Rainfall Measuring Mission (TRMM) and its successor, Global Precipitation Measurement (GPM) satellite-based precipitation products. The Climate Prediction Center morphing technique dataset (CMORPH) by NOAA has been created using precipitation estimates that have been derived from low orbiter satellite microwave observations exclusively. Then, geostationary IR data are used as a means to transport the microwave-derived precipitation features during periods when microwave data are not available at a location. The Global Precipitation Climatology Centre dataset (GPCC) is a centennial product of monthly global land-surface precipitation based on the ~80,000 stations world-wide that feature record durations of 10 years or longer. The data coverage per month varies from ~6,000 (before 1900) to more than 50,000 stations. The Climatic Research Unit dataset (CRU v4) features an improved interpolation process, which delivers full traceability back to station measurements. The station measurements of temperature and precipitation are public, as well as the gridded dataset and national averages for each country. Cross-validation was performed at a station level, and the results have been published as a guide to the accuracy of the interpolation. This catalogue entry complements the E-OBS record in many aspects, as it intends to provide high-resolution gridded meteorological observations at a global rather than continental scale. These data may be suitable as a baseline for model comparisons or extreme event analysis in the CMIP5 and CMIP6 datasets.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
The administrative boundaries level 1 dataset is part of the Global Administrative Areas (GADM) 3.6 vector dataset series, which includes distinct datasets representing administrative boundaries for all countries in the world. Administrative level 1 distinguishes countries, provinces and equivalent units. GADM makes use of high spatial resolution images and an extensive set of attributes to map administrative areas at all levels of political sub-division. The attribute information associated with administrative units includes official names in Latin and non-Latin scripts, variant names, and the administrative type in the local language and in English. Please read the GADM 3.6 - Global Administrative Areas dataset series metadata for more information.
Data publication: 2018-01-01
Supplemental Information:
The dataset was originally produced for the BioGeomancer project, with collaboration from the International Rice Research Institute and the University of California, Berkeley, Museum of Vertebrate Zoology. The development of GADM was partly supported by the Gordon and Betty Moore Foundation for the BioGeomancer project.
Citation:
© 2009-2018 GADM
Contact points:
Resource Contact: Robert Hijmans
Metadata Contact: Robert Hijmans
Data lineage:
The GADM unique ID (GID) starts with the three-letter ISO 3166-1 alpha-3 country code. If there are subdivisions, these are identified by a number from 1 to n, where n is the number of subdivisions at level 1. This value is concatenated with the country code, using a dot to delimit the two. For example, AFG.1, AFG.2, ..., AFG.n. If there are second-level subdivisions, numeric codes are assigned within each first-level subdivision and these are concatenated with the first-level identifier, using a dot as a delimiter. For example, AFG.1.1, AFG.1.2, AFG.1.3, ..., and AFG.2.1, AFG.2.2, .... The same scheme applies to the third, fourth, and fifth levels. Finally, an underscore followed by a version number is appended to the code. For example, AFG.3_1 and AFG.3.2_1. The GID codes are persistent after version 3.6 (there were errors in the codes in version 3.4). If an area changes, for example, if it splits into two new areas, two new codes will be assigned, and the old code will not be used anymore. The version only changes when there is a major overhaul of the divisions in a country, for example when a whole new set of subdivisions is introduced.
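The GID layout described above can be illustrated with a small parser. This is only a sketch based on the textual description, not a tool supplied by GADM: the names `Gid` and `parse_gid` are hypothetical, and treating the version suffix as optional (so that a bare country code such as AFG also parses) is an assumption.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Three-letter country code, zero or more ".<number>" subdivision levels,
# and an optional "_<version>" suffix (optional suffix is an assumption).
_GID_PATTERN = re.compile(
    r"(?P<country>[A-Z]{3})(?P<levels>(?:\.\d+)*)(?:_(?P<version>\d+))?"
)


class Gid(NamedTuple):
    """Hypothetical container for the parts of a GADM GID."""
    country: str              # ISO 3166-1 alpha-3 code, e.g. "AFG"
    levels: Tuple[int, ...]   # subdivision numbers, level 1 first
    version: Optional[int]    # version number after the underscore, if any


def parse_gid(gid: str) -> Gid:
    """Split a GID such as 'AFG.3.2_1' into country, levels and version."""
    match = _GID_PATTERN.fullmatch(gid)
    if match is None:
        raise ValueError(f"not a recognised GID: {gid!r}")
    # "levels" looks like ".3.2"; drop the empty piece before the first dot.
    levels = tuple(int(part) for part in match.group("levels").split(".") if part)
    version = match.group("version")
    return Gid(match.group("country"), levels, None if version is None else int(version))


if __name__ == "__main__":
    print(parse_gid("AFG.3_1"))
    print(parse_gid("AFG.3.2_1"))
```

The length of `levels` gives the administrative level of the area, so a level 1 record such as AFG.3_1 has exactly one entry.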
Resource constraints:
The data are freely available for academic use and other non-commercial use. Redistribution or commercial use is not allowed without prior permission. See the license for more details.
Online resources:
While the presence of foreign-born footballers in national teams has a long history, it is often believed that the World Cup has become more migratory over time. The presumed increases in the volume and diversity of foreign-born footballers have, however, remained empirically untested. In this article, we empirically test whether the presence of foreign-born footballers at the World Cup has changed over time with respect to these two dimensions of migration. We conducted an analysis of 4,761 footballers, derived from the fifteen national teams that competed in at least ten editions of the World Cup between 1930 and 2018, of whom 301 were foreign-born. We argue that countries' different histories of migration, in combination with historically used citizenship regimes, largely influence the migratory dimensions of their representative football teams. Our outcomes show that the (absolute) volume of foreign-born footballers in World Cups is indeed increasing over time. Moreover, foreign-born footballers seem to come from an increasingly diverse range of countries. We, therefore, conclude that the World Cup has become more migratory in terms of volume and diversity from an immigration perspective.