In 2021, the number of mobile devices operating worldwide stood at almost 15 billion, up from just over 14 billion in the previous year. The number of mobile devices is expected to reach 18.22 billion by 2025, an increase of 4.2 billion devices compared to 2020 levels.
Moving forward with 5G
As the number of devices grows, so does our dependence on them to fulfill daily functions and activities. The use cases for mobile devices increasingly demand faster connection speeds and lower latency. The 5G network will be critical to fulfilling those demands, operating at significantly faster rates than 4G. In North America, for example, it is expected that there will be 218 million 5G connections, up from just ten million in 2020. That would represent around 48 percent of all mobile connections in North America. Globally, this share should reach 20.1 percent by 2025.
6G: looking beyond 5G
While 5G has entered commercialization and is already creating new opportunities, researchers and engineers are already experimenting with 6G. Not only will the number of mobile devices continue to grow but cellular internet-of-things (IoT) devices are set to permeate more industrial sectors in the coming years, meaning a solution will eventually be required for network congestion and data transfer speeds.
6G ought to be capable of solving those problems before they arise, potentially enabling a network connection density ten times greater than that of 5G, and peak data rates up to fifty times faster than those of 5G. The Federal Communications Commission in the United States has opened spectrum for experimentation, and China has already launched what is described as a 6G satellite, so the actual potential of 6G should be revealed over the coming decade.
The global number of smartphone users was forecast to increase continuously between 2024 and 2029 by a total of 1.8 billion users (+42.62 percent). After the ninth consecutive year of growth, the smartphone user base is estimated to reach 6.1 billion users, a new peak, in 2029. Notably, the number of smartphone users has increased continuously over the past years. Smartphone users here are limited to internet users of any age using a smartphone. The figures shown have been derived from survey data that has been processed to estimate missing demographics. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
As of June 7, 2015, Sierra Leone had reported more than 12,900 cases of Ebola Virus Disease (EVD), and over 3,900 deaths since the outbreak began. The Government of Sierra Leone, with support from the World Bank Group, has been conducting mobile phone surveys with the aim of capturing the key socio-economic effects of the virus. Three rounds of data collection have been conducted, in November 2014, January-February 2015, and May 2015. The survey was given to household heads for whom cell phone numbers were recorded during the nationally representative Labor Force Survey conducted in July and August 2014. Overall, 66 percent of the 4,199 households sampled in that survey had cell phones, although this coverage was uneven across the country, with higher levels in urban areas (82 percent) than rural areas (43 percent). Of those with cell phones, 51 percent were surveyed in all three rounds, and 79 percent were reached in at least one round.
The main focus of the data collection was to capture impacts of EVD on labor market indicators, agricultural production, food security, migration, and utilization of non-Ebola essential health services.
Due to differing characteristics between responding and non-responding households, the results should be considered “descriptive” rather than representative of the Sierra Leonean population. Overall the response rate was higher than expected given the nature of the survey and the difficult conditions under which it was conducted. In Sierra Leone, of the 4,199 households interviewed in the LFS, 65.8 percent (2,764 households) recorded a cell phone number for the household head, and, of those, 80.0 percent responded to at least one round of the cell phone survey. The unweighted sample was 59.1 percent urban (2,483 households) and 40.9 percent rural (1,716 households). Of urban households, 81.4 percent (2,021 households) listed a cell phone number for the household head, and, of those, 88.1 percent (1,780 households) responded in at least one of the three rounds of the cell phone survey. Of rural households, 43.1 percent (740 households) listed a cell phone number for the household head, and, of those, 58.1 percent (430 households) responded in at least one of the three rounds.
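The coverage and response percentages quoted above follow directly from the household counts. As a minimal sketch, the arithmetic can be reproduced as follows; the variable and key names are illustrative, and only the counts come from the text.

```python
# Household counts as reported in the text for the LFS sample.
counts = {
    "urban": {"lfs": 2483, "with_phone": 2021, "responded": 1780},
    "rural": {"lfs": 1716, "with_phone": 740, "responded": 430},
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


rates = {}
for area, c in counts.items():
    rates[area] = {
        # Share of LFS households that listed a phone number.
        "phone_coverage": pct(c["with_phone"], c["lfs"]),
        # Share of households with a phone that answered at least one round.
        "response_given_phone": pct(c["responded"], c["with_phone"]),
        # Share of all LFS households that answered at least one round.
        "response_overall": pct(c["responded"], c["lfs"]),
    }

for area, r in rates.items():
    print(area, r)
```

The last ratio is not stated in the text; it simply combines the two reported steps and shows how much smaller the reached share is in rural areas, which is one reason the results are described as descriptive rather than representative.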
All households from the 2014 Sierra Leone Labor Force Survey which provided cell phone numbers.
Sample survey data [ssd]
The sampling frame for the cell phone survey was the Sierra Leone Labor Force Survey (LFS) 2014. The LFS is a nationally representative stratified cluster sample survey conducted in July and August 2014, and includes the oversampling of urban areas. As part of the LFS, a total of 4,199 households in 280 enumeration areas were interviewed. Interviewers collected the phone number, if available, for the head of household, and 2,764 households interviewed in the LFS included phone numbers. All available numbers from the LFS were included in the cell phone survey. Phone numbers were reported for 43 percent of rural households and 82 percent of urban households. Households reporting numbers are unevenly distributed across the sample, though there is at least partial coverage in all districts, ranging from 93 percent in Freetown (Western urban) to 30 percent in Kailahun district.
Computer Assisted Telephone Interview [cati]
As the survey was administered by telephone, the length of the questionnaire was targeted at 20 to 25 minutes. In Round 1, the questionnaire focused on employment and labor market conditions, non-agricultural business operations, agricultural activity, food security, health responses (covering only fever and pregnancy), remittances, travel, trust and knowledge about Ebola. In Round 2, questions were added on social assistance and education on the radio, and there were small changes to the existing questions based on the results from Round 1.
Questions on earnings were revised to match the Labor Force Survey questions more closely, in particular to account for earnings that were expressed in time units other than months, and questions on the incidence and treatment of child diarrhea were added using identical wording to the Demographic and Health Survey (DHS). The most substantial changes were to the migration section, as the Round 1 analysis found inconsistencies in the migration reporting. Details of these changes can be found in the Round 2 report. In Round 3, the agriculture, social assistance, and education sections were expanded, while the trust section was dropped due to limited variation between Rounds 1 and 2.
The only questions specifically on Ebola Virus Disease (EVD) were in Round 1 and focused on whether the respondent had heard of Ebola and what their main sources of information were. This section was placed at the end of the questionnaire in order to elicit unbiased responses in other sections, since people may be distrustful of the government, especially regarding Ebola, at a time of such emergency.
Questions related directly to the incidence of EVD within the household were excluded for two reasons. First, EVD is a relatively rare event and the sample was unlikely to yield sufficient observations for meaningful analysis. Second, because respondents would be called repeatedly as part of the high-frequency survey, it was necessary to avoid sensitive questions that might increase attrition in later rounds. The included questions were worded so as to facilitate difference-in-differences comparisons. The vast majority of questions were identical in wording to those asked during the LFS or other nationally representative surveys for which detailed data were available, including the DHS, the National Public Services Survey (NPS) and the Agricultural Households Tracking Survey (AHTS).
In a few cases, the time period over which the questions were asked was shortened to make it relevant to the last few months during which the outbreak has been growing. For example, the NPS asked about remittances in the last year whereas in November 2014, respondents were asked about remittances received in the last month.
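Because the phone survey reuses question wording from earlier surveys, a pre-outbreak baseline and a post-outbreak follow-up can be compared for two groups of households. The sketch below shows the basic difference-in-differences calculation. The grouping into "treated" and "comparison" households and all numbers are hypothetical and are not survey results.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the comparison group."""
    return (treated_after - treated_before) - (control_after - control_before)


# Hypothetical employment rates in percent (illustration only):
# "before" stands for the baseline survey, "after" for a phone survey round.
effect = diff_in_diff(
    treated_before=70.0,
    treated_after=60.0,
    control_before=68.0,
    control_after=65.0,
)
print(effect)
```

The estimate is only meaningful when the indicator is measured in the same way in both periods, which is why identical wording and comparable recall periods matter.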
The datasets were cleaned and compiled by teams from Innovations for Poverty Action and the World Bank's Poverty Global Practice and Social Protection and Labor Global Practice.
The population share with mobile internet access in North America was forecast to increase between 2024 and 2029 by a total of 2.9 percentage points. This overall increase does not happen continuously, notably not in 2028 and 2029. Mobile internet penetration is estimated to amount to 84.21 percent in 2029. Notably, the population share with mobile internet access has increased continuously over the past years. The penetration rate refers to the share of the total population having access to the internet via a mobile broadband connection. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
https://spdx.org/licenses/CC0-1.0.html
We are publishing a walking activity dataset that includes inertial and positioning information from 19 volunteers, together with a reference distance measured using a trundle wheel. The dataset covers a total of 96.7 km walked by the volunteers, split into 203 separate tracks. Two types of trundle wheel were used: an analogue trundle wheel, which provides the total number of meters walked in a single track, and a sensorized trundle wheel, which measures every revolution of the wheel and therefore records a continuous incremental distance.
Each track has data from the accelerometer and gyroscope embedded in the phones, location information from the Global Navigation Satellite System (GNSS), and the step count obtained by the device. The dataset can be used to implement walking distance estimation algorithms and to explore data quality in the context of walking activity and physical capacity tests, fitness, and pedestrian navigation.
Methods
The proposed dataset is a collection of walks where participants used their own smartphones to capture inertial and positioning information. The participants involved in the data collection come from two sites. The first site is the Oxford University Hospitals NHS Foundation Trust, United Kingdom, where 10 participants (7 affected by cardiovascular diseases and 3 healthy individuals) performed unsupervised six-minute walk tests (6MWTs) in an outdoor environment of their choice (ethical approval was obtained from the UK National Health Service Health Research Authority, protocol reference number 17/WM/0355). All participants involved provided informed consent. The second site is Malmö University, in Sweden, where a group of 9 healthy researchers collected data. This dataset can be used by researchers to develop distance estimation algorithms and to study how data quality impacts the estimation.
All walks were performed by holding a smartphone in one hand, with an app collecting inertial data, the GNSS signal, and the step count. In the other, free hand, participants held a trundle wheel to obtain the ground truth distance. Two different trundle wheels were used: an analogue trundle wheel that allowed a single total value of walked distance to be registered, and a sensorized trundle wheel that collected timestamps and distance at every 1-meter revolution, resulting in continuous incremental distance information. The latter configuration is innovative and allows temporal windows of the IMU data to be used as input to machine learning algorithms that estimate walked distance. In the case of data collected by researchers, if the walks were done simultaneously and at a close distance from each other, only one person used the trundle wheel, and the reference distance was associated with all walks that were collected at the same time.
The walked paths are of variable length, duration, and shape. Participants were instructed to walk paths of increasing curvature, from straight to rounded. Irregular paths are particularly useful in determining limitations in the accuracy of walked distance algorithms.
Two smartphone applications were developed for collecting the information of interest from the participants' devices, both available for Android and iOS operating systems. The first is a web application that retrieves inertial data (acceleration, rotation rate, orientation) while connecting to the sensorized trundle wheel to record incremental reference distance [1]. The second app is the Timed Walk app [2], which guides the user in performing a walking test by signalling when to start and when to stop the walk while collecting both inertial and positioning data. All participants in the UK used the Timed Walk app.
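To illustrate how the sensorized trundle wheel can supply reference labels for temporal windows, the sketch below counts wheel revolutions per fixed-length window. The function name, the window length, and the sample timestamps are assumptions made for illustration; the actual file layout of the dataset may differ.

```python
from bisect import bisect_right


def window_distances(rev_times_s, window_s, total_s, metres_per_rev=1.0):
    """Reference distance covered in each consecutive time window.

    rev_times_s: sorted timestamps (seconds from the start of the walk)
    at which the wheel completed a revolution.
    """
    distances = []
    start = 0.0
    while start < total_s:
        end = min(start + window_s, total_s)
        # Count revolutions with start < t <= end.
        n = bisect_right(rev_times_s, end) - bisect_right(rev_times_s, start)
        distances.append(n * metres_per_rev)
        start = end
    return distances


# Hypothetical revolution timestamps for a 10-second excerpt.
rev_times = [0.8, 1.6, 2.5, 3.3, 4.1, 5.2, 6.0, 6.9, 7.7, 8.8]
labels = window_distances(rev_times, window_s=4.0, total_s=10.0)
print(labels)
```

Each label can then be paired with the IMU samples recorded in the same window. With an analogue wheel only the total over the whole track is available, so per-window labels of this kind cannot be built.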
The data collected during the walk come from the Inertial Measurement Unit (IMU) of the phone and, when available, the Global Navigation Satellite System (GNSS). In addition, the step count is retrieved from the sensors embedded in each participant’s smartphone. With the dataset, we provide a descriptive table with the characteristics of each recording, including the brand and model of the smartphone, the duration, the reference total distance, and the types of signals included, as well as scores for some relevant parameters related to the quality of the various signals. The path curvature is one of the most relevant parameters. Previous work from our team confirmed the negative impact of curved paths on multiple distance estimation algorithms [3]. We visually inspected the walked paths and clustered them in three groups: a) straight path, i.e. no turns wider than 90 degrees, b) gently curved path, i.e. between one and five turns wider than 90 degrees, and c) curved path, i.e. more than five turns wider than 90 degrees. Other features relevant to the quality of the collected signals are the total time during which inertial or GNSS data were missing for longer than a threshold (0.05 s and 6 s, respectively), either because of technical issues or because the app went into the background and lost access to the sensors; the sampling frequency of the different data streams; the average walking speed; and the smartphone position. The start of each walk is set to 0 ms, so no absolute time information is reported. Walk locations collected in the UK are anonymized using the following approach: the first position is fixed to a central location in the city of Oxford (latitude: 51.7520, longitude: -1.2577) and all other positions are reassigned by applying a translation along the longitudinal and latitudinal axes which maintains the original distance and angle between samples. This way, the exact geographical location is lost, but the path shape and distances between samples are maintained. The distances between consecutive points “as the crow flies” and the path curvature were inspected numerically and visually to confirm that they match those of the original walks. Computations were made possible by using the Haversine Python library.
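The anonymization step can be sketched as a constant shift of every coordinate, followed by a check that the distances between consecutive points are preserved. This is a minimal illustration under stated assumptions, not the authors' code: the haversine formula is written out with the standard library instead of calling the Haversine package, and the sample track is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, a common approximation


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def translate_track(track, anchor):
    """Shift every point by the offset that moves the first point onto anchor."""
    dlat = anchor[0] - track[0][0]
    dlon = anchor[1] - track[0][1]
    return [(lat + dlat, lon + dlon) for lat, lon in track]


def step_lengths(track):
    """Distances between consecutive points, in metres."""
    return [haversine_m(a, b) for a, b in zip(track, track[1:])]


OXFORD = (51.7520, -1.2577)  # reference position given in the text
# Hypothetical original positions (illustration only).
original = [(51.7600, -1.2500), (51.7605, -1.2490),
            (51.7612, -1.2493), (51.7615, -1.2480)]
shifted = translate_track(original, OXFORD)

max_error = max(abs(a - b)
                for a, b in zip(step_lengths(original), step_lengths(shifted)))
print(shifted[0], max_error)
```

A plain shift in degrees preserves distances only approximately, because the length of one degree of longitude depends on latitude. For shifts within the same city the difference is very small, which is consistent with checking the step lengths numerically after the translation.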
Multiple datasets are available on the recognition of walking among other daily living tasks. However, few studies are published with datasets that focus on distance in both indoor and outdoor environments and that provide relevant ground truth information for it. Yan et al. [4] introduced an inertial walking dataset for indoor scenarios, collected by six healthy participants using a smartphone placed in 4 positions (on the leg, in a bag, in the hand, and on the body). The reference measurement used in this study is a Visual Odometry System embedded in a smartphone that has to be worn at chest level, using a strap to hold it. While interesting and detailed, this dataset lacks GNSS data, which is likely to be used in outdoor scenarios, and the reference used for localization also suffers from accuracy issues, especially outdoors. Vezovcnik et al. [5] analysed estimation models for step length and provided an open-source dataset with a total of 22 km of inertial-only walking data from 15 healthy adults. While relevant, their dataset focuses on steps rather than total distance and was acquired on a treadmill, which limits its validity in real-world scenarios. Kang et al. [6] proposed a way to estimate travelled distance with an Android app that matches each participant's outdoor walking patterns to indoor contexts. They collect data outdoors, including both inertial and positioning information, and use average speed values obtained from the GPS data as reference labels. They then use deep learning models to estimate walked distance, obtaining high performance. They report that 3% to 11% of the data for each participant was discarded due to low quality. Unfortunately, the name of the app is not reported and the paper does not mention whether the dataset can be made available.
This dataset is heterogeneous in several respects. It includes a majority of healthy participants, so the outcomes from this dataset cannot be generalized to all walking styles or physical conditions. The dataset is also heterogeneous from a technical perspective, given the differences in devices, acquired data, and smartphone apps used (e.g. some tests lack IMU or GNSS data, and the sampling frequency on iPhones was particularly low). We suggest selecting the appropriate tracks based on the desired characteristics to obtain reliable and consistent outcomes.
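Selecting tracks by their characteristics could look like the sketch below, which applies the three curvature groups described earlier and filters on signal availability. The field names (`wide_turns`, `has_gnss`, `imu_hz`) and the rows are hypothetical and do not reflect the headers of the published descriptive table.

```python
def curvature_group(wide_turns):
    """Map the number of turns wider than 90 degrees to a curvature group."""
    if wide_turns == 0:
        return "straight"
    if wide_turns <= 5:
        return "gently curved"
    return "curved"


# Hypothetical rows standing in for the descriptive table.
tracks = [
    {"track": "A", "wide_turns": 0, "has_gnss": True, "imu_hz": 100},
    {"track": "B", "wide_turns": 3, "has_gnss": False, "imu_hz": 50},
    {"track": "C", "wide_turns": 8, "has_gnss": True, "imu_hz": 20},
]

# Keep tracks with GNSS, an IMU rate of at least 50 Hz, and no strongly curved path.
selected = [
    t["track"]
    for t in tracks
    if t["has_gnss"]
    and t["imu_hz"] >= 50
    and curvature_group(t["wide_turns"]) != "curved"
]
print(selected)
```

The 50 Hz cut-off is an arbitrary example; a suitable threshold depends on the algorithm being evaluated.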
This dataset allows researchers to develop algorithms to compute walked distance and to explore data quality and reliability in the context of walking activity. The dataset was initiated to investigate the digitalization of the 6MWT. However, the collected information can also be useful for other physical capacity tests that involve walking (distance- or duration-based), or for other purposes such as fitness and pedestrian navigation.
The article related to this dataset will be published in the proceedings of the IEEE MetroXRAINE 2024 conference, held in St. Albans, UK, 21-23 October.
This research is partially funded by the Swedish Knowledge Foundation and the Internet of Things and People research center through the Synergy project Intelligent and Trustworthy IoT Systems.
By Amber Thomas
This dataset provides an estimation of broadband usage in the United States, focusing on how many people have access to broadband and how many are actually using it at broadband speeds. Through data collected by Microsoft from our services, including package size and total time of download, we can estimate the throughput speed of devices connecting to the internet across zip codes and counties.
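As a rough illustration of how a throughput figure follows from package size and download time, consider the sketch below. The function name and the sample values are assumptions; the source does not describe Microsoft's actual estimation procedure in detail.

```python
def throughput_mbps(package_bytes, download_seconds):
    """Average throughput in megabits per second (1 Mb = 1,000,000 bits)."""
    if download_seconds <= 0:
        raise ValueError("download time must be positive")
    return package_bytes * 8 / download_seconds / 1_000_000


# Hypothetical download: a 50 MB package received in 16 seconds.
speed = throughput_mbps(package_bytes=50_000_000, download_seconds=16.0)
print(speed)
```

A single download measured this way reflects the whole path between server and device, so aggregating many observations per location gives a more stable estimate than any individual measurement.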
According to Federal Communications Commission (FCC) estimates, 14.5 million people don't have access to any kind of broadband connection. This data set aims to address the gap between estimated availability and actual use by providing more accurate usage numbers, downscaled to county and zip code levels. Who gets counted as having access is vastly important, because it determines who is included in public funding opportunities dedicated to closing the digital divide. The implications can be huge: millions of people around the country could remain invisible if these numbers aren't accurately reported or properly used in decision-making processes.
This dataset includes aggregated information about these locations with fewer than 20 devices for increased accuracy when estimating broadband usage in the United States, allowing others to use it to develop solutions that improve internet access or to accurately identify problem areas where no reliable connectivity exists, in communities large and small throughout the US mainland. Please review the license terms before using these data so that you comply with Microsoft's Open Use of Data Agreement v1.0, whether for professional or educational purposes.
How to Use the US Broadband Usage Dataset
This dataset provides broadband usage estimates in the United States by county and zip code. It is ideally suited for research into how broadband connects households, towns and cities. Understanding this information is vital for closing existing disparities in access to high-speed internet, and for devising strategies for making sure all Americans can stay connected in a digital world.
The dataset contains six columns:
- County – The name of the county for which usage statistics are provided.
- Zip Code (5-Digit) – The 5-digit zip code in which usage data was collected, within that county or metropolitan area/micro area/division within states as reported by the US Census Bureau in 2018[2].
- Population (Households) – Estimated number of households defined according to [3] based on data from the US Census Bureau American Community Survey's 5 Year Estimates[4].
- Average Throughput (Mbps) – Average download speed in Mbps derived from a combination of data collected from anonymous devices connected through Microsoft services such as Windows Update, Office 365, Xbox Live Core Services, etc.[5]
- Percent Fast (> 25 Mbps) – Percentage of machines with throughput greater than 25 Mbps calculated using [6].
- Percent Slow (< 3 Mbps) – Percentage of machines with throughput less than 3 Mbps calculated using [7].
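The two percentage columns can be read as shares of devices above or below a throughput threshold. The sketch below shows that calculation on hypothetical per-device values; it does not reproduce the methods cited as [6] and [7], which are not described here.

```python
def share_percent(values, predicate):
    """Percentage of values for which predicate is true."""
    return 100.0 * sum(1 for v in values if predicate(v)) / len(values)


# Hypothetical per-device throughputs in Mbps for one zip code.
throughputs = [1.2, 2.8, 10.0, 26.5, 40.0, 55.3, 3.0, 25.0]

percent_fast = share_percent(throughputs, lambda v: v > 25)  # strictly above 25 Mbps
percent_slow = share_percent(throughputs, lambda v: v < 3)   # strictly below 3 Mbps
average_throughput = sum(throughputs) / len(throughputs)

print(percent_fast, percent_slow, round(average_throughput, 2))
```

Devices at exactly 25 Mbps or 3 Mbps fall into neither group here because the thresholds are written as strict inequalities in the column names.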
- Targeting marketing campaigns based on broadband use. Companies can use the geographic and demographic data in this dataset to create targeted advertising campaigns that are tailored to individuals living in areas where broadband access is scarce or lacking.
- Creating an educational platform for those without reliable access to broadband internet. By leveraging existing technologies such as satellite internet, media streaming services like Netflix, and platforms such as Khan Academy or EdX, those with limited access could gain access to new educational options from home.
- Establishing public-private partnerships between local governments and telecom providers, which need better data about gaps in service coverage and usage levels to decide on investments in new infrastructure that improves connectivity for rural communities.
If you use this dataset in your research, please credit the original authors.
See the dataset description for more information.
File: broadband_data_2020October.csv
Access to up-to-date socio-economic data is a widespread challenge in Solomon Islands and other Pacific Island Countries. To increase data availability and promote evidence-based policymaking, the Pacific Observatory provides innovative solutions and data sources to complement existing survey data and analysis. One of these data sources is a series of High Frequency Phone Surveys (HFPS), which began in 2020 as a way to monitor the socio-economic impacts of the COVID-19 Pandemic, and since 2023 has grown into a series of continuous surveys for socio-economic monitoring. See https://www.worldbank.org/en/country/pacificislands/brief/the-pacific-observatory for further details.
For Solomon Islands, after five rounds of data collection from 2020-2020, a monthly HFPS data collection commenced in April 2023 and continued for 18 months (ending September 2024), on topics including employment, income, food security, health, food prices, assets and well-being. Fieldwork took place in two non-consecutive weeks of each month. Data for April 2023-December 2023 were a repeated cross section, while January 2024 established the first month of a panel, which was continued to September 2024. Each month has approximately 550 households in the sample and is representative of urban and rural areas, but is not representative at the province level. This dataset contains combined monthly survey data for all months of the continuous HFPS in Solomon Islands. There is one data file for household-level data, with a unique household ID, and a separate file for individual-level data within each household. The individual file can be matched to the household file using the household ID, and it also has a unique individual ID within the household, which can be used to track individuals over time within households where the data are panel data.
Urban and rural areas of Solomon Islands.
Household, individual.
Sample survey data [ssd]
The initial sample was drawn through Random Digit Dialing (RDD) with geographic stratification. As an objective of the survey was to measure changes in household economic wellbeing over time, the HFPS sought to contact a consistent number of households across each province month to month. This was initially a repeated cross section from April 2023-Dec 2023. The initial sample was drawn from information provided by a major phone service provider in Solomon Islands, covering all the provinces in the country. It had a probability-based weighted design, with a proportionate stratification to achieve geographical representation. The geographical distribution compared to the 2019 Census is listed below for the first month of the HFPS monthly survey:
Choiseul: Census: 4.3%, HFPS: 5.2%
Western: Census: 14.4%, HFPS: 13.7%
Isabel: Census: 4.8%, HFPS: 4.7%
Central: Census: 3.6%, HFPS: 5.2%
Ren Bell: Census: 0.6%, HFPS: 1.4%
Guadalcanal: Census: 19.8%, HFPS: 21.1%
Malaita: Census: 23.1%, HFPS: 18.7%
Makira: Census: 5.6%, HFPS: 5.6%
Temotu: Census: 3.0%, HFPS: 3%
Honiara: Census: 20.7%, HFPS: 21.3%
Source: Census of Population and Housing 2019
Note: The values in the HFPS column represent the proportion of survey participants residing in each province, based on the raw HFPS data from April.
In April 2023, the geographic distribution of World Bank HFPS participants was generally similar to that of the census data at the province level, though within provinces, areas with less mobile phone connectivity are likely to be underrepresented. One indication of this is that urban areas constituted 38.2 percent of the survey sample, which is a slight overrepresentation, compared to 32.5 percent in the Census 2019.
A monthly panel was established in January 2024 and is ongoing as of March 2025. In each subsequent month after January 2024, the survey firm would first attempt to contact all households from the previous month and then attempt to contact households from earlier months that had dropped out. After previous numbers were exhausted, RDD with geographic stratification was used for replacement households. Across all months of the survey, a total of 9,926 interviews were completed.
Computer Assisted Telephone Interview [cati]
The questionnaire, which can be found in the External Resources of this documentation, is available in English, with Solomons Pijin translation. There were few changes to the questionnaire across the survey months, but some sections were only introduced in 2024, namely energy access questions and questions to inform the baseline data of the Solomon Islands Government Integrated Economic Development and Climate Resilience (IEDCR) project.
The raw data were cleaned by the World Bank team using STATA. This included formatting and correcting errors identified through the survey’s monitoring and quality control process. The data are presented in two datasets: a household dataset and an individual dataset. The total number of observations is 9,926 in the household dataset and 62,054 in the individual dataset. The individual dataset contains information on individual demographics and labor market outcomes of all household members aged 15 and above, and the household data set contains information about household demographics, education, food security, food prices, household income, agriculture activities, social protection, access to services, and durable asset ownership. The household identifier (hhid) is available in both the household dataset and the individual dataset. The individual identifier (id_member) can be found in the individual dataset.
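Linking the two files on the household identifier can be sketched as below. Only `hhid` and `id_member` are named in this documentation; the other fields (`urban`, `age`) and all values are hypothetical, and the rows stand in for records read from the actual data files.

```python
from collections import defaultdict

# Hypothetical household-level records.
households = [
    {"hhid": "H001", "urban": 1},
    {"hhid": "H002", "urban": 0},
]
# Hypothetical individual-level records.
individuals = [
    {"hhid": "H001", "id_member": 1, "age": 34},
    {"hhid": "H001", "id_member": 2, "age": 17},
    {"hhid": "H002", "id_member": 1, "age": 52},
]

# Index households by hhid, then attach household fields to each individual.
hh_by_id = {h["hhid"]: h for h in households}
merged = [
    {**hh_by_id[p["hhid"]], **p}
    for p in individuals
    if p["hhid"] in hh_by_id
]

# Count the individual records available for each household.
members_per_household = defaultdict(int)
for p in individuals:
    members_per_household[p["hhid"]] += 1

print(merged[0], dict(members_per_household))
```

For panel months, the pair (`hhid`, `id_member`) identifies a person across survey months, so it is the natural key when following individuals over time.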
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Please cite our paper if you publish material based on these datasets:
G. Khodabandelou, V. Gauthier, M. El-Yacoubi, M. Fiore, "Estimation of Static and Dynamic Urban Populations with Mobile Network Metadata", in IEEE Trans. on Mobile Computing, 2018 (in Press). 10.1109/TMC.2018.2871156
Abstract
Communication-enabled devices that are physically carried by individuals are today pervasive, which opens unprecedented opportunities for collecting digital metadata about the mobility of large populations. In this paper, we propose a novel methodology for the estimation of people density at metropolitan scales, using subscriber presence metadata collected by a mobile operator. We show that our approach suits the estimation of static population densities, i.e., of the distribution of dwelling units per urban area contained in traditional censuses. Specifically, it achieves higher accuracy than that granted by previous equivalent solutions. In addition, our approach enables the estimation of dynamic population densities, i.e., the time-varying distributions of people in a conurbation. Our results build on significant real-world mobile network metadata and relevant ground-truth information in multiple urban scenarios.
Dataset Columns
This dataset covers one month of data, collected in April 2015, for three Italian cities: Rome, Milan and Turin. The raw data was provided during the Telecom Italia Big Data Challenge (http://www.telecomitalia.com/tit/en/innovazione/archivio/big-data-challenge-2015.html)
1. grid_id: the coordinate of the grid can be retrieved with the shapefile of a given city
2. date: format Y-M-D H:M:S
4. landuse_label: the land use label, computed through the method described in [2]
5. population: Census population of a given grid block as defined by the Istituto nazionale di statistica (ISTAT https://www.istat.it/en/censuses) in 2011
6. estimation: Dynamic population density estimation (in persons) as the result of the method described in [1]
7. area: surface of the "grid id" considered in km^2
8. geometry: the shape of the area considered with the EPSG:3003 coordinate system (only with quilt)
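As a rough illustration of how these columns fit together, the sketch below derives a density in persons per km^2 from estimation and area, and compares the dynamic estimate with the static census count. It assumes pandas is installed; the rows are invented toy values, not taken from the dataset.

```python
import pandas as pd

# Toy rows using the documented column names; all values are invented.
df = pd.DataFrame({
    "grid_id": [1, 1, 2, 2],
    "date": ["2015-04-01 08:00:00", "2015-04-01 20:00:00",
             "2015-04-01 08:00:00", "2015-04-01 20:00:00"],
    "population": [500, 500, 1200, 1200],
    "estimation": [650.0, 480.0, 900.0, 1300.0],
    "area": [0.25, 0.25, 0.5, 0.5],
})

# Parse the documented Y-M-D H:M:S timestamp format.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d %H:%M:%S")

# Dynamic density in persons per km^2.
df["density"] = df["estimation"] / df["area"]

# Ratio of the dynamic estimate to the static census population.
df["ratio_to_census"] = df["estimation"] / df["population"]

# Average estimated density per grid cell over the observed timestamps.
mean_density = df.groupby("grid_id")["density"].mean()
print(mean_density)
```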
Note
Due to legal constraints, we cannot share directly the original data from the Telecom Italia Big Data Challenge we used to build this dataset.
Easy access to this dataset with quilt
Install the dataset repository:
$ quilt install vgauthier/DynamicPopEstimate
Use the dataset with a pandas DataFrame
>>> from quilt.data.vgauthier import DynamicPopEstimate
>>> import pandas as pd
>>> df = pd.DataFrame(DynamicPopEstimate.rome())
Use the dataset with a GeoPandas GeoDataFrame
>>> from quilt.data.vgauthier import DynamicPopEstimate
>>> import geopandas as gpd
>>> df = gpd.GeoDataFrame(DynamicPopEstimate.rome())
References
[1] G. Khodabandelou, V. Gauthier, M. El-Yacoubi, M. Fiore, "Population estimation from mobile network traffic metadata", in proc of the 17th International Symposium on A World of Wireless, Mobile and Multimedia Networks (WoWMoM), pp. 1 - 9, 2016.
[2] A. Furno, M. Fiore, R. Stanica, C. Ziemlicki, and Z. Smoreda, "A tale of ten cities: Characterizing signatures of mobile traffic in urban areas," IEEE Transactions on Mobile Computing, Volume: 16, Issue: 10, 2017.
Like the rest of the world, Sudan has been experiencing the unprecedented social and economic impact of the COVID-19 pandemic. From restrictions on movement to school closures and lockdowns, the economic situation worsened, and commodity prices soared across the country. Results from the first six rounds of the High-Frequency Phone survey indicated that household welfare was negatively affected. The situation led to the loss of employment and income, decreased access to essential commodities and services, and food insecurity, particularly among poor and vulnerable Sudanese. Moreover, access to food and medicine deteriorated in July/August 2021 despite a slight improvement in February/April 2021.
After COVID-19 in 2020, Sudan experienced further shocks that are likely to compromise the recovery process. Political instability, unrest, and protests occurred before and after the military takeover in October 2021. Meanwhile, the Central Bank of Sudan devalued the currency, which may push already high commodity prices higher. In addition, Sudan has faced historic flooding since the onset of the rainy season between May and June 2022. To monitor and assess the dynamics of the impacts of the country's economic and political situation (high inflation, social unrest, food shortages, asset loss, displacement, etc.) on households' welfare, another round of the Sudan High-Frequency Phone survey took place from June to August 2022.
Similar to the six previous rounds, the survey was conducted using mobile phones and covered all 18 states of Sudan. The Round 7 sample is composed of 2,816 households from both urban and rural areas of Sudan. This sample allows us to draw statistical inferences about the Sudanese population at the national and rural/urban levels. The risk of nonresponse was a concern, so efforts were made to minimize it, including following up with respondents who failed to respond and keeping the interviews short (15–20 minutes) to reduce respondent fatigue.
The questions are similar to the previous six rounds of the High-Frequency Phone survey but with added context. Households are asked about the key channels through which individuals and households are expected to be affected by the exchange rate distortions, country political instability, or flooding that occurred in May/June 2022, as well as how they have recovered from the COVID-19 pandemic impacts. Furthermore, questions cover a range of topics/themes including, but not limited to, health conditions, access to health facilities, access to other social services, availability of common food and non-food items (including medicines), nutrition and food security, employment/labor, income, assets, coping strategies, remittances, subjective welfare, climate/weather events, and the safety nets assistance.
National
The sampling methodology adopted for the implementation of this survey is probabilistic: each unit in the target population must have a known, nonzero probability of selection. The sample was stratified by rural/urban for all 18 states. The distribution of the sub-sample between states and rural/urban is proportional to the number of individuals owning mobile phones, i.e., the allocation is not equal across strata. The selection of the individual phones (the households) is random, i.e., with equal probability, using a systematic sampling procedure in the list (frame) of phones. This allows for extrapolating the results of the sample to the target population and estimating the precision of the results obtained. However, the implementation of this approach requires the availability of an adequate sampling frame containing all the units of the population without omissions or duplications.
In this survey, the sampling frame is provided by the phone lists. Considerable efforts were made to compile the frame using multiple lists of phone numbers collected at the household level during the implementation of various projects/surveys across the country over the last few years. This reduces the chances of having more than one phone number per household. Moreover, the interviewers double-checked during data collection that only one number was called for each selected household. Therefore, selecting individual phone numbers is the same as selecting households. It is worth noting that for West Kordofan and Central Darfur, the rural/urban allocation could not be made proportional to the number of phones since the phone lists carry no rural/urban detail. The size of the rural and urban populations (2020 projection) was used instead.
In Sudan, under the present federal system, the state is considered a semiautonomous entity mandated to take care of the affairs of the citizen, provide governance, and be responsible for planning, policy formulation, and implementation of the annual program. Consequently, the sample needed to cover all 18 states of the country. The sample is conceived to provide reliable estimates for the country (urban and rural) and to give statistically meaningful results at the national level.
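The two steps described above, proportional allocation across strata followed by systematic selection within each phone list, can be sketched as follows. This is a minimal illustration under stated assumptions, not the survey team's procedure: the function names, the stratum phone counts, and the sample size of 100 are all hypothetical.

```python
import random

def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    raw = {s: total_sample * n / population for s, n in stratum_sizes.items()}
    alloc = {s: int(v) for s, v in raw.items()}
    # Largest-remainder rounding so the allocations sum to total_sample.
    leftover = total_sample - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda s: raw[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

def systematic_sample(frame, n, rng):
    """Pick n units from frame at a fixed interval after a random start."""
    step = len(frame) / n
    start = rng.random() * step
    last = len(frame) - 1
    return [frame[min(int(start + i * step), last)] for i in range(n)]

# Hypothetical phone counts per stratum (state x rural/urban).
stratum_sizes = {
    ("Khartoum", "urban"): 6000,
    ("Khartoum", "rural"): 1500,
    ("Gezira", "urban"): 1000,
    ("Gezira", "rural"): 1500,
}
rng = random.Random(0)
alloc = proportional_allocation(stratum_sizes, 100)

# Each stratum's frame is a list of phone identifiers (integers here).
samples = {
    s: systematic_sample(list(range(size)), alloc[s], rng)
    for s, size in stratum_sizes.items()
}
print(alloc)
```

Within a stratum every phone has the same selection probability, which is what allows sample results to be extrapolated to the population of phone owners.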
Computer Assisted Telephone Interview [cati]
BASELINE (ROUND 1): One questionnaire, the Household Questionnaire, was administered to all households in the sample. The Household Questionnaire provides information on: - Demographics - Knowledge regarding the spread of COVID-19 - Behavior and social distancing - Access to basic goods and services (medicines, staple food, health, education, financial services) - Employment - Income loss - Food insecurity experience - Welfare - Shocks and Coping strategies - Social safety nets
ROUND 2: One questionnaire, the Household Questionnaire, was administered to all households in the sample. The Household Questionnaire provides information on: - Demographics - Knowledge regarding the spread of COVID-19 - Behavior and social distancing - Access to basic goods and services (medicines, staple food, health, education, financial services, water, transportation, housing, internet, energy) - Employment - Income loss - Food insecurity experience - Welfare - Shocks and Coping strategies - Social safety nets
ROUND 3: One questionnaire, the Household Questionnaire, was administered to all households in the sample. The Household Questionnaire provides information on: - Demographics - Behavior and social distancing - Access to basic goods and services (medicines, staple food, health, education, financial services) - Employment - Income loss - Food insecurity experience - Welfare - Shocks and Coping strategies - Social safety nets
ROUND 4: One questionnaire, the Household Questionnaire, was administered to all households in the sample. The Household Questionnaire provides information on: - Demographics - Youth module screening - Behavior and social distancing - Access to basic goods and services (medicines, staple food, health, education, transportation, fuel) - Employment - Income loss - Food insecurity experience - Welfare - Shocks and Coping strategies - Social safety nets
ROUND 5: One questionnaire, the Household Questionnaire, was administered to all households in the sample. Respondents were asked to think about each child in their household for the education question. The Household Questionnaire provides information on: - Demographics - Mental health of the respondent - Children's education.
ROUND 6: One questionnaire, the Household Questionnaire, was administered to all households in the sample. One youth per household is interviewed in the youth section of the questionnaire. The Questionnaire provides information on: - Demographics - Access to basic goods (medicines, staple food) - Youth employment - Youth job search - Youth aspirations and expectations - Youth skills and mental health.
ROUND 7: One questionnaire, the Household Questionnaire, was administered to all households in the sample. The Household Questionnaire provides information on: - Geography - Access to basic goods and services (medicines, staple food, health, education, water, housing, electricity) - Employment - Income loss - Food insecurity experience - Welfare - Experience of Climate/Weather events - Shocks and Coping strategies
BASELINE (ROUND 1): A total of 4,032 households were successfully interviewed during the first round of data collection (conducted during June 16–July 5, 2020). Selected households from each state include both rural and urban households, with the representation of each state in the final sample being proportional to the state’s population relative to the overall population. Households who refused to tell their location (mode of living and state) were dropped to minimize bias. The final sample comprises 4,027 households.
ROUND 2: Interviewers attempted to contact and interview all 4,032 households that were successfully interviewed in the baseline of the Sudan HFS on COVID-19. 2,989 households were successfully interviewed in the second round. However, households who refused to tell their location (mode of living and state) were dropped to minimize bias. The final sample comprises 2,987 households.
ROUND 3: Interviewers attempted to contact and interview all 4,032 households that were successfully interviewed in the baseline of the Sudan HFS on COVID-19. 2,990 households were successfully interviewed in the third round. Households who refused to tell their location (mode of living and state) were dropped to minimize bias. The final sample comprises 2,987 households.
ROUND 4: Interviewers attempted to contact and interview all 4,032 households that were successfully interviewed in the Baseline of the Sudan
Global smartphone penetration was forecast to increase continuously between 2024 and 2029 by a total of 20.3 percentage points. After the fifteenth consecutive year of growth, penetration is estimated to reach 74.98 percent in 2029, a new peak. Notably, smartphone penetration has increased continuously over the past years. The penetration rate refers to the share of the total population. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for smartphone penetration in regions such as North America and the Americas.
Datasys helps you understand and engage your audience wherever they are. Every day, we process up to 600 million privacy compliant records of online behavior, giving you a clear, actionable view of how consumers interact with your brand across websites, devices, and platforms.
In a world where people use multiple devices throughout their day, traditional cookies alone can’t tell the whole story. Datasys combines cookies with modern, privacy-friendly identifiers to connect the dots, making your targeting more accurate and your campaigns more effective.
Reach the right people at the right time on the right screen, all while respecting their privacy and building trust.
https://datacatalog.worldbank.org/public-licenses?fragment=research
The World Bank, in collaboration with the Kenya National Bureau of Statistics and the University of California, Berkeley, is conducting the Kenya COVID-19 Rapid Response Phone Survey (RRPS) to track the socioeconomic impacts of the COVID-19 pandemic, the recovery from it, and other shocks, in order to provide timely data to inform policy. This dataset contains information from eight waves of the COVID-19 RRPS, which is part of a panel survey that targets Kenyan nationals and started in May 2020. The same households were interviewed every two months for five survey rounds in the first year of data collection and every four months thereafter, with interviews conducted using Computer Assisted Telephone Interviewing (CATI) techniques.
The data set contains information from two samples of Kenyan households. The first sample is a randomly drawn subset of all households that were part of the 2015/16 Kenya Integrated Household Budget Survey (KIHBS) Computer-Assisted Personal Interviewing (CAPI) pilot and provided a phone number. The second was obtained through the Random Digit Dialing method, by which active phone numbers created from the 2020 Numbering Frame produced by the Kenya Communications Authority are randomly selected. The samples cover urban and rural areas and are designed to be representative of the population of Kenya using cell phones. Waves 1-7 of this survey include information on household background, service access, employment, food security, income loss, transfers, health, and COVID-19 knowledge and vaccinations. Wave 8 focused on how households were exposed to shocks, in particular adverse weather shocks and the increase in the price of food and fuel, but also included parts of the previous modules on household background, service access, employment, food security, income loss, and subjective wellbeing.
The data are uploaded in three files. The first is the hh file, which contains household-level information. The ‘hhid’ variable uniquely identifies each household. The second is the adult-level file, which contains data at the level of adult household members. Each adult in a household is uniquely identified by the ‘adult_id’. The third file is the child-level file, available only for waves 3-7, which contains information for every child in the household. Each child in a household is uniquely identified by the ‘child_id’.
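Since adult_id and child_id identify members within a household, a natural key for the member-level files is the pair of hhid and the member identifier. The sketch below checks that assumption on toy rows for a single wave. The helper names and all values are hypothetical; the real files would also need the wave taken into account, since the survey is a panel.

```python
from collections import Counter

# Toy rows for one wave; only hhid, adult_id and child_id are documented.
hh_rows = [{"hhid": 1}, {"hhid": 2}]
adult_rows = [
    {"hhid": 1, "adult_id": 1},
    {"hhid": 1, "adult_id": 2},
    {"hhid": 2, "adult_id": 1},
]
child_rows = [{"hhid": 1, "child_id": 1}, {"hhid": 2, "child_id": 1}]

def duplicate_keys(rows, key_fields):
    """Return the key tuples that occur more than once in rows."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return [key for key, n in counts.items() if n > 1]

def orphan_hhids(rows, known_hhids):
    """Return hhid values in rows that are absent from the household file."""
    return sorted({row["hhid"] for row in rows} - known_hhids)

hh_ids = {row["hhid"] for row in hh_rows}

# adult_id alone repeats across households; the (hhid, adult_id) pair does not.
print(duplicate_keys(adult_rows, ["adult_id"]))
print(duplicate_keys(adult_rows, ["hhid", "adult_id"]))
print(orphan_hhids(child_rows, hh_ids))
```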
The duration of data collection and sample size for each completed wave was:
Wave 1: May 14 to July 7, 2020; 4,061 Kenyan households
Wave 2: July 16 to September 18, 2020; 4,492 Kenyan households
Wave 3: September 28 to December 2, 2020; 4,979 Kenyan households
Wave 4: January 15 to March 25, 2021; 4,892 Kenyan households
Wave 5: March 29 to June 13, 2021; 5,854 Kenyan households
Wave 6: July 14 to November 3, 2021; 5,765 Kenyan households
Wave 7: November 15, 2021, to March 31, 2022; 5,633 Kenyan households
Wave 8: May 31 to July 8, 2022; 4,550 Kenyan households
The same questionnaire is also administered to refugees in Kenya, with the data available in the UNHCR microdata library: https://microdata.unhcr.org/index.php/catalog/296/
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The Arabic Sign Language (ASL) 20-Words Dataset v1 was carefully designed to reflect natural conditions, aiming to capture realistic signing environments and circumstances. Recognizing that nearly everyone has access to a smartphone with a camera as of 2020, the dataset was specifically recorded using mobile phones, aligning with how people commonly record videos in daily life. This approach ensures the dataset is grounded in real-world conditions, enhancing its applicability for practical use cases.
Each video in this dataset was recorded directly on the authors' smartphones, without any form of stabilization—neither hardware nor software. As a result, the videos vary in resolution and were captured across diverse locations, places, and backgrounds. This variability introduces natural noise and conditions, supporting the development of robust deep learning models capable of generalizing across environments.
In total, the dataset comprises 8,467 videos of 20 sign language words, contributed by 72 volunteers aged between 20 and 24. Each volunteer performed each sign a minimum of five times, resulting in approximately 100 videos per participant. This repetition standardizes the data and ensures each sign is adequately represented across different performers. The dataset’s mean video count per sign is 423.35, with a standard deviation of 18.58, highlighting the balance and consistency achieved across the signs.
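The summary figures above can be cross-checked with a few lines of arithmetic. The sketch uses only the totals stated in the paragraph; the per-volunteer average is derived here and is not a figure reported by the authors.

```python
total_videos = 8467
signs = 20
volunteers = 72
min_repetitions = 5

# Mean number of videos per sign, which should match the reported 423.35.
mean_per_sign = total_videos / signs

# Minimum videos per volunteer if every sign is performed five times.
min_per_volunteer = signs * min_repetitions

# Average videos actually contributed per volunteer (derived, not reported).
mean_per_volunteer = total_videos / volunteers

print(mean_per_sign)
print(min_per_volunteer)
print(round(mean_per_volunteer, 1))
```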
For reference, Table 2 (in the research article) provides the count of videos for each sign, while Figure 2 (in the research article) offers a visual summary of the statistics for each word in the dataset. Additionally, sample frames from each word are displayed in Figure 3 (in the research article), giving a glimpse of the visual content captured.
For in-depth insights into the methodology and the dataset's creation, see the research paper: Balaha, M.M., El-Kady, S., Balaha, H.M., et al. (2023). "A vision-based deep learning approach for independent-users Arabic sign language interpretation". Multimedia Tools and Applications, 82, 6807–6826. https://doi.org/10.1007/s11042-022-13423-9
Please consider citing the following if you use this dataset:
@misc{balaha_asl_2024_db,
title={ASL 20-Words Dataset v1},
url={https://www.kaggle.com/dsv/9783691},
DOI={10.34740/KAGGLE/DSV/9783691},
publisher={Kaggle},
author={Mostafa Magdy Balaha and Sara El-Kady and Hossam Magdy Balaha and Mohamed Salama and Eslam Emad and Muhammed Hassan and Mahmoud M. Saafan},
year={2024}
}
@article{balaha2023vision,
title={A vision-based deep learning approach for independent-users Arabic sign language interpretation},
author={Balaha, Mostafa Magdy and El-Kady, Sara and Balaha, Hossam Magdy and Salama, Mohamed and Emad, Eslam and Hassan, Muhammed and Saafan, Mahmoud M},
journal={Multimedia Tools and Applications},
volume={82},
number={5},
pages={6807--6826},
year={2023},
publisher={Springer}
}
This dataset is available under the CC BY-NC-SA 4.0 license, which allows for sharing and adaptation under conditions of non-commercial use, proper attribution, and distribution under the same license.
For further inquiries or information: https://hossambalaha.github.io/.
The phone survey was conducted to gather data on the socio-economic impacts of the COVID-19 crisis, as well as the Hunga Tonga-Hunga Ha'apai volcanic eruption and tsunami in Tonga. Round 2 interviewed 2,503 households in both urban and rural regions of the country from July 2022 to August 2022. Survey topics included employment and income, food security, coping strategies, access to health services, asset ownership, and preparedness. The purpose of the Round 2 survey was to continue tracking the impact of the crises after Round 1, which was conducted from April to May 2022. Additionally, besides household information, the Round 2 survey gathers individual-level data that were not included in Round 1. Two individual datasets explore adult employment and child education. While these findings are not without their caveats due to the lack of baseline data, constraints of the mobile phone survey methodology, and data quality constraints, they represent the best estimates to date and supplement other data on macroeconomic conditions, exports, firm-level information, etc. to develop an initial picture of the impacts of the crises on the population.
National urban and rural (5 islands): Tongatapu, Vava'u, Ha'apai, Eua, Ongo Niua
Household, Individual
All respondents must be at least 18 years of age to undertake the survey.
Sample survey data [ssd]
The Tonga HFPS Round 2 sample was generated in three ways. The first method is a Random Digit Dialing (RDD) process covering all cell telephone numbers active at the time of the sample selection. Approximately 16% of the sample was generated through RDD.
The RDD methodology generates virtually all possible telephone numbers in the country under the national telephone numbering plan and then draws a random sample of numbers. This method guarantees full coverage of the population with a phone.
First, a large first-phase sample of cell phone numbers was selected and screened through an automated process to identify the active numbers. Then, a smaller second-phase sample was selected from the active residential numbers identified in the first-phase sample and was delivered to the data collection team to be called by the interviewers. When a cell phone was called, the call answerer was interviewed as long as he or she was 18 years of age or above and knowledgeable about the household activities.
It was initially planned to stratify the sample by island group based on the phone number prefixes. However, this was not feasible given the high internal migration across islands and the atypical assignment of phone number prefixes across islands in Tonga. The sample is overrepresenting urban areas and the population of Tongatapu.
Approximately 56% of the Round 2 sample was made up of returning respondents from Round 1 who were recontacted.
The remaining 28% of the Round 2 respondents were drawn from Tonga's Household Income and Expenditure Survey (HIES).
Computer Assisted Telephone Interview [cati]
The questionnaire was developed in both English and Tongan. Sections of the Questionnaire: 1. Interview Information 2. Basic Information 3. Vaccine Information 4. Health 5. Education 6. Food Insecurity 7. Employment 8. Income 9. Coping Strategies 10. Assets 11. Digital 12. Recontact
At the end of data collection, the raw dataset was cleaned by the survey firm and the World Bank team. Data cleaning mainly included formatting, relabeling, and excluding survey monitoring variables (e.g., interview start and end times). Data was edited using the software STATA.
A total of 5,085 households were approached for the Round 2 survey, of which 2,503 completed the interview, a success rate of about 49%. More specifically, the response rate for recontacted Round 1 households was 60.8%, and the response rate for the RDD sample was 24%.
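The overall completion rate and the approximate size of each sample source follow directly from the figures quoted in this entry. The sketch below recomputes them. The per-source counts are rough, because the 16%, 56% and 28% shares are themselves stated as approximate, and the dictionary keys are labels chosen here.

```python
attempted = 5085
completed = 2503

# Overall completion rate among households approached.
overall_rate = completed / attempted

# Approximate composition of the completed sample, from the stated shares.
shares = {"rdd": 0.16, "round1_recontact": 0.56, "hies": 0.28}
approx_counts = {source: round(completed * share)
                 for source, share in shares.items()}

print(f"{overall_rate:.1%}")
print(approx_counts)
```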
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Mobile payments apps are used by more than two billion people globally, with millions more coming online each year. In India, South-east Asia and South America, the younger generation skipped the...
The COVID-19 pandemic is having significant short-term and long-term impacts on Burkinabe households’ welfare, affecting households through at least three broad channels: (i) the income/employment channel, which includes both labor and non-labor income, (ii) the price channel, and (iii) the long-term human capital channel. Most of these impacts are related to the direct health effect, but also to the containment measures that systematically altered socio-economic activities, resulting in a reduction of income across the board. Due to the urgent need for timely data and the limits on face-to-face surveys, the World Bank and the National Institute of Statistics and Demography (INSD) decided to implement a high-frequency phone survey of national households (HFPS) (https://microdata.worldbank.org/index.php/catalog/3768) to monitor the effects of COVID-19 on households, leveraging the household phone numbers available in the 2018/19 Enquête Harmonisée sur les Conditions de Vie des Ménages (EHCVM). In Burkina Faso, forcibly displaced persons (FDPs) include both refugees and internally displaced populations. For security-related reasons, FDPs are predominantly internally displaced people (IDPs). According to recent studies, the number of internally displaced people soared from 87,000 in January 2019 to over 1 million in August 2020, an increase of more than 1,000 percent (Conseil National de Secours d'Urgence et de Réhabilitation – CONASUR, 2020). These unprecedented levels of displacement occurred as the coronavirus pandemic worsened an already critical humanitarian crisis in the violence-stricken country. This critical situation calls for timely data and analysis on this vulnerable group, especially during a pandemic, in order to better inform policy and targeting programs.
Given the mutual interest of the INSD, the WB-UNHCR Joint Data Center on Forced Displacement (JDC), UNHCR, and the World Bank, the decision was made to further expand the sample of the high-frequency phone survey of national households (HFPS) to include IDPs for a total of three consecutive rounds. The core survey questionnaire of the Burkina Faso High Frequency Phone Survey on IDPs (BFA HFPS-IDP) is designed to cover important and relevant topics like employment, access to basic services and items, and non-labor sources of income. The core questionnaire is complemented by questions on selected topics that rotate each month, including knowledge of COVID-19 spread, social distancing and behavior, coping mechanisms to shocks, fragility, conflict and violence. Selected topics may be investigated in more detail in specific rounds.
The BFA HFPS-IDP is fielded alongside the Burkina Faso Covid-19 High Frequency Phone Survey of national households. Rounds 1, 2 and 3 of data collection for the HFPS-IDP occur simultaneously with round 9, 10 and 11 of the national HFPS operation, respectively.
The survey covers households from 9 of the 13 regions of Burkina Faso. These regions are: Boucle du Mouhoun, Cascades, Centre-Est, Centre-Nord, Est, Hauts-Bassins, Nord, Plateau Central, and Sahel.
Sample survey data [ssd]
The IDP sample is drawn from the CONASUR database, which serves as the sampling frame. The CONASUR database has been developed and supported by the government of Burkina Faso with the technical and financial support of development partners, including UNHCR, IOM and OCHA. It is updated regularly and contains an exhaustive list of refugees and IDPs, along with a few socio-demographic characteristics and the phone numbers of households. The sample is drawn from the 9 regions (out of 13) where the presence of IDPs is most relevant: Boucle du Mouhoun, Cascades, Centre-Est, Centre-Nord, Est, Hauts-Bassins, Nord, Plateau Central, Sahel. It is important to note that the BFA HFPS-IDP is representative of households that have access to phones. With that in mind, a key concern is the bias introduced by sampling only households with at least one phone number, as phone penetration in some regions/areas might be limited. However, according to data from the CONASUR database, the percentage of households with at least one phone number is very high, above 74% in all the sampled regions. To account for non-response and attrition, 1,500 households were selected in the baseline round of the HFS. 1,156 households were fully interviewed during the first round of interviews. This final successful sample was contacted in subsequent rounds of the survey.
Computer Assisted Telephone Interview [cati]
ROUND 1: Household Respondent’s information; Access to Basic Services; Employment and revenues; Food Security and Other revenues. ROUND 2: Household Respondent’s information; Knowledge regarding the spread of COVID-19; Behavior and social distancing; COVID-19 Testing and Vaccination; Access to Basic Services; Credit; Employment and revenue (with a focus on livestock activities); Food Security; Other revenues; Shocks; Concerns regarding the impact of COVID-19 on personal health and financial wealth of the household; Fragility, Conflict and Violence. ROUND 3: Household Respondent’s information; Early Child Development; Access to Basic Services; Employment and revenue (with a focus on agricultural activities); Food Security; Other revenues; Concerns regarding the current situation; Social Safety Nets. All the interview materials were translated into French for the INSD. The questionnaire was administered in local languages, with interviews of varying length (about 25 minutes).
At the end of data collection, the raw dataset was cleaned by the INSD with the support of the WB team. This included formatting, and correcting results based on monitoring issues, enumerator feedback and survey changes.
BASELINE (ROUND 1): All 1500 households were called in the baseline round of the phone survey. 73.75 percent of sampled households were successfully contacted. Of those contacted, 1,156 households were fully interviewed. These 1,156 households constitute the final successful sample and will be contacted in subsequent rounds of the survey.
ROUND 2: Interviewers attempted to contact and interview all 1,156 households that were successfully interviewed in the Round 1 of the BFA COVID-19 HFPS. 1,114 households (96.3% of the 1,156 attempted) were contacted and 1,112 (96.1%) were successfully interviewed in the second round. Of those contacted, 2 households did not answer due to a language barrier.
ROUND 3: Interviewers attempted to contact and interview all 1,112 households that were successfully interviewed in the Round 2 of the BFA COVID-19 HFPS. 1,051 households (94.53% of the 1,112 attempted) were contacted and 1,048 (94.24%) were successfully interviewed in the third round. Of those contacted, 1 household refused the interview and 2 were only partially interviewed.
RESPONDENTS: Each round of the Burkina Faso COVID-19 HFPS has ONE RESPONDENT per household. The respondent was the household head or a knowledgeable adult household member. The respondent must be a member of the household. Unlike many other household surveys, interviewers were not expected to seek out other household members to provide their own information. The respondent may still consult with other household members as needed to respond to the questions, including to provide all the necessary information on each household member.
Interviewers were instructed to make every effort to reach the same respondent in subsequent rounds of the survey, in order to maintain the consistency of the information collected. However, in cases where the previous respondent was not available, interviewers would identify another knowledgeable adult household member to interview.
Access to up-to-date socio-economic data is a widespread challenge in Vanuatu and other Pacific Island Countries. To increase data availability and promote evidence-based policymaking, the Pacific Observatory provides innovative solutions and data sources to complement existing survey data and analysis. One of these data sources is a series of High Frequency Phone Surveys (HFPS), which began in 2020 to monitor the socio-economic impacts of the COVID-19 Pandemic, and since 2023 has grown into a series of continuous surveys for socio-economic monitoring. See https://www.worldbank.org/en/country/pacificislands/brief/the-pacific-observatory for further details.
For Vanuatu, data for December 2023 to January 2025 were collected from approximately 1,000 households each month. The sample is representative of urban and rural areas but not at the province level. This dataset contains the combined monthly survey data for all months of the continuous HFPS in Vanuatu. There is one data file for household-level data with a unique household ID, and a separate file for individual-level data. The individual file can be matched to the household file using the household ID. It also contains an individual ID that is unique within each household and can be used to track individuals over time within households where the data are panel data.
National, urban and rural. Six provinces were covered by this survey: Sanma, Shefa, Torba, Penama, Malampa and Tafea.
Household and individuals.
Sample survey data [ssd]
The Vanuatu High Frequency Phone Survey (HFPS) sample is drawn from the list of customer phone numbers (MSISDNs) provided by Digicel Vanuatu, one of the country’s two main mobile providers. Digicel’s customer base spans all regions of Vanuatu. For the initial data collection, Digicel filtered its MSISDN database to ensure a representative distribution across regions. Recognizing the challenge of reaching low-income respondents, Digicel also included low-income areas and customers with a low-income profile (defined by monthly spending between 50 and 150 VT), as well as those with only incoming calls or using the IOU service without repayment. These filtered lists were then randomized, and enumerators began calling the numbers.
This approach was used to complete the first round of 1,000 interviews. The respondents from this first round formed a panel to be surveyed monthly. Each month, phone numbers from the panel are contacted until all have been interviewed, at which point new phone numbers (fresh MSISDNs from Digicel’s database) are used to replace those that have been exhausted. These new respondents are then added to the panel for future surveys.
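The panel-then-replenish procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `run_month`, the `answers` callback and the toy phone numbers are hypothetical, and the actual fieldwork protocol (number of call-back attempts, scheduling) is not described in this documentation.

```python
import random
from typing import Callable, List, Tuple


def run_month(panel: List[str], fresh_numbers: List[str], target: int,
              answers: Callable[[str], bool]) -> Tuple[List[str], List[str]]:
    """Call panel numbers first, then fresh numbers, until `target` interviews.

    Returns (completed, updated_panel). Fresh numbers that complete an
    interview are added to the panel for future months.
    """
    completed: List[str] = []
    updated_panel = list(panel)

    # Step 1: work through the existing panel.
    for number in panel:
        if len(completed) >= target:
            break
        if answers(number):
            completed.append(number)

    # Step 2: once the panel is exhausted, draw replacements from fresh numbers.
    for number in fresh_numbers:
        if len(completed) >= target:
            break
        if answers(number):
            completed.append(number)
            updated_panel.append(number)

    return completed, updated_panel


# Toy example (hypothetical data): a 5-number panel and a target of 6 interviews.
rng = random.Random(0)
panel = [f"P{i}" for i in range(5)]
fresh = [f"F{i}" for i in range(10)]
rng.shuffle(fresh)  # the filtered lists are described as randomized before calling
unreachable = {"P1", "P3"}  # assumed non-respondents for the illustration
completed, panel_next = run_month(panel, fresh, 6, lambda n: n not in unreachable)
print(len(completed), len(panel_next))
```

In this toy run, three panel members respond, so three fresh numbers are interviewed and join the panel for the following month.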
Computer Assisted Telephone Interview [cati]
The questionnaire was developed in both English and Bislama. Sections of the Questionnaire:
-Interview Information
-Household Roster (separate modules for new households and returning households)
-Labor (separate modules for new households and returning households)
-Food Security
-Household Income
-Agriculture
-Social Protection
-Access to Services
-Assets
-Perceptions
-Follow-up
At the end of data collection, the raw dataset was cleaned by the survey firm and the World Bank team. Data cleaning mainly included formatting, relabeling, and excluding survey monitoring variables (e.g., interview start and end times). Data were edited using Stata.
The data are presented in two datasets: a household dataset and an individual dataset. The total number of observations is 13,779 in the household dataset and 77,501 in the individual dataset. The individual dataset contains information on individual demographics and labor market outcomes of all household members aged 15 and above, and the household dataset contains information about household demographics, education, food security, household income, agriculture activities, social protection, access to services, and durable asset ownership. The household identifier (hhid) is available in both the household dataset and the individual dataset. The individual identifier (hhid_mem) can be found in the individual dataset.
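A minimal sketch of how the two datasets might be linked on the household identifier is given below. Only `hhid` and `hhid_mem` are names taken from this description; the other variables (`urban`, `age`) and all values are hypothetical toy data. The sketch also assumes one record per household; because the combined files cover several months, a survey-month variable would presumably need to be part of the key as well, and its name is not given here.

```python
from collections import defaultdict

# Toy records: only hhid and hhid_mem are identifiers named in the description.
households = [
    {"hhid": 101, "urban": 1},
    {"hhid": 102, "urban": 0},
]
individuals = [
    {"hhid": 101, "hhid_mem": 1, "age": 34},
    {"hhid": 101, "hhid_mem": 2, "age": 17},
    {"hhid": 102, "hhid_mem": 1, "age": 52},
]

# Index the household file by hhid for a many-to-one lookup.
household_by_id = {row["hhid"]: row for row in households}

# Attach household-level variables to each individual record.
merged = [
    {**household_by_id[person["hhid"]], **person}
    for person in individuals
    if person["hhid"] in household_by_id
]

# Count listed members per household as a simple consistency check.
members_per_household = defaultdict(int)
for person in individuals:
    members_per_household[person["hhid"]] += 1

print(merged[0])
print(dict(members_per_household))
```

The pair (`hhid`, `hhid_mem`) identifies a person within the individual dataset, which is what allows individuals to be followed across months.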
In November 2024, a total of 7,874 calls were made. Of these, 2,251 calls were successfully connected, and 1,000 respondents completed the survey. By February 2024, the sample consisted entirely of returning respondents, with a re-contact rate of 99.9 percent.
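For reference, the call figures reported above imply the following rates. This is simple arithmetic on the three published numbers; the rate names are informal labels chosen here, not terms used by the survey team.

```python
# Figures reported for November 2024.
calls_made = 7874
calls_connected = 2251
surveys_completed = 1000

# Share of all calls that connected.
connection_rate = calls_connected / calls_made
# Share of connected calls that ended in a completed survey.
completion_rate_connected = surveys_completed / calls_connected
# Completed surveys per call made.
completion_rate_overall = surveys_completed / calls_made

print(f"Connected:                {connection_rate:.1%}")
print(f"Completed (of connected): {completion_rate_connected:.1%}")
print(f"Completed (of all calls): {completion_rate_overall:.1%}")
```

Roughly eight calls were therefore needed for each completed interview in that month.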
The rapid and massive dissemination of mobile phones in the developing world is creating new opportunities for the discipline of survey research. The World Bank is interested in leveraging mobile phone technology as a means of direct communication with poor households in the developing world in order to gather rapid feedback on the impact of economic crises and other events on the economy of such households.
The World Bank commissioned Gallup to conduct the Listening to LAC (L2L) pilot program, a research project aimed at testing the feasibility of mobile phone technology as a means of data collection for conducting quick-turnaround, self-administered, longitudinal surveys among households in Peru and Honduras.
The project used face-to-face interviews as its benchmark, and included Short Message Service (SMS), Interactive Voice Response (IVR) and Computer Assisted Telephone Interviews (CATI) as test methods of data collection.
The pilot was designed in a way that allowed testing the response rates and the quality of data, while also providing information on the cost of collecting data using mobile phones. Researchers also evaluated whether providing incentives affected panel attrition rates. The Honduras design was a test-retest design, which is closely related to the difference-in-difference methodology of experimental evaluation.
A random, stratified, multistage sampling technique was used to select a nationally representative sample of 1,500 households. During the initial face-to-face interviews, researchers gathered information on the socio-economic characteristics of households and recruited participants for follow-up research. Question wording was the same in all modes of data collection.
In Honduras, after the initial face-to-face interviews, respondents were exposed to the remaining three methodologies according to a randomized scheme (three rotations, one methodology per week). Panelists in Honduras were surveyed for four and a half months, starting in February 2012.
In Peru, households were randomly assigned to a communication mode (SMS, IVR, CATI), which stayed constant for all rounds (waves) of the survey.
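The fixed mode assignment used in Peru could be sketched as below. This is an illustration only: the documentation does not say how the randomization was implemented or whether the three groups were of equal size, so the shuffle-and-deal approach, the function name `assign_modes` and the household IDs are all assumptions.

```python
import random
from collections import Counter

MODES = ["SMS", "IVR", "CATI"]


def assign_modes(household_ids, seed=0):
    """Randomly assign each household to one mode, kept for all waves.

    Assumes balanced groups: households are shuffled and then dealt
    to the three modes in turn.
    """
    rng = random.Random(seed)
    shuffled = list(household_ids)
    rng.shuffle(shuffled)
    return {hh: MODES[i % len(MODES)] for i, hh in enumerate(shuffled)}


# Hypothetical IDs for a 1,500-household panel.
assignment = assign_modes(range(1, 1501))
counts = Counter(assignment.values())
print(counts)
```

Because the assignment is stored once and reused, each household keeps the same mode in every wave, in contrast to the weekly rotation used in Honduras.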
Peru and Honduras - Includes the entire national territory, with the exception of neighborhoods where interviewer access is extremely difficult, due to a lack of transportation infrastructure or to situations that threaten the physical integrity of the interviewers and supervisors (e.g., an extremely high crime rate or warfare).
All the households that exist in the neighborhoods of Honduras, as reported by the 2001 Census. Institutions such as military, religious or educational living quarters are not included in the universe.
Sample survey data [ssd]
Honduras
Honduras did not have an income oversample because the poverty rate is 60 percent, so oversampling 20 percent above the poverty rate would include a large portion of the middle class, which is not the most vulnerable group in times of crisis.
The Honduras panel was built on a nationally representative sample of 1,500 households. The sample was drawn by means of a random, stratified, multistage design. The pilot used the Gallup World Poll sampling frame.
Census-defined municipalities were classified into five strata according to population size:
I. Municipalities with 500,000 to 999,000 inhabitants
II. Municipalities with 100,000 to 499,000 inhabitants
III. Municipalities with 50,000 to 99,000 inhabitants
IV. Municipalities with 10,000 to 49,000 inhabitants
V. Municipalities with fewer than 10,000 inhabitants
Interviews were then proportionally allocated to these five strata according to their share among the country's population.
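Proportional allocation of this kind can be illustrated with the short sketch below. The stratum populations are made-up round numbers, not Honduran census figures, and the largest-remainder rounding rule is an assumption; the documentation does not say how fractional allocations were resolved.

```python
import math


def allocate_proportionally(populations, total_interviews):
    """Allocate interviews to strata in proportion to population.

    Uses largest-remainder rounding so the allocation sums exactly
    to total_interviews (an assumed rounding rule).
    """
    total_pop = sum(populations.values())
    quotas = {s: total_interviews * p / total_pop for s, p in populations.items()}
    allocation = {s: math.floor(q) for s, q in quotas.items()}

    # Hand out the interviews lost to rounding, largest fractional part first.
    leftover = total_interviews - sum(allocation.values())
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - allocation[s], reverse=True)
    for stratum in by_remainder[:leftover]:
        allocation[stratum] += 1
    return allocation


# Hypothetical stratum populations, for illustration only.
strata_population = {
    "I": 1_000_000,
    "II": 1_500_000,
    "III": 900_000,
    "IV": 3_000_000,
    "V": 1_600_000,
}
allocation = allocate_proportionally(strata_population, 1500)
print(allocation, sum(allocation.values()))
```

Each stratum's share of the 1,500 interviews then mirrors its share of the total population, up to rounding.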
The first stage of the design consisted of a random selection of Primary Sampling Units (PSUs) within each of the five strata previously defined.
In the second stage, one or more Secondary Sampling Units (SSUs) were selected in each PSU.
Once SSUs were selected, interviewers were sent to the field to proceed with the third stage of the sample design, which consisted of selecting households using a systematic "random route" procedure. Interviewers started from the previously selected "random origin" and walked around the block in a clockwise direction, selecting every third household on their right-hand side. They were also trained to handle vacant, nonresponsive, and non-cooperative households, as well as other failed attempts, in a systematic manner.
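The "every third household" rule can be sketched as a simple systematic selection. The sketch assumes that the households along the route are available as an ordered list that loops around the block, and it does not model the handling of vacant or non-cooperative households; the function name and household labels are hypothetical.

```python
def random_route(households_on_route, origin_index, step=3):
    """Select every `step`-th household, counting from the random origin.

    households_on_route: households in clockwise walking order.
    origin_index: position on the route where the interviewer starts.
    """
    # Rotate the route so that counting starts at the origin.
    ordered = households_on_route[origin_index:] + households_on_route[:origin_index]
    # Take the 3rd, 6th, 9th, ... household encountered.
    return ordered[step - 1::step]


# Hypothetical block with 12 households and a starting point at position 4.
route = [f"H{i}" for i in range(12)]
selected = random_route(route, origin_index=4)
print(selected)  # ['H6', 'H9', 'H0', 'H3']
```

In practice the origin would itself be drawn at random before the interviewer goes to the field, as the description indicates.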
Peru
The Peru panel was built on a nationally representative sample of 1,500 households. The sample was based on the sampling frame for the National Household Survey (ENAHO) conducted by the Peruvian National Statistics Office (INEI) every three months.
In Peru, the sample selection was guided by the following criteria: (i) the sample should be representative nationally and in urban and rural areas, and (ii) households close to the poverty line should be oversampled, because policy decisions in times of crisis need to be especially mindful of the poor and vulnerable. For the purposes of this project, "close to the poverty line" was defined as the 40 percent of the consumption distribution that symmetrically bands the national poverty line: 20 percent above and 20 percent below. In 2010, monthly per capita consumption was below the moderate poverty line in 27 percent of Peruvian households (ENAHO). Households whose monthly per capita consumption fell between the 7th and 47th percentiles of the national distribution were therefore oversampled.
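The oversampling band follows directly from the poverty rate, as the small calculation below shows. The function name is hypothetical, and the Honduras line simply applies the same rule to the 60 percent poverty rate quoted above to show why no oversample was used there.

```python
def oversampling_band(poverty_rate_pct, half_width_pct=20):
    """Return the (lower, upper) percentiles of the consumption distribution
    that lie within half_width_pct of the poverty line."""
    lower = max(0, poverty_rate_pct - half_width_pct)
    upper = min(100, poverty_rate_pct + half_width_pct)
    return lower, upper


# Peru: 27 percent of households below the poverty line in 2010.
peru_band = oversampling_band(27)
print(peru_band)  # (7, 47)

# Honduras: the same rule applied to a 60 percent poverty rate.
honduras_band = oversampling_band(60)
print(honduras_band)  # (40, 80)
```

For Honduras, the band would extend to the 80th percentile, which is consistent with the stated concern about including a large portion of the middle class.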
The L2L sample frame comprises all the panel conglomerados from the fourth trimester of ENAHO 2010, or 281 conglomerados.
Detailed information about the sampling procedure is available in "Listening to LAC: Using Mobile Phones for High Frequency Data Collection, Final Report" (pp. 65-69) and "The World Bank Listening to LAC (L2L) Pilot Project Sample Design for Peru."
Other [oth]
The following survey instruments were used in the project:
1) Initial face-to-face questionnaire
In Peru, the starting point was the ENAHO (National Household Survey) questionnaire. Step-wise regressions were done to select the set of questions that best predicted consumption. For the purposes of robustness, the regressions were also done with questions that best predicted income, which yielded the same results. A similar procedure was done in Honduras, using the latest household survey deployed by the Honduran Statistics Institute, except that only best predictors of income were chosen, because Honduras did not have a recent consumption aggregate.
The survey gathered information on household demographics, household infrastructure, employment, remittances, income, accidents, food security, self-perceptions of poverty, Internet access and cellphone use.
2) Monthly questionnaires (SMS, IVR, CATI)
The questionnaires were worded exactly the same way, regardless of the mode, which meant short questions, since SMS is limited to 160 characters. A maximum of 10 questions had to be chosen for the monthly questionnaire. In addition, two questions sought to ensure the validity of the responses by testing if the respondent was a member of the household. Most questions were time-variant and each questionnaire was repeated to observe if answers changed over time. All questions related to variables that strongly affect household welfare and are likely to change in times of crisis.
3) Final face-to-face questionnaire
Gallup conducted face-to-face closing surveys among 700 panelists. The researchers asked about issues the respondents had with mobile phones and coverage during the test. Panelists were also asked what would motivate them to keep participating in a project like this in the future.
The questionnaires were worded exactly the same way, regardless of the mode, which meant short questions, since SMS is limited to 160 characters, unlike IVR and CATI.
In Honduras, 41% of recruited households failed to answer the first round of follow-up surveys. The attrition rate from the initial face-to-face interview to the end of the panel study was 50%.
In Peru, 67 percent of recruited households failed to answer the first round of follow-up surveys. Attrition slightly increased with each wave of the survey (between 1 and 3 percentage points per wave), reaching 75 percent in wave 6.
As part of the survey administration process, Gallup implemented a number of mechanisms to maximize the response rate and panelist retention. The following strategies were applied to respondents who did not reply the first time:
Also, in order to minimize non-response, three types of incentives were given. First, households that did not own a mobile phone were provided one for free. Approximately 127 phones were donated in Honduras, and approximately 200 phones were donated in Peru. Second, all communications between the interviewers and the households were free to the respondents. Finally, households were randomly assigned to one of three incentive levels: one-third of households received US$1 in free airtime for each questionnaire they answered, one-third received US$5 in free
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
The global number of smartphone users was forecast to increase continuously between 2024 and 2029 by a total of *** billion users (+***** percent). After the fifteenth consecutive year of growth, the smartphone user base is estimated to reach *** billion users, a new peak, in 2029. Notably, the number of smartphone users has increased continuously over the past years. Smartphone users here are limited to internet users of any age using a smartphone. The figures shown have been derived from survey data that have been processed to estimate missing demographics. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of smartphone users in regions like the Americas and Asia.