U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
The United States Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) area sampling frame is a delineation of all parcels of land for the purpose of later sampling the parcels. The area frame is constructed by visually interpreting satellite imagery to divide a state into homogeneous land use areas (strata) based on percent cultivated. The strata are typically defined as low, medium or high percent cultivated, non-agricultural land, urban use, agri-urban, or water. The boundaries of the strata usually follow identifiable features such as roads, railroads and waterways. The strata boundaries do not coincide with any political boundaries, with the exception of state boundaries. This site provides links to download ESRI shape and symbology layer files, as well as low resolution JPEG or higher resolution PDF images for each state. The FAQ also covers how to cite the data set, the time period, how geographic features are represented and described, originators and contributors, contacts for questions about the data, how the data set was created (previous works, e.g. USGS topographic quadrangles, US Census Bureau, space imagery, etc.), data generation, processing, and modification methods, and similar or related data. Applicable legal restrictions on access or use of the data and disclaimers are provided. Resources in this dataset: Resource Title: Land Use Strata - Selected States. File Name: Web Page, url: https://www.nass.usda.gov/Research_and_Science/stratafront2b.php
To better understand the impact of the shock induced by the COVID-19 pandemic on micro and small enterprises in Tunisia and assess the policy responses in a rapidly changing context, reliable data is imperative, and the need to resort to a dynamic data collection tool at a time when countries in the region are in a state of flux cannot be overstated. The COVID-19 MENA Monitor Survey was led by the Economic Research Forum (ERF) to provide data for researchers and policy makers on the economic and labor market impact of the global COVID-19 pandemic on enterprises.
The ERF COVID-19 MENA Monitor Survey is constructed from a series of short panel phone surveys conducted approximately every two months. It covers business closure (temporary/permanent) due to lockdowns, ability to telework/deliver the service, disruptions to supply chains (for inputs and outputs), loss of product markets, increased cost of supplies, worker layoffs, salary adjustments, access to lines of credit and delays in transportation. Understanding the strategies of enterprises (particularly micro and small enterprises) to cope with the crisis is one of the main objectives of this survey. Specific constraints such as weak access to the internet in some areas or laws constraining goods' delivery will be analyzed. Enterprise owners will also be asked about prospects for the future, including ability to stay open, and whether they benefited from any measures to support their businesses. The ERF COVID-19 MENA Monitor Survey is a wide-ranging, nationally representative panel survey. Wave 3 of this dataset was collected from August to September 2021, harmonized by the Economic Research Forum (ERF), and is presented here as enterprise-level data.
The harmonization was designed to create comparable data that facilitate cross-country and comparative research with other Arab countries (Morocco, Egypt, and Jordan). All the COVID-19 MENA Monitor surveys incorporate similar survey designs, with data on enterprises within Arab countries (Egypt, Jordan, Tunisia, and Morocco).
National
Enterprises
The sample universe for the enterprise survey was enterprises that had 6-199 workers pre-COVID-19
Sample survey data [ssd]
The sample universe for the firm survey was firms that had 6-199 workers pre-COVID-19. Stratified random samples were used to ensure adequate sample size in key strata. The target sample size was 500 firms. Up to five attempts were made to obtain a response if a phone number was not answered, was disconnected or busy, or was answered but the interview could not be completed at that time. After the fifth failed attempt, a firm was treated as a non-response and a random firm from the same stratum was used as an alternate.
The National Institute of Statistics (INS) and Agency for the Promotion of Industry and Innovation (APII) databases were used as follows:
- Tunisia did not have a Yellow Pages or similar database, so administrative/statistics data sources had to be used.
- The sample started with the INS frame with 1,238 enterprises with 6-200 wage employees.
  - Enterprises were stratified into: (1) Agriculture (2) Industry (3) Construction (4) Trade (5) Accommodation (6) Service.
  - Enterprises were also stratified by size in terms of 6-49 versus 50-200 employees.
  - A random stratified sample (order) was selected.
  - The sample was further restricted to enterprises with 6-199 workers in February 2020 based on an eligibility question during the phone interview.
  - This sample frame was eventually exhausted.
- After the INS sample was exhausted, the APII sample was used.
  - APII only covered enterprises with 10+ workers.
  - APII only covered (1) services & transport, and (2) industry.
- Weights are based on the underlying data on all enterprises from INS, specifically: Entreprises privées selon l'activité principale et la tranche de salariés (RNE 2019).
  - We ultimately stratify the Tunisia weights by industry and enterprise size: 6-9 employees (since APII only covered 10+), 10-49, and 50-199.
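The call-back and replacement rule described above (up to five attempts, then a random alternate from the same stratum) can be sketched as follows. This is only an illustration under stated assumptions, not the ERF fieldwork procedure: the function name, the frame layout, and the `reached` callback that stands in for a phone call are all hypothetical.

```python
import random

MAX_ATTEMPTS = 5  # from the text: non-response after the fifth failed attempt


def field_stratum(frame, target, reached, rng):
    """Try to complete `target` interviews within one stratum.

    frame   : list of firm ids in the stratum (hypothetical layout)
    reached : callable(firm_id, attempt) -> bool, stands in for a phone call
    rng     : random.Random instance, so the draw is reproducible
    Returns (completed, non_responses).
    """
    order = frame[:]      # copy so the caller's frame is left untouched
    rng.shuffle(order)    # random order within the stratum
    completed, non_responses = [], []
    for firm in order:
        if len(completed) == target:
            break
        # Call the firm up to MAX_ATTEMPTS times; stop at the first success.
        if any(reached(firm, attempt) for attempt in range(1, MAX_ATTEMPTS + 1)):
            completed.append(firm)
        else:
            # The next firm in the random order acts as the alternate.
            non_responses.append(firm)
    return completed, non_responses
```

If the stratum frame runs out before the target is met, the function simply returns fewer completed interviews, which mirrors the situation in which the INS frame was exhausted.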
Computer Assisted Telephone Interview [cati]
The enterprise questionnaire is designed to understand the strategies of enterprises, particularly micro and small enterprises, to cope with the crisis, as well as related constraints and prospects for the future. It includes questions on business closure (temporary/permanent) due to lockdowns, ability to telework/deliver the service, disruptions to supply chains (for inputs and outputs), loss of product markets, increased cost of supplies, worker layoffs, salary adjustments, access to lines of credit and delays in transportation.
Note: The questionnaire can be seen in the documentation materials tab.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set shows the labour force participation rate (LFPR) by urban and rural strata for all states in Malaysia. The statistics are derived from the Labour Force Survey (LFS), which is conducted every month using a household approach, and cover the years 1982 to 2020. LFPR is defined as the ratio of the labour force to the working-age population (15-64 years), expressed as a percentage. W.P. Labuan was gazetted as a Federal Territory in 1984, and W.P. Putrajaya in 2001. The statistics for W.P. Putrajaya for 2001-2010 are treated as part of Selangor. Separate statistics for W.P. Putrajaya are available from 2011 onwards. The LFS was not conducted in 1991 and 1994.
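As a small worked illustration of the LFPR definition, the following sketch computes the rate from two counts. The function name and the figures in the usage note are hypothetical and are not taken from the LFS.

```python
def lfpr(labour_force, working_age_population):
    """Labour force participation rate, in percent.

    Both arguments are counts of persons aged 15-64 in the same stratum.
    """
    if working_age_population <= 0:
        raise ValueError("working-age population must be positive")
    return 100.0 * labour_force / working_age_population
```

For example, a stratum with a labour force of 680 thousand and a working-age population of 1,000 thousand would have `lfpr(680, 1000)` equal to 68 percent.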
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set contains estimates of the base rates of 550 food safety-relevant food handling practices in European households. The data are representative for the population of private households in the ten European countries in which the SafeConsume Household Survey was conducted (Denmark, France, Germany, Greece, Hungary, Norway, Portugal, Romania, Spain, UK).
Sampling design
In each of the ten EU and EEA countries where the survey was conducted (Denmark, France, Germany, Greece, Hungary, Norway, Portugal, Romania, Spain, UK), the population under study was defined as the private households in the country. Sampling was based on a stratified random design, with the NUTS2 statistical regions of Europe and the education level of the target respondent as stratum variables. The target sample size was 1000 households per country, with selection probability within each country proportional to stratum size.
Fieldwork
The fieldwork was conducted between December 2018 and April 2019 in ten EU and EEA countries (Denmark, France, Germany, Greece, Hungary, Norway, Portugal, Romania, Spain, United Kingdom). The target respondent in each household was the person with main or shared responsibility for food shopping in the household. The fieldwork was sub-contracted to a professional research provider (Dynata, formerly Research Now SSI). Complete responses were obtained from altogether 9996 households.
Weights
In addition to the SafeConsume Household Survey data, population data from Eurostat (2019) were used to calculate weights. These were calculated with NUTS2 region as the stratification variable and assigned an influence to each observation in each stratum that was proportional to how many households in the population stratum a household in the sample stratum represented. The weights were used in the estimation of all base rates included in the data set.
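The weighting idea described above, in which each sampled household counts in proportion to the number of population households it represents in its NUTS2 region, can be sketched as below. This is a minimal illustration and not the SafeConsume code: the region codes, counts, and function names are hypothetical.

```python
from collections import Counter


def stratum_weights(population_households, sample_regions):
    """Weight per region = households in the population / households in the sample."""
    sample_counts = Counter(sample_regions)
    return {
        region: population_households[region] / n
        for region, n in sample_counts.items()
    }


def weighted_base_rate(values, regions, weights):
    """Weighted mean of values in [0, 1], using each observation's region weight."""
    total_weight = sum(weights[r] for r in regions)
    weighted_sum = sum(weights[r] * v for v, r in zip(values, regions))
    return weighted_sum / total_weight


# Hypothetical example: two regions, three sampled households.
population = {"R1": 1000, "R2": 3000}
regions = ["R1", "R1", "R2"]
practice = [1.0, 0.0, 1.0]  # normalised responses in [0, 1]

weights = stratum_weights(population, regions)
rate = weighted_base_rate(practice, regions, weights)
```

In this toy case the single household sampled in R2 stands for 3,000 households, so it dominates the estimate.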
Transformations
All survey variables were normalised to the [0,1] range before the analysis. Responses to food frequency questions were transformed into the proportion of all meals consumed during a year where the meal contained the respective food item. Responses to questions with 11-point Juster probability scales as the response format were transformed into numerical probabilities. Responses to questions with time (hours, days, weeks) or temperature (C) as response formats were discretised using supervised binning. The thresholds best separating between the bins were chosen on the basis of five-fold cross-validated decision trees. The binned versions of these variables, and all other input variables with multiple categorical response options (either with a check-all-that-apply or forced-choice response format) were transformed into sets of binary features, with a value 1 assigned if the respective response option had been checked, 0 otherwise.
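Two of the transformations above can be illustrated with a short sketch. It assumes min-max scaling for the [0,1] normalisation, which the description does not specify, and the option names are hypothetical.

```python
def min_max(values):
    """Rescale numeric values to [0, 1] (assumed min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable carries no spread; map it to 0.0 throughout.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def to_binary_features(checked, options):
    """One binary feature per response option: 1 if checked, 0 otherwise."""
    return {option: 1 if option in checked else 0 for option in options}
```

A check-all-that-apply answer such as `{"fridge"}` over the options `["fridge", "counter", "oven"]` becomes three binary features, with only the `fridge` feature set to 1.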
Treatment of missing values
In many cases, a missing value on a feature logically implies that the respective data point should have a value of zero. If, for example, a participant in the SafeConsume Household Survey had indicated that a particular food was not consumed in their household, the participant was not presented with any other questions related to that food, which automatically results in missing values on all features representing the responses to the skipped questions. However, zero consumption would also imply a zero probability that the respective food is consumed undercooked. In such cases, missing values were replaced with a value of 0.
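The logical-zero rule described above can be sketched as follows. The feature names and the record layout (a dictionary with `None` for missing values) are hypothetical. The sketch only fills follow-up features when the gating answer indicates that the food is not consumed.

```python
def fill_logical_zeros(record, gate, dependents):
    """Return a copy of `record` where skipped follow-ups become 0.0.

    gate       : name of the feature recording whether the food is consumed
    dependents : names of features that were only asked if the food is consumed
    """
    filled = dict(record)  # do not modify the caller's record
    if filled.get(gate) == 0:
        for name in dependents:
            if filled.get(name) is None:
                filled[name] = 0.0
    return filled
```

Missing values that are not explained by a skipped question are left untouched, since the rule does not cover them.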
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set shows the unemployment rate by urban and rural strata for all states in Malaysia. The statistics are derived from the Labour Force Survey (LFS), which is conducted every month using a household approach and refers to persons of working age (15-64 years).
Unemployment rate is the proportion of unemployed population to the total population in the labour force, which measures the percentage of unemployed population in the labour force.
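The definition above amounts to a single ratio. A minimal sketch follows, with a hypothetical function name and made-up figures in the usage note.

```python
def unemployment_rate(unemployed, labour_force):
    """Unemployed persons as a percentage of the labour force."""
    if labour_force <= 0:
        raise ValueError("labour force must be positive")
    if unemployed > labour_force:
        raise ValueError("unemployed cannot exceed the labour force")
    return 100.0 * unemployed / labour_force
```

For instance, 30 thousand unemployed persons in a labour force of 1,000 thousand gives `unemployment_rate(30, 1000)`, that is, 3 percent.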
W.P. Labuan was gazetted as a Federal Territory in 1984, and W.P. Putrajaya in 2001. The statistics for W.P. Putrajaya for 2001-2010 are treated as part of Selangor. Separate statistics for W.P. Putrajaya are available from 2011 onwards.
LFS was not conducted during the years 1991 and 1994.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
We used a stratified-random sampling approach to estimate the total abundance of Wedge-tailed Shearwater (Ardenna pacifica) nest sites across Kīlauea Point National Wildlife Refuge (KPNWR), Kauaʻi, during 1-7 July 2019. We first identified strata as unique geographic areas of the refuge to account for potential differences in nesting habitat and non-uniform nest site clustering. We then sub-divided strata where we expected high, low, minimal, or no nest site abundance. These distinctions were based on knowledge of shearwater nesting distribution gained while performing extensive ground-searching for tropicbirds across the entire refuge in April and May 2019. We delineated strata boundaries using recent satellite imagery in ArcGIS (version 10.7) and, based on direct observations in the field, refined them to remove large contiguous areas lacking shearwater presence or nesting habitat. Planar area of each polygon was automatically calculated by ArcGIS. To calculate surface area ...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set shows the number of unemployed persons by urban and rural strata for all states in Malaysia from 1982 to 2020. The statistics are derived from the Labour Force Survey (LFS), which is conducted every month using a household approach and refers to persons of working age (15-64 years). The unemployed are classified into two groups: the actively unemployed and the inactively unemployed. The actively unemployed include all persons who did not work during the reference week but were available for work and were actively looking for work during the reference week. Inactively unemployed persons include the following categories: a. persons who did not look for work because they believed no work was available or that they were not qualified; b. persons who would have looked for work if they had not been temporarily ill or had it not been for bad weather; c. persons who were waiting for the results of job applications; and d. persons who had looked for work prior to the reference week. W.P. Labuan was gazetted as a Federal Territory in 1984, and W.P. Putrajaya in 2001. The statistics for W.P. Putrajaya for 2001-2010 are treated as part of Selangor. Separate statistics for W.P. Putrajaya are available from 2011 onwards. The LFS was not conducted in 1991 and 1994. More info: https://www.dosm.gov.my
The objective of this three-year panel survey is to provide the Government of Nepal with empirical evidence on the patterns of exposure to shocks at the household level and on the vulnerability of households’ welfare to these shocks. It covers 6,000 households in non-metropolitan areas of Nepal, which were interviewed in mid 2016. Being a relatively comprehensive and representative (rural) sample household survey, it can also be used for other research into living conditions of Nepali households in rural areas. This is the entire dataset for the first wave of the survey. The same households will be reinterviewed in mid 2017 and mid 2018. The survey dataset contains a multi-topic survey which was completed for each of the 6,000 households, and a community survey fielded to a senior community representative at the village development committee (VDC) level in each of the 400 PSUs.
All non-metropolitan areas in Nepal. Non-metropolitan areas are as defined by the 2010 Census.
Household, following the NLSS definition.
Sample survey data [ssd]
The sample frame was all households in non-metropolitan areas per the 2010 Census definition, excluding households in the Kathmandu valley (Kathmandu, Lalitpur and Bhaktapur districts). The country was segmented into 11 analytical strata, defined to correspond to those used in the NLSS III (excluding the three urban strata used there). To increase the concentration of sampled households, 50 of the 75 districts in Nepal were selected with probability proportional to size (the measure of size being the number of households). PSUs were selected with probability proportional to size from the entire list of wards in the 50 selected districts, one stratum at a time. The number of PSUs per stratum is proportional to the stratum's population share, and corresponds closely to the allocations used in the LFS-II and NLSS-III (adjusted for different overall numbers of PSUs in those surveys).
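Selection with probability proportional to size (PPS) can be sketched with the systematic method shown below. The description does not say which PPS procedure was used, so the systematic approach, the function name, and the toy sizes are assumptions made for illustration only.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    sizes : dict mapping unit name -> measure of size (e.g. number of households)
    rng   : random.Random instance, so the draw is reproducible
    A unit larger than the sampling interval can be selected more than once.
    """
    names = list(sizes)
    cumulative = list(accumulate(sizes[name] for name in names))
    step = cumulative[-1] / n          # sampling interval
    start = rng.uniform(0, step)       # random start within the first interval
    picks, i = [], 0
    for j in range(n):
        point = start + j * step
        # Advance to the first unit whose cumulative size reaches the point.
        while cumulative[i] < point and i < len(names) - 1:
            i += 1
        picks.append(names[i])
    return picks
```

With sizes of 100, 300 and 600 households and two selections, the interval is 500, so the largest unit is always picked at least once, while the two smaller units compete for the other slot in proportion to their size.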
In each of the selected PSUs (administrative wards), survey teams compiled a list of households in the ward based on existing administrative records, and cross-checked with local leaders. The number of households shown in the list was compared to the ward population in the 2010 Census, adjusted for likely population growth. Where the listed population deviated by more than 10% from the projected population based on the Census data, the team conducted a full listing of households in the ward. 15 households were selected at random from the ward list for interviewing, and a further 5 households were selected as potential replacements.
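The listing check and the household draw described above can be sketched as follows. The function names and the list layout are hypothetical. The 10% tolerance, the 15 interview households and the 5 replacements come from the text.

```python
import random


def needs_full_listing(listed, projected, tolerance=0.10):
    """True if the listed count deviates from the projection by more than 10%."""
    return abs(listed - projected) > tolerance * projected


def draw_households(ward_list, rng, n_main=15, n_reserve=5):
    """Draw interview households and an ordered reserve list without overlap."""
    picked = rng.sample(ward_list, n_main + n_reserve)
    return picked[:n_main], picked[n_main:]
```

Drawing both groups in one call to `sample` guarantees that no household appears in both the main list and the replacement list.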
During the fieldwork, one PSU in Lapu VDC was inaccessible due to weather, and was replaced by a ward in Hastichaur VDC using PPS sampling on that stratum (excluding the already selected PSUs). All other sampled PSUs were reached, and a full sample of 6,000 households was interviewed in the first wave.
Computer Assisted Personal Interview [capi]
The household questionnaire contained 16 modules: the household roster; education; health; housing and access to facilities; food expenses and home production; non-food expenditures and inventory of durable goods; jobs and time use; wage jobs; farming and livestock; non-agriculture enterprises/activities; migration; credit, savings, and financial assets; private assistance; public assistance; shocks; and anthropometrics (for children less than 5 years). Where possible, the style of questions was kept similar to those used in the NLSS-III questionnaire for comparability reasons. In some cases, new modules needed to be developed. The shocks questionnaire was developed by the World Bank team. A food security module was added based on the design recommended by USAID, and a psychosocial questionnaire was also developed by social development specialists in the World Bank. The section on government and other assistance was also redesigned to cover a broader range of programs and elicit information on details such as experience with enrollment and frequency of payment.
The community questionnaire was fielded to a senior community representative at the VDC level in each of the 400 PSUs. The purpose of the community questionnaire was to obtain further details on access to services in each PSU, to gather information on shocks at the community level, and to collect market price data. The questionnaire had six modules: respondent details; community characteristics; access to facilities; educational facilities; community shocks, household shocks; and market price.
These are the raw data entered and checked by the survey firm, formatted to conform to the original questionnaire numbering system and confidentialized. The data were cleaned for spelling errors and translation of Nepali phrases, and suspicious values were checked by calling respondents. No other transformations have taken place.
Of the 6,000 originally sampled households, 5,191 agreed to be interviewed. Of the 13.5% of households that were not interviewed, 11.1% were resident but could not be located by the team after two attempts, 0.9% were found to have outmigrated, and 1.4% refused. The 809 replacement households were drawn in order from the randomized list created during sampling (see above).
A trawl survey is conducted each year at Heard Island and McDonald Islands (HIMI) to assess the abundance and biology of fish and invertebrate species. The survey provides information for input into the stock assessments for the two main fished species, Patagonian toothfish (Dissostichus eleginoides), and mackerel icefish (Champsocephalus gunnari). In addition, it provides information on biodiversity and bycatch species from the fishery. Nine strata were defined as areas for sampling during the annual Random Stratified Trawl Survey (RSTS) conducted on board an industry vessel. The area of the plateau down to 1000 metres depth was divided into nine strata, each covering an area of similar depth and/or fish abundance. A number of randomly allocated stations (between 10 and 30) are sampled in each stratum during every survey to assess the abundance of juvenile and adult toothfish on the shallow and deep parts of the Heard Island Plateau (300 to 1000 metres depth) and to assess the abundance of mackerel icefish on the Heard Island Plateau. Although the number and boundaries of strata have been adjusted over the years, they have been consistent since 2002 (Welsford et al. 2006). This dataset consists of a polygon shapefile representing the strata and a map displaying the strata.
This dataset provides information about the number of properties, residents, and average property values for Stratum Way cross streets in Atlanta, GA.
This dataset provides information about the number of properties, residents, and average property values for Stratum Way cross streets in Hampton, VA.
The dataset is a relational dataset of 8,000 households, representing a sample of the population of an imaginary middle-income country. The dataset contains two data files: one with variables at the household level, the other with variables at the individual level. It includes variables that are typically collected in population censuses (demography, education, occupation, dwelling characteristics, fertility, mortality, and migration) and in household surveys (household expenditure, anthropometric data for children, assets ownership). The data only includes ordinary households (no community households). The dataset was created using REaLTabFormer, a model that leverages deep learning methods. The dataset was created for the purpose of training and simulation and is not intended to be representative of any specific country.
The full-population dataset (with about 10 million individuals) is also distributed as open data.
The dataset is a synthetic dataset for an imaginary country. It was created to represent the population of this country by province (equivalent to admin1) and by urban/rural areas of residence.
Household, Individual
The dataset is a fully-synthetic dataset representative of the resident population of ordinary households for an imaginary middle-income country.
ssd
The sample size was set to 8,000 households. The fixed number of households to be selected from each enumeration area was set to 25. In a first stage, the number of enumeration areas to be selected in each stratum was calculated, proportional to the size of each stratum (stratification by geo_1 and urban/rural). Then 25 households were randomly selected within each enumeration area. The R script used to draw the sample is provided as an external resource.
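The first-stage allocation described above can be sketched as follows: 8,000 households at 25 per enumeration area means 320 enumeration areas to distribute across strata in proportion to size. The sketch is in Python rather than R and is not the script distributed with the dataset. The largest-remainder rounding and the stratum labels are assumptions.

```python
def allocate_eas(stratum_sizes, total_households=8000, per_ea=25):
    """Allocate enumeration areas to strata in proportion to stratum size.

    stratum_sizes : dict mapping stratum label -> size (e.g. number of households)
    Rounding uses the largest-remainder rule so the allocation sums exactly.
    """
    n_eas = total_households // per_ea  # 320 with the defaults
    total = sum(stratum_sizes.values())
    quotas = {s: n_eas * size / total for s, size in stratum_sizes.items()}
    alloc = {s: int(q) for s, q in quotas.items()}  # round down first
    leftover = n_eas - sum(alloc.values())
    # Hand the remaining areas to the strata with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc
```

Multiplying each stratum's allocation by 25 gives the number of sampled households in that stratum.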
other
The dataset is a synthetic dataset. Although the variables it contains are variables typically collected from sample surveys or population censuses, no questionnaire is available for this dataset. A "fake" questionnaire was however created for the sample dataset extracted from this dataset, to be used as training material.
The synthetic data generation process included a set of "validators" (consistency checks, based on which synthetic observations were assessed and rejected/replaced when needed). Some post-processing was also applied to produce the distributed data files.
This is a synthetic dataset; the "response rate" is 100%.
In 1992, Bosnia-Herzegovina, one of the six republics in former Yugoslavia, became an independent nation. A civil war started soon thereafter, lasting until 1995 and causing widespread destruction and loss of life. Following the Dayton accord, Bosnia-Herzegovina (BiH) emerged as an independent state comprised of two entities, namely, the Federation of Bosnia-Herzegovina (FBiH) and the Republika Srpska (RS), and the district of Brcko. In addition to the destruction caused to the physical infrastructure, there was considerable social disruption and decline in living standards for a large section of the population. Alongside these events, a period of economic transition to a market economy was occurring. The distributive impacts of this transition, both positive and negative, are unknown. In short, while it is clear that welfare levels have changed, there is very little information on poverty and social indicators on which to base policies and programs.
In the post-war process of rebuilding the economic and social base of the country, the government has faced the problems created by having little relevant data at the household level. The three statistical organizations in the country (the State Agency for Statistics for BiH (BHAS), the RS Institute of Statistics (RSIS), and the FBiH Institute of Statistics (FIS)) have been active in working to improve the data available to policy makers, both at the macro and the household level. One facet of their activities is to design and implement a series of household surveys. The first of these surveys is the Living Standards Measurement Study survey (LSMS). Later surveys will include the Household Budget Survey (an Income and Expenditure Survey) and a Labor Force Survey. A subset of the LSMS households will be re-interviewed in the two years following the LSMS to create a panel data set.
The three statistical organizations began work on the design of the Living Standards Measurement Study Survey (LSMS) in 1999. The purpose of the survey was to collect data needed for assessing the living standards of the population and for providing the key indicators needed for social and economic policy formulation. The survey was to provide data at the country and the entity level and to allow valid comparisons between entities to be made.
The LSMS survey was carried out in the Fall of 2001 by the three statistical organizations with financial and technical support from the Department for International Development of the British Government (DfID), United Nations Development Program (UNDP), the Japanese Government, and the World Bank (WB). The creation of a Master Sample for the survey was supported by the Swedish Government through SIDA, the European Commission, the Department for International Development of the British Government and the World Bank.
The overall management of the project was carried out by the Steering Board, comprised of the Directors of the RS and FBiH Statistical Institutes, the Management Board of the State Agency for Statistics and representatives from DfID, UNDP and the WB. The day-to-day project activities were carried out by the Survey Management Team, made up of two professionals from each of the three statistical organizations.
The Living Standards Measurement Survey (LSMS), in addition to collecting the information necessary to obtain as comprehensive a measure as possible of the basic dimensions of household living standards, has three basic objectives, as follows:
To provide the public sector, government, the business community, scientific institutions, international donor organizations and social organizations with information on different indicators of the population’s living conditions, as well as on available resources for satisfying basic needs.
To provide information for the evaluation of the results of different forms of government policy and programs developed with the aim to improve the population’s living standard. The survey will enable the analysis of the relations between and among different aspects of living standards (housing, consumption, education, health, labor) at a given time, as well as within a household.
To provide key contributions for development of government’s Poverty Reduction Strategy Paper, based on analyzed data.
National coverage. Domains: Urban/rural/mixed; Federation; Republic
Sample survey data [ssd]
A total sample of 5,400 households was determined to be adequate for the needs of the survey, with 2,400 in the Republika Srpska and 3,000 in the Federation of BiH. The difficulty was in selecting a probability sample that would be representative of the country's population. The sample design for any survey depends upon the availability of information on the universe of households and individuals in the country. Usually this comes from a census or administrative records. In the case of BiH the most recent census was done in 1991. The data from this census were rendered obsolete by the simple passage of time and, more importantly, by the massive population displacements that occurred during the war.
At the initial stages of this project it was decided that a master sample should be constructed. Experts from Statistics Sweden developed the plan for the master sample and provided the procedures for its construction. From this master sample, the households for the LSMS were selected.
Master Sample [This section is based on Peter Lynn's note "LSMS Sample Design and Weighting - Summary". April, 2002. Essex University, commissioned by DfID.]
The master sample is based on a selection of municipalities and a full enumeration of the selected municipalities. Optimally, one would prefer smaller units (geographic or administrative) than municipalities. However, while it was considered that the population estimates of municipalities were reasonably accurate, this was not the case for smaller geographic or administrative areas. To avoid the error involved in sampling smaller areas with very uncertain population estimates, municipalities were used as the base unit for the master sample.
The Statistics Sweden team proposed two options based on this same method, differing only in the number of municipalities included and enumerated. For funding reasons, the smaller of the two, Option B, was used.
Stratification of Municipalities
The first step in creating the Master Sample was to group the 146 municipalities in the country into three strata- Urban, Rural and Mixed - within each of the two entities. Urban municipalities are those where 65 percent or more of the households are considered to be urban, and rural municipalities are those where the proportion of urban households is below 35 percent. The remaining municipalities were classified as Mixed (Urban and Rural) Municipalities. Brcko was excluded from the sampling frame.
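The threshold rule stated above can be written as a small function. The function name and the use of a share between 0 and 1 are illustrative assumptions. The 65 and 35 percent cut-offs come from the text.

```python
def classify_municipality(urban_share):
    """Classify a municipality from its share of urban households (0 to 1)."""
    if not 0.0 <= urban_share <= 1.0:
        raise ValueError("urban_share must be between 0 and 1")
    if urban_share >= 0.65:
        return "Urban"
    if urban_share < 0.35:
        return "Rural"
    return "Mixed"
```

A municipality with exactly 35 percent urban households falls in the Mixed stratum under this reading, because the rural condition is strictly below 35 percent.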
Urban, Rural and Mixed Municipalities: It is worth noting that the urban-rural definitions used in BiH are unusual with such large administrative units as municipalities classified as if they were completely homogeneous. Their classification into urban, rural, mixed comes from the 1991 Census which used the predominant type of income of households in the municipality to define the municipality. This definition is imperfect in two ways. First, the distribution of income sources may have changed dramatically from the pre-war times: populations have shifted, large industries have closed and much agricultural land remains unusable due to the presence of land mines. Second, the definition is not comparable to other countries' where villages, towns and cities are classified by population size into rural or urban or by types of services and infrastructure available. Clearly, the types of communities within a municipality vary substantially in terms of both population and infrastructure.
However, these imperfections are not detrimental to the sample design (the urban/rural definition may not be very useful for analysis purposes, but that is a separate issue). [Note: The percent of LSMS households in each stratum reporting using agricultural land or having livestock is highest in the "rural" municipalities and lowest in the "urban" municipalities. However, the concentration of agricultural households is higher in RS, so the municipality types are not comparable across entities. The percent reporting no land or livestock in RS was 74.7% in "urban" municipalities, 43.4% in "mixed" municipalities and 31.2% in "rural" municipalities. Respective figures for FBiH were 88.7%, 60.4% and 40.0%.]
The classification is used simply for stratification. The stratification is likely to have some small impact on the variance of survey estimates, but it does not introduce any bias.
Selection of Municipalities
Option B of the Master Sample involved sampling municipalities independently from each of the six strata described in the previous section. Municipalities were selected with probability proportional to estimated population size (PPES) within each stratum, so as to select approximately 50% of the mostly urban municipalities, 20% of the mixed and 10% of the mostly rural ones. Overall, 25 municipalities were selected (out of 146) with 14 in the FbiH and 11 in the RS. The distribution of selected municipalities over the sampling strata is shown below.
Stratum / Total municipalities Mi / Sampled municipalities mi
1. Federation, mostly urban / 10 / 5
2. Federation, mostly mixed / 26 / 4
3. Federation, mostly rural / 48 / 5
4. RS, mostly urban / 4 / 2
5. RS, mostly mixed / 29 / 5
6. RS, mostly rural / 29 / 4
Note: Mi is the total number of municipalities in stratum i (i=1, … , 6); mi is the number of municipalities selected from stratum i.
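As a quick check on the table above, the realized sampling fraction in each stratum can be computed and compared with the approximate targets of 50%, 20% and 10%. The sketch below only re-uses the counts from the table; the variable and function names are illustrative.

```python
# (stratum, total municipalities M_i, sampled municipalities m_i), copied from the table
STRATA = [
    ("Federation, mostly urban", 10, 5),
    ("Federation, mostly mixed", 26, 4),
    ("Federation, mostly rural", 48, 5),
    ("RS, mostly urban", 4, 2),
    ("RS, mostly mixed", 29, 5),
    ("RS, mostly rural", 29, 4),
]


def sampling_fractions(strata):
    """Return {stratum name: m_i / M_i} for each stratum."""
    return {name: sampled / total for name, total, sampled in strata}


total_frame = sum(total for _, total, _ in STRATA)
total_sampled = sum(sampled for _, _, sampled in STRATA)
fbih_sampled = sum(m for name, _, m in STRATA if name.startswith("Federation"))
rs_sampled = sum(m for name, _, m in STRATA if name.startswith("RS"))

if __name__ == "__main__":
    for name, fraction in sampling_fractions(STRATA).items():
        print(f"{name:26s} {fraction:.1%}")
    print("frame:", total_frame, "sampled:", total_sampled,
          "FBiH:", fbih_sampled, "RS:", rs_sampled)
```

The totals reproduce the figures in the text (146 municipalities in the frame, 25 selected, 14 in the Federation and 11 in the RS), and the realized fractions sit close to, but not exactly on, the stated targets.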
This dataset contains a set of data files used as input for a World Bank research project (an empirical comparative assessment of machine learning algorithms applied to poverty prediction). The objective of the project was to compare the performance of a series of classification algorithms. The dataset contains variables at the household, individual, and community levels. The variables selected to serve as potential predictors in the machine learning models are all qualitative variables (except for the household size). Information on household consumption is included, but in the form of dummy variables (indicating whether or not the household consumed each specific product or service listed in the survey questionnaire). The household-level data file contains the variable "Poor / Non poor", which served as the predicted variable ("label") in the models.
One of the data files included in the dataset contains data on household consumption (amounts) by main categories of products and services. This data file was not used in the prediction model. It is used only for the purpose of analyzing the models' misclassifications (in particular, to identify how far the misclassified households are from the national poverty line).
These datasets are provided to allow interested users to replicate the analysis done for the project using Python 3 (a collection of Jupyter Notebooks containing the documented scripts is openly available on GitHub).
National
Sample survey data [ssd]
The IHS3 sampling frame is based on the listing information and cartography from the 2008 Malawi Population and Housing Census (PHC); includes the three major regions of Malawi, namely North, Center and South; and is stratified into rural and urban strata. The urban strata include the four major urban areas: Lilongwe City, Blantyre City, Mzuzu City, and the Municipality of Zomba. All other areas are considered as rural areas, and each of the 27 districts was considered as a separate sub-stratum as part of the main rural stratum. It was decided to exclude the island district of Likoma from the IHS3 sampling frame, since it only represents about 0.1% of the population of Malawi, and the corresponding cost of enumeration would be relatively high. The sampling frame further excludes the population living in institutions, such as hospitals, prisons and military barracks. Hence, the IHS3 strata are composed of 31 districts in Malawi.
A stratified two-stage sample design was used for the IHS3.
Face-to-face [f2f]
The survey data were collected using four questionnaires: 1) Household Questionnaire 2) Agriculture Questionnaire 3) Fishery Questionnaire 4) Community Questionnaire
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set shows the size of the labour force by urban and rural strata for all states in Malaysia from 1982 to 2021. The statistics are derived from the Labour Force Survey (LFS), which is conducted every month using a household approach. Labour force refers to those who, during the reference week of the LFS, are in the 15-64 years age group and who are either employed or unemployed. W.P. Labuan was gazetted as a Federal Territory in 1984, while W.P. Putrajaya was gazetted as a Federal Territory in 2001. The statistics for W.P. Putrajaya for 2001-2010 are treated as part of Selangor. Statistics for W.P. Putrajaya are available separately from 2011 onwards. The LFS was not conducted in 1991 and 1994. Working less than 30 hours is only available for Malaysia. More info: https://www.dosm.gov.my
Reflectance confocal microscopy (RCM) is a powerful tool for in-vivo examination of a variety of skin diseases. However, current use of RCM depends on qualitative examination by a human expert to look for specific features in the different strata of the skin. Developing approaches to quantify features in RCM imagery requires an automated understanding of which anatomical stratum is present in a given en-face section. This work presents an automated approach that uses a bag-of-features representation of en-face sections and a logistic regression classifier to classify sections into one of four classes (stratum corneum, viable epidermis, dermal-epidermal junction and papillary dermis). This approach was developed and tested using a dataset of 308 depth stacks from 54 volunteers in two age groups (20–30 and 50–70 years of age). The classification accuracy on the test set was 85.6%. The mean absolute error in determining the interface depth for each of the stratum corneum/viable epidermis, vi...
The purpose of the SNF study was to improve our understanding of the relationship between remotely sensed observations and important biophysical parameters in the boreal forest. A key element of the experiment was the development of methodologies to measure forest stand characteristics to determine values of importance to both remote sensing and ecology. Parameters studied were biomass, leaf area index, above ground net primary productivity, bark area index and ground coverage by vegetation. Thirty two quaking aspen and thirty one black spruce sites were studied. Use of multiple plots within each site allowed estimation of the importance of spatial variation in stand parameters. Within each plot, all woody stems greater than two meters in height were recorded by species and relevant dimensions were measured. Diameter breast height (dbh) was measured directly. Height of the tree and height of the first live branch were determined by triangulation. The difference between these two heights was used as the depth of crown. Similar measurements were made for shrubs between one and two meters tall in the aspen sites. The Forest Canopy Composition (SNF) data set provides the counts of canopy (over two meters tall) tree species and subcanopy (between one and two meters tall) tree species. Also related, for the aspen sites, in each plot a visual estimation of the percent coverage of the canopy, subcanopy and understory vegetation was made. The site averages of these coverage estimates are presented in the Aspen Forest Cover by Stratum/Plot (SNF) data set.
The NLSS 1995/96 is basically limited to the living standards of households.
The basic objectives of this survey were to provide the information required for monitoring progress in improving national living standards and to evaluate the impact of various government policies and programs on the living conditions of the population. The survey captured a comprehensive set of data on different aspects of household welfare, such as consumption, income, housing, labour markets, education and health.
National coverage
The 4 strata of the survey:
- Mountains
- Hills (Urban)
- Hills (Rural)
- Terai
The survey covered all modified de jure household members (usual residents).
Sample survey data [ssd]
Sample Design
Sample Frame: A complete list of all wards in the country, with a measure of size, was developed in order to select from it with Probability Proportional to Size (PPS) the sample of wards to be visited. The 1991 Population Census of Nepal was the best starting point for building such a sample frame. The Central Bureau of Statistics (CBS) constructed a data set with basic information from the census at the ward level. This data set was used as a sample frame to develop the NLSS sample.
Sample Design: The sample size for the NLSS was set at 3,388 households. This sample was divided into four strata based on the geographic and ecological regions of the country: (i) mountains, (ii) urban Hills, (iii) rural Hills, and (iv) Terai.
The sample size was designed to provide enough observations within each ecological stratum to ensure adequate statistical accuracy, as well as enough variation in key variables for policy analysis within each stratum, while respecting resource constraints and the need to balance sampling and non-sampling errors.
A two-stage stratified sampling procedure was used to select the sample for the NLSS. The primary sampling unit (PSU) is the ward, the smallest administrative unit in the 1991 Population Census. In order to increase the variability of the sample, it was decided that a small number of households (twelve) would be interviewed in each ward. Thus, a total of 275 wards was obtained.
In the first stage of the sampling, wards were selected with probability proportional to size (PPS) from each of the four ecological strata, using the number of households in the ward as the measure of size. In order to give the sample an implicit stratification respecting the division of the country into Development Regions, the sample frame was sorted by ascending order of district codes, and these were numbered from East to West. The sample frame considered all the 75 districts in the country, and indeed 73 of them were represented in the sample. In the second stage of the sampling, a fixed number of households were chosen with equal probabilities from each selected PSU.
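The first-stage selection can be illustrated with a minimal sketch of systematic PPS sampling on a frame sorted by district code. Systematic selection is an assumption here: the source states PPS selection and a sorted frame but does not name the selection algorithm. The function name, the tuple layout and the example frame are all hypothetical.

```python
import random
from itertools import accumulate


def systematic_pps(frame, n, rng=None):
    """Select n wards with probability proportional to size.

    frame: list of (ward_id, district_code, n_households) tuples.
    Sorting by district code before selection spreads the sample across
    districts, which is what gives the implicit stratification.
    A ward larger than the sampling interval can be selected more than once.
    """
    rng = rng or random.Random()
    ordered = sorted(frame, key=lambda ward: ward[1])
    cumulative = list(accumulate(ward[2] for ward in ordered))
    interval = cumulative[-1] / n
    start = rng.uniform(0, interval)  # random start within the first interval
    selected, idx = [], 0
    for k in range(n):
        point = start + k * interval
        # advance to the ward whose cumulative size range contains the point
        while idx < len(cumulative) - 1 and cumulative[idx] <= point:
            idx += 1
        selected.append(ordered[idx][0])
    return selected


if __name__ == "__main__":
    # Hypothetical frame: 10 wards in 5 districts, 10 households each.
    frame = [(f"ward{i}", i // 2, 10) for i in range(10)]
    print(systematic_pps(frame, 5, random.Random(0)))
```

Because the selection points are evenly spaced along the sorted frame, neighbouring districts cannot all be skipped, which is consistent with 73 of the 75 districts appearing in the sample.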
The two-stage procedure just described has several advantages. It simplified the analysis by providing a self-weighted sample. It also reduced the travel time and cost, as 12 or 16 households are interviewed in each ward. In addition, as the number of households to be interviewed in each ward was known in advance, the procedure made it possible to plan an even workload across different survey teams.
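The self-weighting property can be checked with a short calculation: under PPS at the first stage and a fixed take at the second, the ward size cancels out of the overall household selection probability. The sketch below uses exact fractions; the ward sizes and the number of wards are made-up values for illustration, and only the take of 12 comes from the text.

```python
from fractions import Fraction


def household_inclusion_probability(n_wards, ward_size, stratum_size, take):
    """P(ward selected) * P(household selected | ward selected)."""
    p_ward = Fraction(n_wards * ward_size, stratum_size)  # first stage: PPS
    p_household = Fraction(take, ward_size)               # second stage: equal probability
    return p_ward * p_household


# Hypothetical stratum with four wards of different sizes (households).
ward_sizes = [120, 180, 300, 400]
stratum_size = sum(ward_sizes)
probs = {
    size: household_inclusion_probability(2, size, stratum_size, 12)
    for size in ward_sizes
}

if __name__ == "__main__":
    for size, p in probs.items():
        print(f"ward with {size} households: p = {p}")
```

Every household ends up with the same probability (number of wards times take, divided by the stratum size), regardless of the size of its ward. In this sketch the result holds only where the take is constant; wards with a different take would have a different probability.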
Face-to-face [f2f]
A preliminary draft of the questionnaire was first prepared through several discussions between the core staff and the consultant to the project. Several documents, received both from the World Bank and from countries that had already conducted such surveys in the past, were referred to during this process. Subsequently, the questionnaire was translated into Nepali.
After a suitable draft design of the questionnaire was ready, a pre-test was conducted in five different places in the country. The places selected for the pre-test were Biratnagar, Rasuwa, Palpa, Nepalganj and Kathmandu Valley. Each team created for the pre-test also included either a consultant or an expert from the Bank. Feedback received from the field was used to make the necessary improvements in finalizing the seventy-page questionnaire.
The content of each questionnaire is as follows:
HOUSEHOLD QUESTIONNAIRE
Section 1. HOUSEHOLD INFORMATION This section served two main purposes: (i) identify every person who is a member of the household, and (ii) provide basic demographic data such as age, sex, and marital status of everyone presently living in the household. In addition, information collected also included data on all economic activities undertaken by household members and on unemployment.
Section 2. HOUSING This section collected information on the type of dwelling occupied by the household, as well as on the household's expenditures on housing and amenities (rent, expenditure on water, garbage collection, electricity, etc.).
Section 3. ACCESS TO FACILITIES This section collected information on the distance from the household's residence to various public facilities and services.
Section 4. MIGRATION This section collected information from the household head on permanent migration for reasons of work or land availability.
Section 5. FOOD EXPENSES AND HOME PRODUCTION This section collected information on all food expenditures of the household, as well as on consumption of food items that the household produced.
Section 6. NON-FOOD EXPENDITURES AND INVENTORY OF DURABLE GOODS This section collected information on expenditure on non-food items (clothing, fuels, items for the house, etc.), as well as on the durable goods owned by the household.
Section 7. EDUCATION This section collected information on literacy for all household members aged 5 years and above, on the level of education for those members who have attended school in the past, and on level of education and expenditures on schooling for those currently attending an educational institution.
Section 8. HEALTH This section collected information on illnesses, use of medical facilities, expenditure on health care, children's immunization, and diarrhea.
Section 9. ANTHROPOMETRICS This section collected weight and height measurements for all children 3 years or under.
Section 10. MARRIAGE AND MATERNITY HISTORY This section collected information on maternity history, pre/post-natal care, and knowledge/use of family planning methods.
Section 11. WAGE EMPLOYMENT This section collected information on wage employment in agriculture and in non-agricultural activities, as well as on income earned through wage labor.
Section 12. FARMING AND LIVESTOCK This section collected information on all agricultural activities -- land owned or operated, crops grown, use of crops, income from the sale of crops, ownership of livestock, and income from the sale of livestock.
Section 13. NON-FARM ENTERPRISES/ACTIVITIES This section collected information on all non-agricultural enterprises and activities -- type of activity, revenue earned, expenditures, etc.
Section 14. CREDIT AND SAVINGS This section collected information on loans made by the household to others, or loans taken from others by household members, as well as on land, property, or other fixed assets owned by the household.
Section 15. REMITTANCES AND TRANSFERS This section collected information on remittances sent by members of the household to others and on transfers received by members of the household from others.
Section 16. OTHER ASSETS AND INCOME This section collected information on income from all other sources not covered elsewhere in the questionnaire.
Section 17. ADEQUACY OF CONSUMPTION This section collected information on whether the household perceives its level of consumption to be adequate or not.
RURAL COMMUNITY QUESTIONNAIRE
Section 1. POPULATION CHARACTERISTICS AND INFRASTRUCTURES This section collected information on the characteristics of the community and on the availability of electricity and its services, water supply, and sewerage.
Section 2. ACCESS TO FACILITIES Data on services and amenities, education status and health facilities was collected.
Section 3. AGRICULTURE AND FORESTRY This section collected information on the land situation, irrigation systems, crop cycles, wages paid to hired labor, rental rates for cattle and machinery, and forestry use.
Section 4. MIGRATION This section collected information on the main migratory movements in and out.
Section 5. DEVELOPMENT PROGRAMS, USER GROUPS, etc. In this section, information on development programs, the existence of user groups, and the quality of life in the community was collected.
Section 6. RURAL PRIMARY SCHOOL This section collected information on enrollment, infrastructure, and supplies.
Section 7. RURAL HEALTH FACILITY This section collected information on health facilities, equipment and services available, and health personnel in the community.
Section 8. MARKETS AND PRICES This section collected information on local shops, Haat Bazaar, agricultural inputs, sale of crops and the conversion of local units into standard units.
URBAN COMMUNITY QUESTIONNAIRE
Section 1. POPULATION CHARACTERISTICS AND INFRASTRUCTURE Information was collected on the characteristics of the community, availability of electricity, water supply and sewerage system in the ward.
Section 2. ACCESS TO FACILITIES This section collected information on the distance from the community to the various places and public facilities and services.
Section 3. MARKETS AND PRICES This section collected information on the availability and prices of different goods.
Section 4. QUALITY OF LIFE Here the notion of the quality of life in the community was
The provenance of Cretaceous-Miocene sedimentary rocks in the Nepalese Lesser Himalayas is significant for understanding the timing and pattern of the India-Asia collision and the tectonic evolution of the Himalayan orogen. This research analyses the provenance of Cretaceous-Miocene sedimentary rocks in two representative sections (the Butwal section and the Kalya section) in the Nepalese Lesser Himalaya in order to constrain the timing and pattern of the India-Asia collision and to reconstruct the tectonic evolution of the Himalayan orogen. An ATL 193 nm ArF excimer laser ablation (LA) system coupled with an Agilent 7700X inductively coupled plasma mass spectrometry (ICP-MS) instrument was used for detrital zircon U-Pb dating of ten samples; a Geolas HD LA system coupled with a Neptune Plus multi-collector ICP-MS instrument was used for in-situ zircon Hf isotope analysis. The detrital zircon U-Pb-Hf results indicate that the India-Asia collision occurred no later than the early-middle Eocene in the central Himalayas, and that the Greater Himalayas were exposed at the surface by the early Miocene.
This dataset provides information about the number of properties, residents, and average property values for Strata cross streets in Irvine, CA.