The documentation covers Enterprise Survey panel datasets that were collected in Slovenia in 2009, 2013 and 2019.
The Slovenia ES 2009 was conducted between 2008 and 2009. The Slovenia ES 2013 was conducted between March 2013 and September 2013. Finally, the Slovenia ES 2019 was conducted between December 2018 and November 2019. The objective of the Enterprise Survey is to gain an understanding of what firms experience in the private sector.
As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.
National
The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must take its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
As it is standard for the ES, the Slovenia ES was based on the following size stratification: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).
Sample survey data [ssd]
The sample for Slovenia ES 2009, 2013, 2019 were selected using stratified random sampling, following the methodology explained in the Sampling Manual for Slovenia 2009 ES and for Slovenia 2013 ES, and in the Sampling Note for 2019 Slovenia ES.
Three levels of stratification were used in this country: industry, establishment size, and oblast (region). The original sample designs with specific information of the industries and regions chosen are included in the attached Excel file (Sampling Report.xls.) for Slovenia 2009 ES. For Slovenia 2013 and 2019 ES, specific information of the industries and regions chosen is described in the "The Slovenia 2013 Enterprise Surveys Data Set" and "The Slovenia 2019 Enterprise Surveys Data Set" reports respectively, Appendix E.
For the Slovenia 2009 ES, industry stratification was designed in the way that follows: the universe was stratified into manufacturing industries, services industries, and one residual (core) sector as defined in the sampling manual. Each industry had a target of 90 interviews. For the manufacturing industries sample sizes were inflated by about 17% to account for potential non-response cases when requesting sensitive financial data and also because of likely attrition in future surveys that would affect the construction of a panel. For the other industries (residuals) sample sizes were inflated by about 12% to account for under sampling in firms in service industries.
For Slovenia 2013 ES, industry stratification was designed in the way that follows: the universe was stratified into one manufacturing industry, and two service industries (retail, and other services).
Finally, for Slovenia 2019 ES, three levels of stratification were used in this country: industry, establishment size, and region. The original sample design with specific information of the industries and regions chosen is described in "The Slovenia 2019 Enterprise Surveys Data Set" report, Appendix C. Industry stratification was done as follows: Manufacturing – combining all the relevant activities (ISIC Rev. 4.0 codes 10-33), Retail (ISIC 47), and Other Services (ISIC 41-43, 45, 46, 49-53, 55, 56, 58, 61, 62, 79, 95).
For Slovenia 2009 and 2013 ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. This seems to be an appropriate definition of the labor force since seasonal/casual/part-time employment is not a common practice, except in the sectors of construction and agriculture.
For Slovenia 2009 ES, regional stratification was defined in 2 regions. These regions are Vzhodna Slovenija and Zahodna Slovenija. The Slovenia sample contains panel data. The wave 1 panel “Investment Climate Private Enterprise Survey implemented in Slovenia” consisted of 223 establishments interviewed in 2005. A total of 57 establishments have been re-interviewed in the 2008 Business Environment and Enterprise Performance Survey.
For Slovenia 2013 ES, regional stratification was defined in 2 regions (city and the surrounding business area) throughout Slovenia.
Finally, for Slovenia 2019 ES, regional stratification was done across two regions: Eastern Slovenia (NUTS code SI03) and Western Slovenia (SI04).
Computer Assisted Personal Interview [capi]
Questionnaires have common questions (core module) and respectfully additional manufacturing- and services-specific questions. The eligible manufacturing industries have been surveyed using the Manufacturing questionnaire (includes the core module, plus manufacturing specific questions). Retail firms have been interviewed using the Services questionnaire (includes the core module plus retail specific questions) and the residual eligible services have been covered using the Services questionnaire (includes the core module). Each variation of the questionnaire is identified by the index variable, a0.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond as (-8). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary. However, there were clear cases of low response.
For 2009 and 2013 Slovenia ES, the survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Up to 4 attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.
For 2009, the number of contacted establishments per realized interview was 6.18. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The relatively low ratio of contacted establishments per realized interview (6.18) suggests that the main source of error in estimates in the Slovenia may be selection bias and not frame inaccuracy.
For 2013, the number of realized interviews per contacted establishment was 25%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The number of rejections per contact was 44%.
Finally, for 2019, the number of interviews per contacted establishments was 9.7%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 75.2%.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary : Fuel demand is shown to be influenced by fuel prices, people's income and motorization rates. We explore the effects of electric vehicle's rates in gasoline demand using this panel dataset.
Files : dataset.csv - Panel dimensions are the Brazilian state ( i ) and year ( t ). The other columns are: gasoline sales per capita (ln_Sg_pc), prices of gasoline (ln_Pg) and ethanol (ln_Pe) and their lags, motorization rates of combustion vehicles (ln_Mi_c) and electric vehicles (ln_Mi_e) and GDP per capita (ln_gdp_pc). All variables are all under the natural log function, since we use this to calculate demand elasticities in a regression model.
adjacency.csv - The adjacency matrix used in interaction with electric vehicles' motorization rates to calculate spatial effects. At first, it follows a binary adjacency formula: for each pair of states i and j, the cell (i, j) is 0 if the states are not adjacent and 1 if they are. Then, each row is normalized to have sum equal to one.
regression.do - Series of Stata commands used to estimate the regression models of our study. dataset.csv must be imported to work, see comment section.
dataset_predictions.xlsx - Based on the estimations from Stata, we use this excel file to make average predictions by year and by state. Also, by including years beyond the last panel sample, we also forecast the model into the future and evaluate the effects of different policies that influence gasoline prices (taxation) and EV motorization rates (electrification). This file is primarily used to create images, but can be used to further understand how the forecasting scenarios are set up.
Sources: Fuel prices and sales: ANP (https://www.gov.br/anp/en/access-information/what-is-anp/what-is-anp) State population, GDP and vehicle fleet: IBGE (https://www.ibge.gov.br/en/home-eng.html?lang=en-GB) State EV fleet: Anfavea (https://anfavea.com.br/en/site/anuarios/)
The documented dataset covers Enterprise Survey (ES) panel data collected in Peru in 2006, 2010 and 2017, as part of the Enterprise Survey initiative of the World Bank. An Indicator Survey is similar to an Enterprise Survey; it is implemented for smaller economies where the sampling strategies inherent in an Enterprise Survey are often not applicable due to the limited universe of firms.
The objective of the 2006-2017 Enterprise Survey is to obtain feedback from enterprises in client countries on the state of the private sector as well as to build a panel of enterprise data that will make it possible to track changes in the business environment over time and allow, for example, impact assessments of reforms. Through interviews with firms in the manufacturing and services sectors, the Indicator Survey data provides information on the constraints to private sector growth and is used to create statistically significant business environment indicators that are comparable across countries.
As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.
National
The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population, or the universe, covered in the Enterprise Surveys is the non-agricultural economy. It comprises: all manufacturing sectors according to the ISIC Revision 3.1 group classification (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this population definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities-sectors.
Sample survey data [ssd]
The sample for the 2006-2017 Peru Enterprise Survey (ES) was selected using stratified random sampling, following the methodology explained in the Sampling Manual. Stratified random sampling was preferred over simple random sampling for several reasons: - To obtain unbiased estimates for different subdivisions of the population with some known level of precision. - To obtain unbiased estimates for the whole population. The whole population, or universe of the study, is the non-agricultural economy. It comprises: all manufacturing sectors (group D), construction (group F), services (groups G and H), and transport, storage, and communications (group I). Groups are defined following ISIC revision 3.1. Note that this definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, excluding sub-sector 72, IT, which was added to the population under study), and all public or utilities-sectors. - To make sure that the final total sample includes establishments from all different sectors and that it is not concentrated in one or two of industries/sizes/regions. - To exploit the benefits of stratified sampling where population estimates, in most cases, will be more precise than using a simple random sampling method (i.e., lower standard errors, other things being equal.)
Three levels of stratification were used in every country: industry, establishment size, and region.
Industry stratification was designed in the following way: In small economies the population was stratified into 3 manufacturing industries, one services industry - retail-, and one residual sector as defined in the sampling manual. Each industry had a target of 120 interviews. In middle size economies the population was stratified into 4 manufacturing industries, 2 services industries -retail and IT-, and one residual sector. For the manufacturing industries sample sizes were inflated by 25% to account for potential non-response in the financing data.
For the Peru ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposed, the number of employees was defined on the basis of reported permanent full-time workers. This resulted in some difficulties in certain countries where seasonal/casual/part-time labor is common.
Face-to-face [f2f]
The current survey instruments are available: - Core Questionnaire + Manufacturing Module [ISIC Rev.3.1: 15-37] - Core Questionnaire + Retail Module [ISIC Rev.3.1: 52] - Core Questionnaire [ISIC Rev.3.1: 45, 50, 51, 55, 60-64, 72] - Screener Questionnaire.
The "Core Questionnaire" is the heart of the Enterprise Survey and contains the survey questions asked of all firms across the world. There are also two other survey instruments - the "Core Questionnaire + Manufacturing Module" and the "Core Questionnaire + Retail Module." The survey is fielded via three instruments in order to not ask questions that are irrelevant to specific types of firms, e.g. a question that relates to production and nonproduction workers should not be asked of a retail firm. In addition to questions that are asked across countries, all surveys are customized and contain country-specific questions. An example of customization would be including tourism-related questions that are asked in certain countries when tourism is an existing or potential sector of economic growth.
The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs/labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures.
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies:
a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond (-8) as a different option from don’t know (-9).
b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary. However, there were clear cases of low response. The following graph shows non-response rates for the sales variable, d2, by sector. Please, note that for this specific question, refusals were not separately identified from “Don’t know” responses.
Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals; whenever this was done, strict rules were followed to ensure replacements were randomly selected within the same stratum. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.
The documented dataset covers Enterprise Survey (ES) panel data collected in Lesotho in 2009 and 2016, as part of Africa Enterprise Surveys rollout, an initiative of the World Bank. The objective of the Enterprise Survey is to obtain feedback from enterprises on the state of the private sector as well as to help in building a panel of enterprise data that will make it possible to track changes in the business environment over time, thus allowing, for example, impact assessments of reforms.
Enterprise Surveys target a sample consisting of longitudinal (panel) observations and new cross-sectional data. Panel firms are prioritized in the sample selection, comprising up to 50% of the sample in the current wave. For all panel firms, regardless of the sample, current eligibility or operating status is determined and included in panel datasets.
Lesotho ES 2009 was conducted from September 2008 to February 2009, Lesotho ES 2016 was carried out in June - August 2016. Stratified random sampling was used to select the surveyed businesses. Data was collected using face-to-face interviews.
Data from 301 establishments was analyzed: 90 businesses were from 2009 only, 89 - from 2016 only, and 122 firms were from 2009 and 2016.
The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs and labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures. Over 90 percent of the questions objectively measure characteristics of a country’s business environment. The remaining questions assess the survey respondents’ opinions on what are the obstacles to firm growth and performance.
National
The primary sampling unit of the study is an establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population, or the universe, covered in the Enterprise Surveys is the non-agricultural private economy. It comprises: all manufacturing sectors according to the ISIC Revision 3.1 group classification (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this population definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities sectors. Companies with 100% government ownership are not eligible to participate in the Enterprise Surveys.
Sample survey data [ssd]
Two levels of stratification were used in this country: industry and establishment size.
Industry stratification was designed as follows: the universe was stratified as into manufacturing and services industries - Manufacturing (ISIC Rev. 3.1 codes 15 - 37), and Services (ISIC codes 45, 50-52, 55, 60-64, and 72).
For the Lesotho ES, size stratification was defined as follows: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees). Regional stratification did not take place for the Lesotho ES.
In 2009, it was not possible to obtain a single usable frame for Lesotho. Instead frames were obtained from two government branches: the Chamber of Commerce and the Ministry of Trade, Industry, Cooperatives and Marketing. Those frames were merged and duplicates removed to provide the frame used for the survey.
In 2016 ES, the sample frame consisted of listings of firms from two sources: for panel firms the list of 151 firms from the Lesotho 2009 ES was used and for fresh firms (i.e., firms not covered in 2009) firm data from Lesotho Bureau of Statistics Business Register, published in August 2015, was used.
Face-to-face [f2f]
The following survey instruments were used for Lesotho ES: - Manufacturing Module Questionnaire - Services Module Questionnaire
The survey is fielded via manufacturing or services questionnaires in order not to ask questions that are irrelevant to specific types of firms, e.g. a question that relates to production and nonproduction workers should not be asked of a retail firm. In addition to questions that are asked across countries, all surveys are customized and contain country-specific questions. An example of customization would be including tourism-related questions that are asked in certain countries when tourism is an existing or potential sector of economic growth. There is a skip pattern in the Service Module Questionnaire for questions that apply only to retail firms.
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect "Refusal to respond" (-8) as a different option from "Don't know" (-9). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary.
Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw data used in analysis of determinants of dividend policy - a case of banking sector in Serbia.
The documented dataset covers Enterprise Survey (ES) panel data collected in Sierra Leone in 2009 and 2017, as part of the Enterprise Survey initiative of the World Bank. An Indicator Survey is similar to an Enterprise Survey; it is implemented for smaller economies where the sampling strategies inherent in an Enterprise Survey are often not applicable due to the limited universe of firms.
The objective of the 2009-2017 survey is to obtain feedback from enterprises in client countries on the state of the private sector as well as to build a panel of enterprise data that will make it possible to track changes in the business environment over time and allow, for example, impact assessments of reforms. Through interviews with firms in the manufacturing and services sectors, the Indicator Survey data provides information on the constraints to private sector growth and is used to create statistically significant business environment indicators that are comparable across countries. As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.
Questionnaire topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs/labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, land and permits, taxation, business-government relations, performance measures, AIDS and sickness. The mode of data collection is face-to-face interviews.
National
The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population, or the universe, covered in the Enterprise Surveys is the non-agricultural economy. It comprises: all manufacturing sectors according to the ISIC Revision 3.1 group classification (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this population definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities sectors.
Sample survey data [ssd]
The sample for registered establishments in Sierra Leone was selected using stratified random sampling, following the methodology explained in the Sampling Note.
Stratified random sampling was preferred over simple random sampling for several reasons: a. To obtain unbiased estimates for different subdivisions of the population with some known level of precision. b. To obtain unbiased estimates for the whole population. The whole population, or universe of the study, is the non-agricultural economy. It comprises: all manufacturing sectors according to the group classification of ISIC Revision 3.1: (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities-sectors. c. To make sure that the final total sample includes establishments from all different sectors and that it is not concentrated in one or two of industries/sizes/regions. d. To exploit the benefits of stratified sampling where population estimates, in most cases, will be more precise than using a simple random sampling method (i.e., lower standard errors, other things being equal.) e. Stratification may produce a smaller bound on the error of estimation than would be produced by a simple random sample of the same size. This result is particularly true if measurements within strata are homogeneous. f. The cost per observation in the survey may be reduced by stratification of the population elements into convenient groupings.
Three levels of stratification were used in the Sierra Leone sample: firm sector, firm size, and geographic region.
Industry stratification was designed as follows: the universe was stratified into one manufacturing industry and one services industry (retail).
Size stratification was defined following the standardized definition used for the Indicator Surveys: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers.
Regional stratification was defined in terms of the geographic regions with the largest commercial presence in the country: Kenema and W/A Urban. In 2017, regional stratification was done across four regions: Bo, Western Urban, Kenema, and Bombali.
Given the stratified design, sample frames containing a complete and updated list of establishments as well as information on all stratification variables (number of employees, industry, and region) are required to draw the sample. Great efforts were made to obtain the best source for these listings.
The sample frame consisted of listings of firms from two sources: For panel firms the list of 150 firms from the Sierra Leone 2009 ES was used and for fresh firms (i.e., firms not covered in 2009) firm data from 2016 Business Establishment Census and Dun & Bradstreet Global database (June 2017), was used.
Necessary measures were taken to ensure the quality of the frame; however, the sample frame was not immune to the typical problems found in establishment surveys: positive rates of non-eligibility, repetition, non-existent units, etc.
Given the impact that non-eligible units included in the sample universe may have on the results, adjustments may be needed when computing the appropriate weights for individual observations. The percentage of confirmed non-eligible units as a proportion of the total number of sampled establishments contacted for the survey was 8.9% (18 out of 202 establishments).
Face-to-face [f2f]
The current survey instruments are available: - Services and Manufacturing Questionnaire - Screener Questionnaire.
The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs/labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures. Over 90% of the questions objectively ascertain characteristics of a country's business environment. The remaining questions assess the survey respondents' opinions on what are the obstacles to firm growth and performance.
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
There was a high response rate especially as a result of positive attitude towards the international community in collaboration with the government in their reconstruction efforts after a period of civil strife. It is period in which a lot of statistics is being collected by the Sierra Leone Statistics for reconstruction thus most respondents were enlightened on research benefits.
The documented dataset covers Enterprise Survey (ES) panel data collected in Argentina in 2006, 2010 and 2017, as part of the Enterprise Survey initiative of the World Bank. An Indicator Survey is similar to an Enterprise Survey; it is implemented for smaller economies where the sampling strategies inherent in an Enterprise Survey are often not applicable due to the limited universe of firms.
The objective of the 2006-2017 Enterprise Survey is to obtain feedback from enterprises in client countries on the state of the private sector as well as to build a panel of enterprise data that will make it possible to track changes in the business environment over time and allow, for example, impact assessments of reforms. Through interviews with firms in the manufacturing and services sectors, the Indicator Survey data provides information on the constraints to private sector growth and is used to create statistically significant business environment indicators that are comparable across countries.
As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.
National
The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population, or the universe, covered in the Enterprise Surveys is the non-agricultural economy. It comprises: all manufacturing sectors according to the ISIC Revision 3.1 group classification (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this population definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities-sectors.
Sample survey data [ssd]
The sample for the 2006-2017 Argentina Enterprise Survey (ES) was selected using stratified random sampling, following the methodology explained in the Sampling Manual. Stratified random sampling was preferred over simple random sampling for several reasons: - To obtain unbiased estimates for different subdivisions of the population with some known level of precision. - To obtain unbiased estimates for the whole population. The whole population, or universe of the study, is the non-agricultural economy. It comprises: all manufacturing sectors (group D), construction (group F), services (groups G and H), and transport, storage, and communications (group I). Groups are defined following ISIC revision 3.1. Note that this definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, excluding sub-sector 72, IT, which was added to the population under study), and all public or utilities-sectors. - To make sure that the final total sample includes establishments from all different sectors and that it is not concentrated in one or two of industries/sizes/regions. - To exploit the benefits of stratified sampling where population estimates, in most cases, will be more precise than using a simple random sampling method (i.e., lower standard errors, other things being equal.)
Three levels of stratification were used in every country: industry, establishment size, and region.
Industry stratification was designed in the following way: In small economies the population was stratified into 3 manufacturing industries, one services industry - retail-, and one residual sector as defined in the sampling manual. Each industry had a target of 120 interviews. In middle size economies the population was stratified into 4 manufacturing industries, 2 services industries -retail and IT-, and one residual sector. For the manufacturing industries sample sizes were inflated by 25% to account for potential non-response in the financing data.
For the Argentina ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposed, the number of employees was defined on the basis of reported permanent full-time workers. This resulted in some difficulties in certain countries where seasonal/casual/part-time labor is common.
Face-to-face [f2f]
The current survey instruments are available: - Core Questionnaire + Manufacturing Module [ISIC Rev.3.1: 15-37] - Core Questionnaire + Retail Module [ISIC Rev.3.1: 52] - Core Questionnaire [ISIC Rev.3.1: 45, 50, 51, 55, 60-64, 72] - Screener Questionnaire.
The "Core Questionnaire" is the heart of the Enterprise Survey and contains the survey questions asked of all firms across the world. There are also two other survey instruments - the "Core Questionnaire + Manufacturing Module" and the "Core Questionnaire + Retail Module." The survey is fielded via three instruments in order to not ask questions that are irrelevant to specific types of firms, e.g. a question that relates to production and nonproduction workers should not be asked of a retail firm. In addition to questions that are asked across countries, all surveys are customized and contain country-specific questions. An example of customization would be including tourism-related questions that are asked in certain countries when tourism is an existing or potential sector of economic growth.
The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs/labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures.
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies:
a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond (-8) as a different option from don't know (-9).
b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary. However, there were clear cases of low response. The following graph shows non-response rates for the sales variable, d2, by sector. Please, note that for this specific question, refusals were not separately identified from "Don't know" responses.
Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals; whenever this was done, strict rules were followed to ensure replacements were randomly selected within the same stratum. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.
analyze the health and retirement study (hrs) with r the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death d o us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research, if you apply for an interviewer job with them, i hope you like werther's original. figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking arou nd on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle. but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you. the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked. this new github repository contains five scripts: 1992 - 2010 download HRS microdata.R loop through every year and every file, download, then unzip everything in one big party impor t longitudinal RAND contributed files.R create a SQLite database (.db) on the local disk load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram) longitudinal RAND - analysis examples.R connect to the sql database created by the 'import longitudinal RAND contributed files' program create tw o database-backed complex sample survey object, using a taylor-series linearization design perform a mountain of analysis examples with wave weights from two different points in the panel import example HRS file.R load a fixed-width file using only the sas importation script directly into ram with < a href="http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html">SAScii parse through the IF block at the bottom of the sas importation script, blank out a number of variables save the file as an R data file (.rda) for fast loading later replicate 2002 regression.R connect to the sql database created by the 'import longitudinal RAND contributed files' program create a database-backed complex sample survey object, using a taylor-series linearization design exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document . click here to view these five scripts for more detail about the health and retirement study (hrs), visit: michigan's hrs homepage rand's hrs homepage the hrs wikipedia page a running list of publications using hrs notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you c an think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself. confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
The Ghana Socioeconomic Panel Survey is a joint effort between the Economic Growth Centre at Yale University and the Institute of Statistical, Social and Economic Research (ISSER), at the University of Ghana (Legon, Ghana). The survey is meant to remedy a major constraint on the understanding of development in low-income countries - the absence of detailed, multi-level and long-term scientific data that follows individuals over time and describes both the natural and man-made environment in which the individuals reside. Most data collection efforts are short-term - carried out at one point in time; and limited in scope – collecting information on only a few aspects of the lives of the persons in the study; and when there are multiple rounds of data collection, individuals who leave the study area are dropped. This means that the most mobile people are not included in existing surveys and studies, perhaps substantially biasing inferences about who benefits from and who bears the cost of the development process. The goal of this project is to follow all individuals, or a random subset, over time using a comprehensive set of survey instruments to shed new light on long-run processes of economic development.
The 2009 survey is the first in a series that is intended to include 5 surveys over the next 15-21 years. Surveys will be implemented approximately every 3 years. In subsequent waves of the panel, a sample of moved households and individuals who have moved out of original households to form new households or joined other households originally not in the panel sample, will be interviewed in addition to the original sample. The number of households in the Panel Study thus has the potential of increasing due to the nature of the design; tracking wholly moved and split households.
The principal objective of the panel survey is to provide a comprehensive data base for carrying out a wide range of studies of the medium- and long-term changes, or lack of changes, that take place during the process of development. The information gathered from the survey is expected to inform decision makers in the formulation of economic and social policies to: - Identify target groups for government assistance; - Construct models to stimulate the impact on individual groups of the various policy options and to analyze the impact of decisions that have already been implemented; - Access the economic situation on living conditions of households; and - Provide benchmark data for district assemblies.
The survey provides regionally representative data for the 10 regions of Ghana. In all, 5010 households from 334 Enumeration Areas (EAs) were sampled. Fifteen households were selected from each of the EAs. The number of EAs for each region was proportionately allocated based on estimated 2009 population share for each region. EAs for Upper East and Upper West regions, which have relatively smaller population sizes, were over sampled to allow for a reasonable number of households to be interviewed in these regions.
Households, individuals, agricultural plots, household enterprises
Nationally representative, regionally representative for all 10 regions.
Sample survey data [ssd]
The survey provides regionally representative data for the 10 regions of Ghana. In all, 5010 households from 334 Enumeration Areas (EAs) were sampled. Fifteen households were selected from each of the EAs. The distribution of the enumeration areas across the regions in Ghana is presented in Table 1. The number of EAs for each region was proportionately allocated based on estimated 2009 population share for each region. EAs for Upper East and Upper West regions, which have relatively smaller population sizes, were over sampled to allow for a reasonable number of households to be interviewed in these regions.
A two-stage stratified sample design was used for the survey. Stratification was based on the regions of Ghana. The first stage involved selecting geographical precincts or clusters from an updated master sampling frame constructed from the 2000 Ghana Population and Housing Census. A total of 334 clusters (census enumeration areas) were selected from the master sampling frame. The clusters were randomly selected from the list of EAs in each region. The selection was based on a simple random sampling technique. A complete household listing was conducted in 2009 in all the selected clusters to provide a sampling frame for the second stage selection of households.
The second stage of selection involved a simple random sampling of 15 of the listed households from each selected cluster. The primary objective of the second stage of selection was to ensure adequate numbers of completed individual interviews to provide estimates for key indicators with acceptable precision at the regional level. Other sampling objectives were to facilitate manageable interviewer workload within each sample area and to reduce the effects of intra-class correlation within a sample area on the variance of the survey estimates.
Face-to-face [f2f]
The information gathered from the survey will assist decision makers in the formulation of economic and social policies to: - Identify target groups for government assistance - Construct models to stimulate the impact on individual groups of the various policy options and to analyze the impact of decisions that have already been implemented - Access the economic situation on living conditions of households - Provide benchmark data for district assemblies
To achieve these objectives, detailed data has been collected in the following subject areas: 1. Demographic characteristics: employment, education, migration
Information about non-resident spouses and relatives
Assets:
Household assets: (i) Livestock (ii) Tools (iii) Durable Goods Financial assets: (i) Borrowing (ii) Lending (iii) In-transfers (iv) Out-transfers (v) Savings
Agricultural Production
Land information: (i) Plot background (ii) Size (iii) Fallowing information, soil type, irrigation (iv) Investment, ownership, rental status (v) Crops (vi) Chemical inputs (vii) Tractor use (viii) Seeds (ix) Labour inputs
Sales and storage: (ii) Revenues from crop production (ii) Crop stores
Non-farm Household Enterprise
Basic Information and Assets (i) Basic information (ii) Enterprise assets
Information about employees (i) Information about all employees (ii) Information about four important employees (iii) Enterprises operating in the past 1 month (iv) Enterprise in a typical month
Accounting: General enterprise
Accounting: Trade/wholesale enterprise
Accounting: Food enterprise
Accounting: Services
Household Health
Insurance
Anthropometry
Immunization
Activities of daily living
Miscellaneous health
Health in the past 2 weeks
Health in the past 12 month
Womens' Health
Fertility
Power
Mens' Health
Reproductive Health
Power
Children's Module
Young child health - children younger than 5 years old
Digit span test- children aged 5-15
Raven's Pattern Cognitive Assessment- children aged 5-15
Math questions- children aged 9-26
English questions- children aged 9-26
Psychology/Social Networking
Psychology (i) Depression (ii) Subjective social welfare (iii) Regretted consumption (iv) Townsend questions (v) Trust and solidarity (vi) Time use
Big 5 personality questions
Social networking
Information seeking (i) Interaction with organizations (ii) Extension services (iii) Volunteerism
Consumption Module
Food items consumed
Clothing and footwear
Expenditure on other items in last 12 months
Fuel and other lubricants
Housing Characteristics
Part A - Rent, water, light, cooking, waste disposal, building materials
Part B - Dwelling type, ownership, living conditions, power supply, surroundings
The community inventory documents a broad range of natural and institutional features of the community, including political organizations, financial institutions, the presence of various development programs, and community infrastructure. There was also a questionnaire for Districts and Municipal Assemblies. As of December 2015, Seperate documentation for the Community survey and the data will be made available later.
The processing of the survey data began shortly after the fieldwork commenced. The first stage of data processing involved office editing and post-coding. Questionnaires were edited to double-check for completeness and consistency in the questionnaires returned, while the post-coding served to define new response categories to pre-coded question or define a response set for open ended questions. Once the editing and post-coding were done, the questionnaires were passed on for data entry.
The data entry program was designed in CSPro version 4.0. The entry program was designed with the necessary skip patterns and consistency checks to ensure adequate data quality and validity. All questionnaires were entered twice (100 percent verification) and the two files were compared for entry errors which were subsequently verified and corrected with the questionnaires. The data entry was completed in August of 2010. The consolidated data files in CSPro format were then converted to STATA format for further consistency checks and cleaning.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We include Stata syntax (dummy_dataset_create.do) that creates a panel dataset for negative binomial time series regression analyses, as described in our paper "Examining methodology to identify patterns of consulting in primary care for different groups of patients before a diagnosis of cancer: an exemplar applied to oesophagogastric cancer". We also include a sample dataset for clarity (dummy_dataset.dta), and a sample of that data in a spreadsheet (Appendix 2).
The variables contained therein are defined as follows:
case: binary variable for case or control status (takes a value of 0 for controls and 1 for cases).
patid: a unique patient identifier.
time_period: A count variable denoting the time period. In this example, 0 denotes 10 months before diagnosis with cancer, and 9 denotes the month of diagnosis with cancer,
ncons: number of consultations per month.
period0 to period9: 10 unique inflection point variables (one for each month before diagnosis). These are used to test which aggregation period includes the inflection point.
burden: binary variable denoting membership of one of two multimorbidity burden groups.
We also include two Stata do-files for analysing the consultation rate, stratified by burden group, using the Maximum likelihood method (1_menbregpaper.do and 2_menbregpaper_bs.do).
Note: In this example, for demonstration purposes we create a dataset for 10 months leading up to diagnosis. In the paper, we analyse 24 months before diagnosis. Here, we study consultation rates over time, but the method could be used to study any countable event, such as number of prescriptions.
The main British Household Panel Survey (BHPS) is conducted by the ESRC UK Longitudinal Studies Centre (ULSC), together with the Institute for Social and Economic Research (ISER) at the University of Essex. In addition to conducting the BHPS and disseminating it to the research community, ISER undertakes a programme of research based on panel data, using the BHPS and other national panels to monitor and measure social change.
The main objective of the BHPS is to further understanding of social and economic change at the individual and household level in the UK, and to identify, model and forecast such changes and their causes and consequences in relation to a range of socio-economic variables. It is conducted as a longitudinal study, where each adult member (aged 16 years and over) of a sampled household is interviewed annually. If individuals leave their original household, all adult members of their new households are interviewed. Children are also interviewed. For full details of the BHPS methodology, sampling, changes over time, and a complete set of documentation, see the main BHPS study, held at the UK Data Archive under SN 5151.
Understanding Society:
From Wave 19, the BHPS has been subsumed into a new longitudinal study called Understanding Society, or the United Kingdom Household Longitudinal Study (UKHLS), conducted by ISER. The BHPS Wave 19 formed part of Understanding Society Wave 2 (January 2010 - March 2011). The BHPS fieldwork period therefore moved from September-April to January-March. This means that the gap between interviews 18 and 19 for the BHPS sample ranges between 16 and 30 months rather than the standard 12 months. From Wave 2, the BHPS sample has been a permanent part of the larger study and interviews are conducted annually again. BHPS sample members have an identifier within the Understanding Society datasets, allowing BHPS users to match BHPS Wave 1-18 data to Understanding Society. The main Understanding Society study, held under SN 6614 now includes harmonised BHPS data in addition to the main Understanding Society files. Further information is available on the Understanding Society web site.
This dataset contains Acorn geodemographic classification codes for each wave of the BHPS and a household identification serial number for file matching to the main BHPS data. This dataset is subject to restrictive access conditions, different to those for the main BHPS: see Access section below.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Fictitious data to look at the overall health of Starfleet volunteers on four different drugs. Since each volunteer is measured on each of the four drugs, we propose to use this datasets to look at repeater measures ANOVA to determine if the mean health scores differs between drugs.
The documented dataset covers Enterprise Survey (ES) panel data collected in Benin in 2004, 2009 and 2016, as part of Africa Enterprise Surveys rollout, an initiative of the World Bank. The objective of the Enterprise Survey is to obtain feedback from enterprises on the state of the private sector as well as to help in building a panel of enterprise data that will make it possible to track changes in the business environment over time, thus allowing, for example, impact assessments of reforms.
Enterprise Surveys target a sample consisting of longitudinal (panel) observations and new cross-sectional data. Panel firms are prioritized in the sample selection, comprising up to 50% of the sample in the current wave. For all panel firms, regardless of the sample, current eligibility or operating status is determined and included in panel datasets.
Benin ES 2009 was conducted from May 18 to Sept. 30, 2009, Benin ES 2016 was carried out in July - October 2016. Stratified random sampling was used to select the surveyed businesses. Data was collected using face-to-face interviews.
Data from 497 establishments was analyzed: 128 businesses were from 2004 only, 53 - from 2009 only, 88 - from 2016 only, 70 - from 2004 and 2009 only, 56 - from 2009 and 2016 only and 102 firms were from 2004, 2009 and 2016.
The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs and labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures. Over 90 percent of the questions objectively measure characteristics of a country’s business environment. The remaining questions assess the survey respondents’ opinions on what are the obstacles to firm growth and performance.
National
The primary sampling unit of the study is an establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population, or the universe, covered in the Enterprise Surveys is the non-agricultural private economy. It comprises: all manufacturing sectors according to the ISIC Revision 3.1 group classification (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this population definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities sectors. Companies with 100% government ownership are not eligible to participate in the Enterprise Surveys.
Sample survey data [ssd]
Three levels of stratification were used in this country: industry, establishment size, and region.
Industry stratification was designed as follows: the universe was stratified as into manufacturing and services industries- Manufacturing (ISIC Rev. 3.1 codes 15 - 37), and Services (ISIC codes 45, 50-52, 55, 60-64, and 72).
For the Benin ES, size stratification was defined as follows: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).
In 2016 ES, regional stratification was done across five regions: Atlantique, Borgou, Mono, Ouémé and Littoral. In 2009 ES, Cotonou and Other were the two areas selected.
In 2016 ES, the sample frame consisted of listings of firms from three sources: for panel firms, the list of 150 firms from the Benin 2009 ES was used, and for fresh firms (i.e., firms not covered in 2009) lists obtained from National Statistical Institute and Tax Directorate (2013) and the Chamber of Commerce (2016) were used.
In 2009 ES, two sample frames were used. The first one included the official list "Repertoire of Companies in Benin" (2009) from the Chambre de Commerce et d' Industrie du Benin. The second frame (the panel sample) consisted of enterprises interviewed for the Enterprise Survey in 2004, which were to be re-interviewed where they were in the selected geographical regions and met eligibility criteria.
Face-to-face [f2f]
The following survey instruments were used for Benin ES 2009 and 2016: - Manufacturing Module Questionnaire - Services Module Questionnaire
The survey is fielded via manufacturing or services questionnaires in order not to ask questions that are irrelevant to specific types of firms, e.g. a question that relates to production and nonproduction workers should not be asked of a retail firm. In addition to questions that are asked across countries, all surveys are customized and contain country-specific questions. An example of customization would be including tourism-related questions that are asked in certain countries when tourism is an existing or potential sector of economic growth. There is a skip pattern in the Service Module Questionnaire for questions that apply only to retail firms.
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect "Refusal to respond" (-8) as a different option from "Don't know" (-9). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary.
Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals.
Introduction. This document provides an overview of an archive composed of four sections.
[1] An introduction (this document) which describes the scope of the project
[2] Yearly folder, from 2002 until 2010, of the coarse Microsoft Access datasets + the surveys used to collect information for each year. The word coarse does not mean the information in the Microsoft Access dataset was not corrected for mistakes; it was, but some mistakes and inconsistencies remain, such as with data on age or education. Furthermore, the coarse dataset provides disaggregated information for selected topics, which appear in summary statistics in the clean dataset. For example, in the coarse dataset one can find the different illnesses afflicting a person during the past 14 days whereas in the clean dataset only the total number of illnesses appears.
[3] A letter from the Gran Consejo Tsimane’ authorizing the public use of de-identified data collected in our studies among Tsimane’.
[4] A Microsoft Excel document with the unique identification number for each person in the panel study.
Background. During 2002-2010, a team of international researchers, surveyors, and translators gathered longitudinal (panel) data on the demography, economy, social relations, health, nutritional status, local ecological knowledge, and emotions of about 1400 native Amazonians known as Tsimane’ who lived in thirteen villages near and far from towns in the department of Beni in the Bolivian Amazon. A report titled “Too little, too late” summarizes selected findings from the study and is available to the public at the electronic library of Brandeis University:
https://scholarworks.brandeis.edu/permalink/01BRAND_INST/1bo2f6t/alma9923926194001921
A copy of the clean, merged, and appended Stata (V17) dataset is available to the public at the following two web addresses:
[a] Brandeis University:
https://scholarworks.brandeis.edu/permalink/01BRAND_INST/1bo2f6t/alma9923926193901921
[b] Inter-university Consortium for Political and Social Research (ICPSR), University of Michigan (only available to users affiliated with institutions belonging to ICPSR)
http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/37671/utilization
Chapter 4 of the report “Too little, too late” mentioned above describes the motivation and history of the study, the difference between the coarse and clean datasets, and topics which can be examined only with coarse data.
Aims. The aims of this archive are to:
· Make available in Microsoft Access the coarse de-identified dataset [1] for each of the seven yearly surveys (2004-2010) and [2] one Access data based on quarterly surveys done during 2002 and 2003. Together, these two datasets form one longitudinal dataset of individuals, households, and villages.
· Provide guidance on how to link files within and across years, and
· Make available a Microsoft Excel file with a unique identification number to link individuals across years
The datasets in the archive.
· Eight Microsoft Access datasets with data on a wide range of variables. Except for the Access file for 2002-2003, all the other information in each of the other Access files refers to one year. Within any Access dataset, users will find two types of files:
o Thematic files. The name of a thematic file contains the prefix tbl (e.g., 29_tbl_Demography or tbl_29_Demography). The file name (sometimes in Spanish, sometimes in English) indicates the content of the file. For example, in the Access dataset for one year, the micro file tbl_30_Ventas has all the information on sales for that year. Within each micro file, columns contain information on a variable and the name of the column indicates the content of the variable. For instance, the column heading item in the Sales file would indicate the type of good sold. The exac…
The documented dataset covers Enterprise Survey (ES) panel data collected in Dominican Republic in 2010 and 2016, as part of Latin America and the Caribbean Enterprise Surveys rollout, an initiative of the World Bank. The objective of the Enterprise Survey is to obtain feedback from enterprises on the state of the private sector as well as to help in building a panel of enterprise data that will make it possible to track changes in the business environment over time, thus allowing, for example, impact assessments of reforms.
Enterprise Surveys target a sample consisting of longitudinal (panel) observations and new cross-sectional data. Panel firms are prioritized in the sample selection, comprising up to 50% of the sample. For all panel firms, regardless of the sample, current eligibility or operating status is determined and included in panel datasets.
Dominican Republic ES 2010 was conducted in March - September 2011, ES 2016 was carried out in August 2016 - April 2017. Stratified random sampling was used to select the surveyed businesses. Data was collected using face-to-face interviews.
Data from 719 establishments was analyzed: 257 businesses were from 2010 ES only, 256 - from 2016 only, and 206 firms were from 2010 and 2016.
The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs and labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures. Over 90 percent of the questions objectively measure characteristics of a country’s business environment. The remaining questions assess the survey respondents’ opinions on what are the obstacles to firm growth and performance.
National
The primary sampling unit of the study is an establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population, or the universe, covered in the Enterprise Surveys is the non-agricultural private economy. It comprises: all manufacturing sectors according to the ISIC Revision 3.1 group classification (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this population definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities sectors. Companies with 100% government ownership are not eligible to participate in the Enterprise Surveys.
Sample survey data [ssd]
Three levels of stratification were used in this country: industry, establishment size and region.
Industry stratification was designed as follows: the universe was stratified as into manufacturing and services industries - Manufacturing (ISIC Rev. 3.1 codes 15 - 37), and Services (ISIC codes 45, 50-52, 55, 60-64, and 72).
Size stratification was defined as follows: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).
In 2016, regional stratification was done across three regions: Santo Domingo, Santiago-Puerto Plata-Espaillat and the Rest of the country.
The sample frame consisted of listings of firms from three sources: for panel firms the list of 360 firms from the Dominican Republic 2010 ES was used and for fresh firms (i.e., firms not covered in 2010) a listing of firms obtained from El Directorio de Empresas y Establecimientos (DEE) 2015 and Oficina Nacional de Estadística (ONE), were used.
In 2010, regional stratification was defined in two locations: Santo Domingo and the rest of the country (constituted by urban centers around Santiago and Higuey). For the purposes of sampling, the rest of the country was treated as one area.
The sample frame for 2010 ES was provided by the Oficina Nacional de Estadistica (ONE), dated 2009.
Face-to-face [f2f]
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect "Refusal to respond" (-8) as a different option from "Don't know" (-9). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary.
Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals.
The documented dataset covers Enterprise Survey (ES) panel data collected in Nicaragua in 2003, 2006, 2010 and 2016, as part of Latin America and the Caribbean Enterprise Surveys rollout, an initiative of the World Bank. The objective of the survey is to obtain feedback from enterprises on the state of the private sector as well as to help in building a panel of enterprise data that will make it possible to track changes in the business environment over time, thus allowing, for example, impact assessments of reforms. Through interviews with firms in the manufacturing and services sectors, the survey assesses the constraints to private sector growth and creates statistically significant business environment indicators that are comparable across countries. Only registered businesses are surveyed in the Enterprise Survey.
Enterprise Surveys target a sample consisting of longitudinal (panel) observations and new cross-sectional data. Panel firms are prioritized in the sample selection, comprising up to 50% of the sample. For all panel firms, regardless of the sample, current eligibility or operating status is determined and included in panel datasets.
Nicaragua ES 2010 was conducted in August 2010- May 2011, Ecuador ES 2016 was carried out in October 2016 - June 2017. Stratified random sampling was used to select the surveyed businesses. Data was collected using face-to-face interviews.
Data from 1,599 establishments was analyzed: 211 businesses were from 2003 only, 153 firms were from 2006 only, 119 - from 2010 only, 213 - from 2016 only, 146 firms were from 2010 and 2016, 110 - from 2006 and 2010, 72 firms were from 2003, 2006, 2010 and 2016.
National
The primary sampling unit of the study is an establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population. The whole population, or universe of the study, is the non-agricultural economy. It comprises: all manufacturing sectors according to the group classification of ISIC Revision 3.1: (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities-sectors.
Sample survey data [ssd]
Three levels of stratification were used in this country: industry, establishment size, and region.
Industry stratification was designed in the way that follows: the universe was stratified into Manufacturing industries (ISIC Rev. 3.1 codes 15- 37), Retail industries (ISIC code 52) and Other Services (ISIC codes 45, 50, 51, 55, 60-64, and 72).
For the Nicaragua ES 2016, size stratification was defined as follows: small (4 to 20 employees), medium (21 to 50 employees), and large (51 or more employees). These categories differ from the global ES size definitions - small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).
The sample frame consisted of listings of firms from two sources: For panel firms the list of 336 firms from the Nicaragua 2010 ES was used, and for fresh firms (i.e., firms not covered in 2010) the sample frame was comprised of a list randomly drawn from the Economic Census, provided by the Banco Central de Nicaragua. Standardized size categories provided by the Census were used.
In 2010, regional stratification was defined in two locations (city and the surrounding business area): Managua and the Rest of the Country.
Face-to-face [f2f]
The structure of the data base reflects the fact that two different versions of the survey instrument were used for all registered establishments. Questionnaires have common questions (core module) and respectfully additional manufacturing- and services-specific questions.
The eligible manufacturing industries have been surveyed using the Manufacturing questionnaire (includes the core module, plus manufacturing specific questions).
Retail firms have been interviewed using the Services questionnaire (includes the core module plus retail specific questions) and the residual eligible services have been covered using the Services questionnaire (includes the core module).
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect "Refusal to respond" (-8) as a different option from "Don't know" (-9). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary.
Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals.
The documented dataset covers Enterprise Survey (ES) panel data collected in Bolivia in 2006, 2010 and 2017, as part of Latin America and the Caribbean Enterprise Surveys rollout, an initiative of the World Bank. The objective of the Enterprise Survey is to obtain feedback from enterprises on the state of the private sector as well as to help in building a panel of enterprise data that will make it possible to track changes in the business environment over time, thus allowing, for example, impact assessments of reforms.
Enterprise Surveys target a sample consisting of longitudinal (panel) observations and new cross-sectional data. Panel firms are prioritized in the sample selection, comprising up to 50% of the sample. For all panel firms, regardless of the sample, current eligibility or operating status is determined and included in panel datasets.
Bolivia ES 2010 was conducted in June 2010 and October 2010, Bolivia ES 2016 was carried out in January and June of 2017. Stratified random sampling was used to select the surveyed businesses. Data was collected using face-to-face interviews.
Data from 1,339 establishments was analyzed: 433 businesses were from 2006 only, 97 - from 2010 only, 197 - from 2017 only, 170 firms were from 2010 and 2017, 196 - from 2006 and 2010, 246 firms were from 2006, 2010 and 2017.
The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs and labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures. Over 90 percent of the questions objectively measure characteristics of a country’s business environment. The remaining questions assess the survey respondents’ opinions on what are the obstacles to firm growth and performance.
National
The primary sampling unit of the study is an establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population. The whole population, or universe of the study, is the non-agricultural economy. It comprises: all manufacturing sectors according to the group classification of ISIC Revision 3.1: (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities-sectors.
Sample survey data [ssd]
Three levels of stratification were used in this country: industry, establishment size, and region.
Industry stratification was designed as follows: the universe was stratified into Manufacturing industries (ISIC Rev. 3.1 codes 15- 37), Retail industries (ISIC code 52) and Other Services (ISIC codes 45, 50, 51, 55, 60-64, and 72).
Size stratification was defined as follows: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).
The sample frame for 2017 consisted of listings of firms from two sources: for panel firms the list of 362 firms from the Bolivia 2010 ES was used, and for fresh firms (i.e., firms not covered in 2010) Economic Census, updated by Encuestas y Estudios (2016) was used.
In 2010, regional stratification was defined in three locations (city and the surrounding business area): La Paz, Santa Cruz, and Cochabamba.
Face-to-face [f2f]
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect "Refusal to respond" (-8) as a different option from "Don't know" (-9). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary.
Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals.
The documented dataset covers Enterprise Survey (ES) panel data collected in Niger in 2005, 2009 and 2016, as part of Africa Enterprise Surveys rollout, an initiative of the World Bank. The objective of the survey is to obtain feedback from enterprises on the state of the private sector as well as to help in building a panel of enterprise data that will make it possible to track changes in the business environment over time, thus allowing, for example, impact assessments of reforms.
Through interviews with firms in the manufacturing and services sectors, the survey assesses the constraints to private sector growth and creates statistically significant business environment indicators that are comparable across countries. Only registered businesses are surveyed in the Enterprise Survey.
Data from 151 establishments was analyzed. Stratified random sampling was used to select the surveyed businesses. The data was collected using face-to-face interviews.
The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs and labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures. Over 90 percent of the questions objectively ascertain characteristics of a country’s business environment. The remaining questions assess the survey respondents’ opinions on what are the obstacles to firm growth and performance.
National
The primary sampling unit of the study is an establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population, or the universe, covered in the Enterprise Surveys is the non-agricultural private economy. It comprises: all manufacturing sectors according to the ISIC Revision 3.1 group classification (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this population definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities sectors. Companies with 100% government ownership are not eligible to participate in the Enterprise Surveys.
Sample survey data [ssd]
Three levels of stratification were used in this country: industry, establishment size, and region.
Industry stratification was designed as follows: the universe was stratified as into manufacturing and services industries- Manufacturing (ISIC Rev. 3.1 codes 15 - 37), and Services (ISIC codes 45, 50-52, 55, 60-64, and 72).
For the 2009 sample stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. Size stratification was defined following the standardized definition used for the Enterprise Surveys: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. Regional stratification was defined in terms of the geographic regions with the largest commercial presence in the country: Maradi and Niamey were the two areas selected in Niger.
Two frames were used for Niger. The first one included official lists from the Chamber of commerce, craft and industries of Niger 2008 and the Repertoire of Companies (2008) operating in Niger. The second frame (the panel sample) consisted of enterprises interviewed for the Enterprise Survey in 2005, which were to be re-interviewed where they were in the selected geographical regions and met eligibility criteria. Both database contained the following information: -Name of the firm -Contact details -ISIC code -Number of employees.
Given the impact that non-eligible units included in the sample universe may have on the results, adjustments may be needed when computing the appropriate weights for individual observations. The percentage of confirmed non-eligible units as a proportion of the total number of sampled establishments contacted for the survey was 39.9% (134 out of 344 establishments). Breaking down by industry, the following numbers of establishments were surveyed: Manufacturing - 52, Services - 98.
For 2017: Regional stratification for the Niger ES was done across two regions: Niamey and Rest of the Country.
The sample frame consisted of listings of firms from three sources: - the list of 150 firms from the Niger 2009 ES for panel firms - firm data from La Caisse Nationale de Sécurité Sociale (CNSS) and a list of exporting firms by the Institut National des Statistiques (INS) for fresh firms (firms not covered in 2009).
Given the impact that non-eligible units included in the sample universe may have on the results, adjustments may be needed when computing the appropriate weights for individual observations. The percentage of confirmed non-eligible units as a proportion of the total number of sampled establishments contacted for the survey was 18.6% (76 out of 409 establishments).
Face-to-face [f2f]
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect "Refusal to respond" (-8) as a different option from "Don't know" (-9). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary.
Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals.
The documented dataset covers Enterprise Survey (ES) panel data collected in Liberia in 2009 and 2017, as part of the Enterprise Survey initiative of the World Bank. An Indicator Survey is similar to an Enterprise Survey; it is implemented for smaller economies where the sampling strategies inherent in an Enterprise Survey are often not applicable due to the limited universe of firms.
The objective of the 2009-2017 Enterprise Survey is to obtain feedback from enterprises in client countries on the state of the private sector as well as to build a panel of enterprise data that will make it possible to track changes in the business environment over time and allow, for example, impact assessments of reforms. Through interviews with firms in the manufacturing and services sectors, the Indicator Survey data provides information on the constraints to private sector growth and is used to create statistically significant business environment indicators that are comparable across countries.
As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.
National
The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population, or the universe, covered in the Enterprise Surveys is the non-agricultural economy. It comprises: all manufacturing sectors according to the ISIC Revision 3.1 group classification (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this population definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities sectors.
Sample survey data [ssd]
The sample for the 2009-2017 Liberia Enterprise Survey (ES) was selected using stratified random sampling, following the methodology explained in the Sampling Note. Stratified random was preferred over simple random sampling for several reasons: - To obtain unbiased estimates for different subdivisions of the population with some known level of precision. - To obtain unbiased estimates for the whole population. The whole population, or universe of the study, is the non-agricultural economy. It comprises: all manufacturing sectors according to the group classification of ISIC Revision 3.1: (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except subsector 72, IT, which was added to the population under study), and all public or utilities sectors.
The cost per observation in the survey may be reduced by stratification of the population elements into convenient groupings.
Three levels of stratification were used in this country: industry, establishment size, and region. Industry stratification was designed as follows: the universe was stratified as into manufacturing and services industries. Manufacturing (ISIC Rev. 3.1 codes 15 - 37), and Services (ISIC codes 45, 50-52, 55, 60-64, and 72).
For the Liberia ES, size stratification was defined as follows: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).
Regional stratification for the Liberia ES was done across three regions: Montserrado, Margibi, and Nimba.
Face-to-face [f2f]
The current survey instruments are available: - Services and Manufacturing Questionnaire - Screener Questionnaire.
The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs/labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures. Over 90% of the questions objectively ascertain characteristics of a country's business environment. The remaining questions assess the survey respondents' opinions on what are the obstacles to firm growth and performance.
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
There was a high response rate especially as a result of positive attitude towards the international community in collaboration with the government in their reconstruction efforts after a period of civil strife.There was also very positive attitude towards World Bank initiatives.
The 2016 Integrated Household Panel Survey (IHPS) was launched in April 2016 as part of the Malawi Fourth Integrated Household Survey fieldwork operation. The IHPS 2016 targeted 1,989 households that were interviewed in the IHPS 2013 and that could be traced back to half of the 204 enumeration areas that were originally sampled as part of the Third Integrated Household Survey (IHS3) 2010/11. The 2019 IHPS was launched in April 2019 as part of the Malawi Fifth Integrated Household Survey fieldwork operations targeting the 2,508 households that were interviewed in 2016. The panel sample expanded each wave through the tracking of split-off individuals and the new households that they formed. Available as part of this project is the IHPS 2019 data, the IHPS 2016 data as well as the rereleased IHPS 2010 & 2013 data including only the subsample of 102 EAs with updated panel weights. Additionally, the IHPS 2016 was the first survey that received complementary financial and technical support from the Living Standards Measurement Study – Plus (LSMS+) initiative, which has been established with grants from the Umbrella Facility for Gender Equality Trust Fund, the World Bank Trust Fund for Statistical Capacity Building, and the International Fund for Agricultural Development, and is implemented by the World Bank Living Standards Measurement Study (LSMS) team, in collaboration with the World Bank Gender Group and partner national statistical offices. The LSMS+ aims to improve the availability and quality of individual-disaggregated household survey data, and is, at start, a direct response to the World Bank IDA18 commitment to support 6 IDA countries in collecting intra-household, sex-disaggregated household survey data on 1) ownership of and rights to selected physical and financial assets, 2) work and employment, and 3) entrepreneurship – following international best practices in questionnaire design and minimizing the use of proxy respondents while collecting personal information. This dataset is included here.
National coverage
The IHPS 2016 and 2019 attempted to track all IHPS 2013 households stemming from 102 of the original 204 baseline panel enumeration areas as well as individuals that moved away from the 2013 dwellings between 2013 and 2016 as long as they were neither servants nor guests at the time of the IHPS 2013; were projected to be at least 12 years of age and were known to be residing in mainland Malawi but excluding those in Likoma Island and in institutions, including prisons, police compounds, and army barracks.
Sample survey data [ssd]
A sub-sample of IHS3 2010 sample enumeration areas (EAs) (i.e. 204 EAs out of 768 EAs) was selected prior to the start of the IHS3 field work with the intention to (i) to track and resurvey these households in 2013 in accordance with the IHS3 fieldwork timeline and as part of the Integrated Household Panel Survey (IHPS 2013) and (ii) visit a total of 3,246 households in these EAs twice to reduce recall associated with different aspects of agricultural data collection. At baseline, the IHPS sample was selected to be representative at the national, regional, urban/rural levels and for each of the following 6 strata: (i) Northern Region - Rural, (ii) Northern Region - Urban, (iii) Central Region - Rural, (iv) Central Region - Urban, (v) Southern Region - Rural, and (vi) Southern Region - Urban. The IHPS 2013 main fieldwork took place during the period of April-October 2013, with residual tracking operations in November-December 2013.
Given budget and resource constraints, for the IHPS 2016 the number of sample EAs in the panel was reduced to 102 out of the 204 EAs. As a result, the domains of analysis are limited to the national, urban and rural areas. Although the results of the IHPS 2016 cannot be tabulated by region, the stratification of the IHPS by region, urban and rural strata was maintained. The IHPS 2019 tracked all individuals 12 years or older from the 2016 households.
Computer Assisted Personal Interview [capi]
Data Entry Platform To ensure data quality and timely availability of data, the IHPS 2019 was implemented using the World Bank’s Survey Solutions CAPI software. To carry out IHPS 2019, 1 laptop computer and a wireless internet router were assigned to each team supervisor, and each enumerator had an 8–inch GPS-enabled Lenovo tablet computer that the NSO provided. The use of Survey Solutions allowed for the real-time availability of data as the completed data was completed, approved by the Supervisor and synced to the Headquarters server as frequently as possible. While administering the first module of the questionnaire the enumerator(s) also used their tablets to record the GPS coordinates of the dwelling units. Geo-referenced household locations from that tablet complemented the GPS measurements taken by the Garmin eTrex 30 handheld devices and these were linked with publically available geospatial databases to enable the inclusion of a number of geospatial variables - extensive measures of distance (i.e. distance to the nearest market), climatology, soil and terrain, and other environmental factors - in the analysis.
Data Management The IHPS 2019 Survey Solutions CAPI based data entry application was designed to stream-line the data collection process from the field. IHPS 2019 Interviews were mainly collected in “sample” mode (assignments generated from headquarters) and a few in “census” mode (new interviews created by interviewers from a template) for the NSO to have more control over the sample. This hybrid approach was necessary to aid the tracking operations whereby an enumerator could quickly create a tracking assignment considering that they were mostly working in areas with poor network connection and hence could not quickly receive tracking cases from Headquarters.
The range and consistency checks built into the application was informed by the LSMS-ISA experience with the IHS3 2010/11, IHPS 2013 and IHPS 2016. Prior programming of the data entry application allowed for a wide variety of range and consistency checks to be conducted and reported and potential issues investigated and corrected before closing the assigned enumeration area. Headquarters (the NSO management) assigned work to the supervisors based on their regions of coverage. The supervisors then made assignments to the enumerators linked to their supervisor account. The work assignments and syncing of completed interviews took place through a Wi-Fi connection to the IHPS 2019 server. Because the data was available in real time it was monitored closely throughout the entire data collection period and upon receipt of the data at headquarters, data was exported to Stata for other consistency checks, data cleaning, and analysis.
Data Cleaning The data cleaning process was done in several stages over the course of fieldwork and through preliminary analysis. The first stage of data cleaning was conducted in the field by the field-based field teams utilizing error messages generated by the Survey Solutions application when a response did not fit the rules for a particular question. For questions that flagged an error, the enumerators were expected to record a comment within the questionnaire to explain to their supervisor the reason for the error and confirming that they double checked the response with the respondent. The supervisors were expected to sync the enumerator tablets as frequently as possible to avoid having many questionnaires on the tablet, and to enable daily checks of questionnaires. Some supervisors preferred to review completed interviews on the tablets so they would review prior to syncing but still record the notes in the supervisor account and reject questionnaires accordingly. The second stage of data cleaning was also done in the field, and this resulted from the additional error reports generated in Stata, which were in turn sent to the field teams via email or DropBox. The field supervisors collected reports for their assignments and in coordination with the enumerators reviewed, investigated, and collected errors. Due to the quick turn-around in error reporting, it was possible to conduct call-backs while the team was still operating in the EA when required. Corrections to the data were entered in the rejected questionnaires and sent back to headquarters.
The data cleaning process was done in several stages over the course of the fieldwork and through preliminary analyses. The first stage was during the interview itself. Because CAPI software was used, as enumerators asked the questions and recorded information, error messages were provided immediately when the information recorded did not match previously defined rules for that variable. For example, if the education level for a 12 year old respondent was given as post graduate. The second stage occurred during the review of the questionnaire by the Field Supervisor. The Survey Solutions software allows errors to remain in the data if the enumerator does not make a correction. The enumerator can write a comment to explain why the data appears to be incorrect. For example, if the previously mentioned 12 year old was, in fact, a genius who had completed graduate studies. The next stage occurred when the data were transferred to headquarters where the NSO staff would again review the data for errors and verify the comments from the
The documentation covers Enterprise Survey panel datasets that were collected in Slovenia in 2009, 2013 and 2019.
The Slovenia ES 2009 was conducted between 2008 and 2009. The Slovenia ES 2013 was conducted between March 2013 and September 2013. Finally, the Slovenia ES 2019 was conducted between December 2018 and November 2019. The objective of the Enterprise Survey is to gain an understanding of what firms experience in the private sector.
As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.
National
The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must take its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
As it is standard for the ES, the Slovenia ES was based on the following size stratification: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).
Sample survey data [ssd]
The sample for Slovenia ES 2009, 2013, 2019 were selected using stratified random sampling, following the methodology explained in the Sampling Manual for Slovenia 2009 ES and for Slovenia 2013 ES, and in the Sampling Note for 2019 Slovenia ES.
Three levels of stratification were used in this country: industry, establishment size, and oblast (region). The original sample designs with specific information of the industries and regions chosen are included in the attached Excel file (Sampling Report.xls.) for Slovenia 2009 ES. For Slovenia 2013 and 2019 ES, specific information of the industries and regions chosen is described in the "The Slovenia 2013 Enterprise Surveys Data Set" and "The Slovenia 2019 Enterprise Surveys Data Set" reports respectively, Appendix E.
For the Slovenia 2009 ES, industry stratification was designed in the way that follows: the universe was stratified into manufacturing industries, services industries, and one residual (core) sector as defined in the sampling manual. Each industry had a target of 90 interviews. For the manufacturing industries sample sizes were inflated by about 17% to account for potential non-response cases when requesting sensitive financial data and also because of likely attrition in future surveys that would affect the construction of a panel. For the other industries (residuals) sample sizes were inflated by about 12% to account for under sampling in firms in service industries.
For Slovenia 2013 ES, industry stratification was designed in the way that follows: the universe was stratified into one manufacturing industry, and two service industries (retail, and other services).
Finally, for Slovenia 2019 ES, three levels of stratification were used in this country: industry, establishment size, and region. The original sample design with specific information of the industries and regions chosen is described in "The Slovenia 2019 Enterprise Surveys Data Set" report, Appendix C. Industry stratification was done as follows: Manufacturing – combining all the relevant activities (ISIC Rev. 4.0 codes 10-33), Retail (ISIC 47), and Other Services (ISIC 41-43, 45, 46, 49-53, 55, 56, 58, 61, 62, 79, 95).
For Slovenia 2009 and 2013 ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. This seems to be an appropriate definition of the labor force since seasonal/casual/part-time employment is not a common practice, except in the sectors of construction and agriculture.
For Slovenia 2009 ES, regional stratification was defined in 2 regions. These regions are Vzhodna Slovenija and Zahodna Slovenija. The Slovenia sample contains panel data. The wave 1 panel “Investment Climate Private Enterprise Survey implemented in Slovenia” consisted of 223 establishments interviewed in 2005. A total of 57 establishments have been re-interviewed in the 2008 Business Environment and Enterprise Performance Survey.
For Slovenia 2013 ES, regional stratification was defined in 2 regions (city and the surrounding business area) throughout Slovenia.
Finally, for Slovenia 2019 ES, regional stratification was done across two regions: Eastern Slovenia (NUTS code SI03) and Western Slovenia (SI04).
Computer Assisted Personal Interview [capi]
Questionnaires have common questions (core module) and respectfully additional manufacturing- and services-specific questions. The eligible manufacturing industries have been surveyed using the Manufacturing questionnaire (includes the core module, plus manufacturing specific questions). Retail firms have been interviewed using the Services questionnaire (includes the core module plus retail specific questions) and the residual eligible services have been covered using the Services questionnaire (includes the core module). Each variation of the questionnaire is identified by the index variable, a0.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond as (-8). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary. However, there were clear cases of low response.
For 2009 and 2013 Slovenia ES, the survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Up to 4 attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.
For 2009, the number of contacted establishments per realized interview was 6.18. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The relatively low ratio of contacted establishments per realized interview (6.18) suggests that the main source of error in estimates in the Slovenia may be selection bias and not frame inaccuracy.
For 2013, the number of realized interviews per contacted establishment was 25%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The number of rejections per contact was 44%.
Finally, for 2019, the number of interviews per contacted establishments was 9.7%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 75.2%.