The documentation covers Enterprise Survey panel datasets that were collected in Slovenia in 2009, 2013 and 2019.
The Slovenia ES 2009 was conducted between 2008 and 2009. The Slovenia ES 2013 was conducted between March 2013 and September 2013. Finally, the Slovenia ES 2019 was conducted between December 2018 and November 2019. The objective of the Enterprise Survey is to gain an understanding of what firms experience in the private sector.
As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.
National
The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must take its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
As is standard for the ES, the Slovenia ES was based on the following size stratification: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).
Sample survey data [ssd]
The samples for Slovenia ES 2009, 2013, and 2019 were selected using stratified random sampling, following the methodology explained in the Sampling Manual for Slovenia 2009 ES and for Slovenia 2013 ES, and in the Sampling Note for 2019 Slovenia ES.
Three levels of stratification were used in this country: industry, establishment size, and oblast (region). The original sample designs with specific information of the industries and regions chosen are included in the attached Excel file (Sampling Report.xls) for Slovenia 2009 ES. For Slovenia 2013 and 2019 ES, specific information of the industries and regions chosen is described in Appendix E of the reports "The Slovenia 2013 Enterprise Surveys Data Set" and "The Slovenia 2019 Enterprise Surveys Data Set", respectively.
For the Slovenia 2009 ES, industry stratification was designed in the way that follows: the universe was stratified into manufacturing industries, services industries, and one residual (core) sector as defined in the sampling manual. Each industry had a target of 90 interviews. For the manufacturing industries sample sizes were inflated by about 17% to account for potential non-response cases when requesting sensitive financial data and also because of likely attrition in future surveys that would affect the construction of a panel. For the other industries (residuals) sample sizes were inflated by about 12% to account for under sampling in firms in service industries.
For Slovenia 2013 ES, industry stratification was designed in the way that follows: the universe was stratified into one manufacturing industry, and two service industries (retail, and other services).
Finally, for Slovenia 2019 ES, three levels of stratification were used in this country: industry, establishment size, and region. The original sample design with specific information of the industries and regions chosen is described in "The Slovenia 2019 Enterprise Surveys Data Set" report, Appendix C. Industry stratification was done as follows: Manufacturing – combining all the relevant activities (ISIC Rev. 4.0 codes 10-33), Retail (ISIC 47), and Other Services (ISIC 41-43, 45, 46, 49-53, 55, 56, 58, 61, 62, 79, 95).
For Slovenia 2009 and 2013 ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. This seems to be an appropriate definition of the labor force since seasonal/casual/part-time employment is not a common practice, except in the sectors of construction and agriculture.
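The size strata above can be expressed as a small lookup. The following is a minimal sketch, not part of the survey documentation: the function name and the "out of scope" label for establishments with fewer than 5 workers are assumptions made for illustration.

```python
def size_stratum(permanent_full_time_workers: int) -> str:
    """Map a count of permanent full-time workers to an ES size stratum."""
    n = permanent_full_time_workers
    if n < 5:
        # Assumption: establishments below the smallest stratum are not covered.
        return "out of scope"
    if n <= 19:
        return "small"    # 5 to 19 employees
    if n <= 99:
        return "medium"   # 20 to 99 employees
    return "large"        # 100 or more employees


for workers in (4, 5, 19, 20, 99, 100):
    print(workers, size_stratum(workers))
```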
For Slovenia 2009 ES, regional stratification was defined in 2 regions. These regions are Vzhodna Slovenija and Zahodna Slovenija. The Slovenia sample contains panel data. The wave 1 panel “Investment Climate Private Enterprise Survey implemented in Slovenia” consisted of 223 establishments interviewed in 2005. A total of 57 establishments have been re-interviewed in the 2008 Business Environment and Enterprise Performance Survey.
For Slovenia 2013 ES, regional stratification was defined in 2 regions (city and the surrounding business area) throughout Slovenia.
Finally, for Slovenia 2019 ES, regional stratification was done across two regions: Eastern Slovenia (NUTS code SI03) and Western Slovenia (SI04).
Computer Assisted Personal Interview [capi]
Questionnaires have common questions (core module) and, respectively, additional manufacturing- and services-specific questions. The eligible manufacturing industries have been surveyed using the Manufacturing questionnaire (includes the core module, plus manufacturing specific questions). Retail firms have been interviewed using the Services questionnaire (includes the core module plus retail specific questions) and the residual eligible services have been covered using the Services questionnaire (includes the core module). Each variation of the questionnaire is identified by the index variable, a0.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond as (-8). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary. However, there were clear cases of low response.
For 2009 and 2013 Slovenia ES, the survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Up to 4 attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.
For 2009, the number of contacted establishments per realized interview was 6.18. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The relatively low ratio of contacted establishments per realized interview (6.18) suggests that the main source of error in estimates for Slovenia may be selection bias and not frame inaccuracy.
For 2013, the number of realized interviews per contacted establishment was 25%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The number of rejections per contact was 44%.
Finally, for 2019, the number of interviews per contacted establishments was 9.7%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 75.2%.
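The 2009 figure is reported as contacts per interview, while the 2013 and 2019 figures are reported as interviews per contact. Putting them on a common scale makes the three waves easier to compare. The sketch below only takes reciprocals of the figures quoted above; the dictionary layout and function name are illustrative assumptions.

```python
# Figures quoted in the text, tagged with the unit in which they were reported.
REPORTED = {
    2009: ("contacts_per_interview", 6.18),
    2013: ("interviews_per_contact", 0.25),
    2019: ("interviews_per_contact", 0.097),
}


def interview_yield(kind: str, value: float) -> float:
    """Return realized interviews per contacted establishment."""
    if kind == "contacts_per_interview":
        return 1.0 / value
    if kind == "interviews_per_contact":
        return value
    raise ValueError(f"unknown unit: {kind}")


yields = {year: interview_yield(kind, value) for year, (kind, value) in REPORTED.items()}

for year, share in sorted(yields.items()):
    print(f"{year}: {share:.1%} of contacts interviewed, "
          f"{1 / share:.2f} contacts per interview")
```

On this common scale the 2009 ratio of 6.18 corresponds to roughly 16% of contacted establishments being interviewed.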
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These data include the individual responses for the City of Tempe Annual Community Survey conducted by ETC Institute. This dataset has two layers and includes both the weighted data and unweighted data. Weighting data is a statistical method in which datasets are adjusted through calculations in order to more accurately represent the population being studied. The weighted data are used in the final published PDF report.
These data help determine priorities for the community as part of the City's on-going strategic planning process. Averaged Community Survey results are used as indicators for several city performance measures. The summary data for each performance measure is provided as an open dataset for that measure (separate from this dataset). The performance measures with indicators from the survey include the following (as of 2023):
1. Safe and Secure Communities
1.04 Fire Services Satisfaction
1.06 Crime Reporting
1.07 Police Services Satisfaction
1.09 Victim of Crime
1.10 Worry About Being a Victim
1.11 Feeling Safe in City Facilities
1.23 Feeling of Safety in Parks
2. Strong Community Connections
2.02 Customer Service Satisfaction
2.04 City Website Satisfaction
2.05 Online Services Satisfaction Rate
2.15 Feeling Invited to Participate in City Decisions
2.21 Satisfaction with Availability of City Information
3. Quality of Life
3.16 City Recreation, Arts, and Cultural Centers
3.17 Community Services Programs
3.19 Value of Special Events
3.23 Right of Way Landscape Maintenance
3.36 Quality of City Services
4. Sustainable Growth & Development
No Performance Measures in this category presently relate directly to the Community Survey
5. Financial Stability & Vitality
No Performance Measures in this category presently relate directly to the Community Survey
Methods:
The survey is mailed to a random sample of households in the City of Tempe. Follow up emails and texts are also sent to encourage participation. A link to the survey is provided with each communication.
To prevent people who do not live in Tempe or who were not selected as part of the random sample from completing the survey, everyone who completed the survey was required to provide their address. These addresses were then matched to those used for the random representative sample. If the respondent’s address did not match, the response was not used. To better understand how services are being delivered across the city, individual results were mapped to determine overall distribution across the city. Additionally, demographic data were used to monitor the distribution of responses to ensure the responding population of each survey is representative of city population.
Processing and Limitations:
The location data in this dataset is generalized to the block level to protect privacy. This means that only the first two digits of an address are used to map the location. When the data are shared with the city, only the latitude/longitude of the block level address points are provided. This results in points that overlap. In order to better visualize the data, overlapping points were randomly dispersed to remove overlap. Together, these two adjustments ensure that the points are not related to a specific address, but are still close enough to allow insights about service delivery in different areas of the city.
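The random dispersal of overlapping points can be illustrated with a short sketch. This is not the procedure actually used to prepare the dataset: the function name, the maximum offset of 0.0005 degrees, and the fixed seed are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def disperse_overlaps(points, max_offset=0.0005, seed=0):
    """Add a small random offset to every point that shares its exact
    (latitude, longitude) with at least one other point."""
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    counts = Counter(points)
    dispersed = []
    for lat, lon in points:
        if counts[(lat, lon)] > 1:
            # Hypothetical offset size; the real dispersal distance is not documented.
            lat += rng.uniform(-max_offset, max_offset)
            lon += rng.uniform(-max_offset, max_offset)
        dispersed.append((lat, lon))
    return dispersed


# Three responses generalized to the same block point, plus one unique point.
block_points = [(33.4, -111.9), (33.4, -111.9), (33.4, -111.9), (33.5, -111.8)]
out = disperse_overlaps(block_points)
print(out)
```

Points that do not overlap are left where they are, so only the stacked responses move.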
The weighted data are used by the ETC Institute in the final published PDF report.
The 2023 Annual Community Survey report is available on data.tempe.gov or by visiting https://www.tempe.gov/government/strategic-management-and-innovation/signature-surveys-research-and-data
The individual survey questions as well as the definition of the response scale (for example, 1 means “very dissatisfied” and 5 means “very satisfied”) are provided in the data dictionary.
Additional Information
Source: Community Attitude Survey
Contact (author): Adam Samuels
Contact E-Mail (author): Adam_Samuels@tempe.gov
Contact (maintainer):
Contact E-Mail (maintainer):
Data Source Type: Excel table
Preparation Method: Data received from vendor after report is completed
Publish Frequency: Annual
Publish Method: Manual
Data Dictionary
NaiveBayes_R.xlsx: This Excel file includes information as to how probabilities of observed features are calculated given recidivism (P(x_ij│R)) in the training data. Each cell is embedded with an Excel function to render appropriate figures.
- P(Xi|R): This tab contains probabilities of feature attributes among recidivated offenders.
- NIJ_Recoded: This tab contains re-coded NIJ recidivism challenge data following our coding schema described in Table 1.
- Recidivated_Train: This tab contains re-coded features of recidivated offenders.
- Tabs from [Gender] through [Condition_Other]: Each tab contains probabilities of feature attributes given recidivism. We use these conditional probabilities to replace the raw values of each feature in P(Xi|R) tab.
NaiveBayes_NR.xlsx: This Excel file includes information as to how probabilities of observed features are calculated given non-recidivism (P(x_ij│N)) in the training data. Each cell is embedded with an Excel function to render appropriate figures.
- P(Xi|N): This tab contains probabilities of feature attributes among non-recidivated offenders.
- NIJ_Recoded: This tab contains re-coded NIJ recidivism challenge data following our coding schema described in Table 1.
- NonRecidivated_Train: This tab contains re-coded features of non-recidivated offenders.
- Tabs from [Gender] through [Condition_Other]: Each tab contains probabilities of feature attributes given non-recidivism. We use these conditional probabilities to replace the raw values of each feature in P(Xi|N) tab.
Training_LnTransformed.xlsx: Figures in each cell are log-transformed ratios of probabilities in NaiveBayes_R.xlsx (P(Xi|R)) to the probabilities in NaiveBayes_NR.xlsx (P(Xi|N)).
TestData.xlsx: This Excel file includes the following tabs based on the test data: P(Xi|R), P(Xi|N), NIJ_Recoded, and Test_LnTransformed (log-transformed P(Xi|R)/ P(Xi|N)).
Training_LnTransformed.dta: We transform Training_LnTransformed.xlsx to a Stata data set. We use the Stat/Transfer 13 software package to transfer the file format.
StataLog.smcl: This file includes the results of the logistic regression analysis. Both estimated intercept and coefficient estimates in this Stata log correspond to the raw weights and standardized weights in Figure 1.
Brier Score_Re-Check.xlsx: This Excel file recalculates Brier scores of Relaxed Naïve Bayes Classifier in Table 3, showing evidence that results displayed in Table 3 are correct.
*****Full List*****
NaiveBayes_R.xlsx
NaiveBayes_NR.xlsx
Training_LnTransformed.xlsx
TestData.xlsx
Training_LnTransformed.dta
StataLog.smcl
Brier Score_Re-Check.xlsx
Data for Weka (Training Set): Bayes_2022_NoID
Data for Weka (Test Set): BayesTest_2022_NoID
Weka output for machine learning models (Conventional naïve Bayes, AdaBoost, Multilayer Perceptron, Logistic Regression, and Random Forest)
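The transformation described for the Excel files (replace each feature value by its conditional probability in each outcome group, then take the log of the ratio) and the Brier score re-check can be sketched in a few lines. This is an illustration on made-up toy rows, not the authors' workbook logic: the column name "recidivated", the toy values, and the function names are assumptions, and the sketch assumes every attribute value occurs in both outcome groups (no smoothing is applied).

```python
import math
from collections import Counter


def conditional_probs(rows, feature):
    """P(feature value | group) estimated as a relative frequency within rows."""
    counts = Counter(row[feature] for row in rows)
    return {value: count / len(rows) for value, count in counts.items()}


def log_ratio_features(rows, features, label="recidivated"):
    """Replace each raw value x by ln(P(x | R) / P(x | N))."""
    recid = [r for r in rows if r[label] == 1]
    non_recid = [r for r in rows if r[label] == 0]
    tables = {f: (conditional_probs(recid, f), conditional_probs(non_recid, f))
              for f in features}
    return [{f: math.log(tables[f][0][r[f]] / tables[f][1][r[f]]) for f in features}
            for r in rows]


def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(predicted_probs, outcomes)) / len(outcomes)


# Hypothetical toy training rows (not taken from the NIJ data).
toy = [
    {"Gender": "M", "recidivated": 1},
    {"Gender": "M", "recidivated": 1},
    {"Gender": "F", "recidivated": 1},
    {"Gender": "M", "recidivated": 0},
    {"Gender": "F", "recidivated": 0},
    {"Gender": "F", "recidivated": 0},
]
transformed = log_ratio_features(toy, ["Gender"])
print(transformed[0])
print(brier_score([0.8, 0.3], [1, 0]))
```

The log-ratio columns produced this way are what a logistic regression would then take as predictors, in the spirit of the Stata step described above.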
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
Breast cancer continues to be the most common malignancy and the leading cause of cancer-related deaths in Ethiopia. The poor prognosis and high mortality rate of breast cancer patients in the country are largely caused by late-stage diagnosis. Hence, understanding the epidemiology of late-stage diagnosis is essential to address this important problem. However, previous reports in Ethiopia indicated inconsistent findings. Therefore, this literature review was conducted to generate dependable evidence by summarizing the prevalence and determinants of late-stage diagnosis among breast cancer patients in Ethiopia.
Methods
Pertinent articles were retrieved by systematically searching major electronic databases and gray literature. Data were extracted into an Excel spreadsheet and analyzed using the STATA 17 statistical software. The pooled estimates were summarized using the random effect meta-analysis model. Heterogeneity and small study effect were evaluated using the I² statistic and Egger’s regression test in conjunction with the funnel plot, respectively. Meta-regression, sub-group analysis, and sensitivity analysis were also employed. Protocol registration number: CRD42024496237.
Results
The pooled prevalence of late-stage diagnosis after combining reports of 24 studies with 8,677 participants was 65.85% (95% CI: 58.38, 73.32). Residence (adjusted OR: 1.92; 95% CI: 1.45, 2.53), patient delay at their first presentation (adjusted OR: 2.65; 95% CI: 1.56, 4.49), traditional medicine use (adjusted OR: 2.54; 95% CI: 1.89, 3.41), and breast self-examination practice (adjusted OR: 0.28; 95% CI: 0.09, 0.88) were significant determinants of late-stage diagnosis.
Conclusion
Two-thirds of breast cancer patients in Ethiopia were diagnosed at an advanced stage. Residence, delay in the first presentation, traditional medicine use, and breast self-examination practice were significantly associated with late-stage diagnosis.
Public education about breast cancer and its early detection techniques is crucial to reduce mortality and improve the survival of patients. Besides, improving access to cancer screening services is useful to tackle the disease at its curable stages.
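For readers unfamiliar with random-effects pooling and the I² statistic mentioned in the Methods, the following is a minimal sketch. It assumes the DerSimonian-Laird estimator applied to untransformed proportions with binomial variances, which may differ from the options the authors used in STATA; the study counts are hypothetical, and the sketch assumes no study has a proportion of exactly 0 or 1.

```python
def random_effects_pool(events, totals):
    """DerSimonian-Laird pooled proportion, 95% CI, and I-squared (percent)."""
    p = [e / n for e, n in zip(events, totals)]
    v = [pi * (1 - pi) / n for pi, n in zip(p, totals)]   # binomial variances
    w = [1 / vi for vi in v]                              # fixed-effect weights
    fixed = sum(wi * pi for wi, pi in zip(w, p)) / sum(w)

    # Cochran's Q and the between-study variance tau^2.
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, p))
    df = len(p) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * pi for wi, pi in zip(w_re, p)) / sum(w_re)
    se = (1 / sum(w_re)) ** 0.5
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


# Hypothetical late-stage counts from three studies of 100 patients each.
pooled, ci, i2 = random_effects_pool([50, 70, 80], [100, 100, 100])
print(f"pooled = {pooled:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), I2 = {i2:.1f}%")
```

When the studies agree exactly, tau² and I² are both zero and the result reduces to the fixed-effect estimate.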
This data set presents sediment sample data consisting of biofacies abundance and sediment grain size and properties for samples collected in 2021 over the Vitória-Trindade Ridge during research cruise CORE-VTRCC on naval vessel Nho Cruzeiro do Sul. During the sampling program, nine seafloor sediment samples were collected within the Vitória-Trindade Ridge with a Van-Veen grab sampler (3600 cm²) on the top of the volcanic seamounts. Sample V7 collected only rhodoliths, without additional sediments. Grains larger than 40 mm in diameter (pebble size) were separated for rhodolith measurements. Sediment samples were washed for 48 hours to remove salt, then oven-dried at 45 °C for 72 hours. The rhodolith samples were dried at 35 °C for 48 hours. The morphometry of the rhodoliths was classified as spheroidal, discoidal, or ellipsoidal based on measurements of the long (L), intermediate (I), and short (S) axes with a Vernier caliper. Samples were weighed and sieved in phi fractions (-1.5, -1.0, -0.5, 0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0 and > 4.0) for 10 minutes. Sieved samples were described in phi fractions and grouped into the following classes: granules (-2 to -1), very coarse sand (-1 to 0), coarse sand (0 to 1), medium sand (1 to 2), fine sand (2 to 3), very fine sand (3 to 4) and silt (> 4). The mean grain size and sorting were analyzed with Gradistat v9.1 software. Sediment samples were also placed on a Petri dish for microscope analysis and photography using Fiji software. Grains were then classified (carbonate debris, foraminifers, bryozoans, sponge spicules, bivalves, gastropods, crustaceans, echinoderms, and annelida) and abundance was quantified based on 300 random point counts per sample. The data file is in Excel spreadsheet format. In the file names, SS = "Seafloor Samples". For the analysis, each seafloor sample was subdivided into ten subsamples (Q number).
Codes: V# - sample number (a _# suffix is added when there are multiple images for the same sample); Q# - number of the quartile sample (biofacies only). Funding for this work was provided through FAPESP awards 2016/24946-9 and 2020/08847-6, and through the Brazilian Navy program PROAMAZONIA AZUL.
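The grouping of sieve fractions into size classes can be sketched as follows. The phi scale is the standard definition phi = -log2(diameter in mm). Treating the upper bound of each class as inclusive (so that the -1.0 sieve fraction falls in granules) is an assumption about how boundary fractions were assigned, as are the function names and the label used for anything coarser than -2 phi.

```python
import math

# Class limits in phi units, as listed in the description.
PHI_CLASSES = [
    (-2, -1, "granules"),
    (-1, 0, "very coarse sand"),
    (0, 1, "coarse sand"),
    (1, 2, "medium sand"),
    (2, 3, "fine sand"),
    (3, 4, "very fine sand"),
]


def phi_from_mm(diameter_mm: float) -> float:
    """Convert a grain diameter in millimetres to the phi scale."""
    return -math.log2(diameter_mm)


def size_class(phi: float) -> str:
    """Assign a sieve fraction (phi of the retaining sieve) to a size class."""
    if phi > 4:
        return "silt"
    for lower, upper, name in PHI_CLASSES:
        if lower < phi <= upper:   # assumption: upper bound inclusive
            return name
    return "coarser than granules"  # hypothetical label, outside the listed classes


for fraction in (-1.5, -1.0, 0.5, 4.0, 4.5):
    print(fraction, size_class(fraction))
```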
The harmonized data set on health, created and published by the ERF, is a subset of Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules, collected in the context of the above mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.
----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:
Iraq is considered a leader in household expenditure and income surveys: the first was conducted in 1946, followed by surveys in 1954 and 1961. After the establishment of the Central Statistical Organization, household expenditure and income surveys were carried out every 3-5 years (1971/1972, 1976, 1979, 1984/1985, 1988, 1993, 2002/2007). As part of the cooperation between the CSO and the WB, the Central Statistical Organization (CSO) and the Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year covering all governorates including those in Kurdistan Region.
The survey has six main objectives. These objectives are:
The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum, to create a comparable version with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf provided in the external resources for further information on the mapping of the original variables on the harmonized ones, in addition to more indications on the variables' availability in both survey years and relevant comments.
National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.
1- Household/family. 2- Individual/person.
The survey was carried out over a full year covering all governorates including those in Kurdistan Region.
Sample survey data [ssd]
----> Design:
The sample size was 25488 households for the whole of Iraq: 216 households in each of the 118 districts, organized in 2832 clusters of 9 households each, distributed across districts and governorates for rural and urban areas.
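The design figures are mutually consistent, which can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in this description; the variable names are illustrative.

```python
DISTRICTS = 118
HOUSEHOLDS_PER_DISTRICT = 216
HOUSEHOLDS_PER_CLUSTER = 9

# Clusters (sample points) needed in each district to reach 216 households.
clusters_per_district = HOUSEHOLDS_PER_DISTRICT // HOUSEHOLDS_PER_CLUSTER
total_clusters = DISTRICTS * clusters_per_district
total_households = total_clusters * HOUSEHOLDS_PER_CLUSTER

print(clusters_per_district, total_clusters, total_households)
```

The 24 clusters per district obtained here match the 24 sample points per stratum mentioned under the sampling stages.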
----> Sample frame:
The listing and numbering results of the 2009-2010 Population and Housing Survey were adopted in all the governorates, including Kurdistan Region, as the frame for selecting households. The sample was selected in two stages. Stage 1: primary sampling units (blocks) within each stratum (district), for urban and rural areas, were systematically selected with probability proportional to size, reaching 2832 units (clusters). Stage 2: 9 households were selected from each primary sampling unit to form a cluster, so the total sample size was 25488 households distributed across the governorates, with 216 households in each district.
----> Sampling Stages:
In each district, the sample was selected in two stages. Stage 1: based on the 2010 listing and numbering frame, 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to the implicit breakdown by urban and rural areas and the geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: using households as secondary sampling units, 9 households were selected from each sample point using systematic equal probability sampling. Sampling frames for each stage can be developed from the 2010 building listing and numbering without updating household lists. In some small districts, random selection of primary sampling units may yield fewer than 24 distinct units, in which case a sampling unit is selected more than once; when necessary, two or more clusters may be drawn from the same enumeration unit.
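The two selection stages can be sketched generically. This is an illustration of systematic probability-proportional-to-size selection followed by systematic equal-probability selection, not the survey's actual selection program: the function names, the block sizes, and the fixed seed are assumptions.

```python
import random


def systematic_pps(sizes, n_select, seed=0):
    """Stage 1: pick n_select unit indices with probability proportional to size.
    A unit larger than the sampling step can be picked more than once."""
    step = sum(sizes) / n_select
    start = random.Random(seed).random() * step    # random start in [0, step)
    picks, cumulative, idx = [], 0.0, 0
    for k in range(n_select):
        target = start + k * step
        # Advance until the cumulative size interval of unit idx contains target.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        picks.append(idx)
    return picks


def systematic_equal(n_units, n_select, seed=0):
    """Stage 2: pick n_select distinct positions from a list of n_units."""
    step = n_units / n_select
    start = random.Random(seed).random() * step
    return [int(start + k * step) for k in range(n_select)]


# Hypothetical district with 40 blocks of varying household counts.
block_sizes = [30 + (7 * i) % 50 for i in range(40)]
blocks = systematic_pps(block_sizes, n_select=24)
households = systematic_equal(n_units=block_sizes[blocks[0]], n_select=9)
print(blocks)
print(households)
```

Repeated indices in the first-stage result correspond to the case described above, where one enumeration unit contributes two or more clusters.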
Face-to-face [f2f]
----> Preparation:
The questionnaire of the 2006 survey was adopted as the basis for the questionnaire of the 2012 survey, to which many revisions were made. Two rounds of pre-tests were carried out. Revisions were made based on the feedback of the fieldwork team, World Bank consultants, and others, and further revisions were made before the version implemented in the pilot survey in September 2011. After the pilot survey, additional revisions were made based on the challenges and feedback that emerged during its implementation, leading to the final version used in the actual survey.
----> Questionnaire Parts:
The questionnaire consists of four parts each with several sections:
Part 1: Socio – Economic Data:
- Section 1: Household Roster
- Section 2: Emigration
- Section 3: Food Rations
- Section 4: housing
- Section 5: education
- Section 6: health
- Section 7: Physical measurements
- Section 8: job seeking and previous job
Part 2: Monthly, Quarterly and Annual Expenditures:
- Section 9: Expenditures on Non – Food Commodities and Services (past 30 days).
- Section 10: Expenditures on Non – Food Commodities and Services (past 90 days).
- Section 11: Expenditures on Non – Food Commodities and Services (past 12 months).
- Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days).
- Section 12, Table 1: Meals Had Within the Residential Unit.
- Section 12, table 2: Number of Persons Participate in the Meals within Household Expenditure Other Than its Members.
Part 3: Income and Other Data:
- Section 13: Job
- Section 14: paid jobs
- Section 15: Agriculture, forestry and fishing
- Section 16: Household non – agricultural projects
- Section 17: Income from ownership and transfers
- Section 18: Durable goods
- Section 19: Loans, advances and subsidies
- Section 20: Shocks and strategy of dealing in the households
- Section 21: Time use
- Section 22: Justice
- Section 23: Satisfaction in life
- Section 24: Food consumption during past 7 days
Part 4: Diary of Daily Expenditures: The diary of expenditure is an essential component of this survey. It is left with the household to record all daily purchases, such as expenditures on food and frequent non-food items (gasoline, newspapers, etc.), over 7 days. Two pages were allocated for recording the expenditures of each day, so the diary consists of 14 pages.
----> Raw Data:
Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages:
1. Interviewer: Checks all answers on the household questionnaire, confirming that they are clear and correct.
2. Local Supervisor: Checks to make sure that questions have been correctly completed.
3. Statistical analysis: After exporting data files from Excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or non-logical values in addition to auditing some variables.
4. World Bank consultants in coordination with the CSO data management team: the World Bank technical consultants use additional programs in SPSS and STAT to examine and correct remaining inconsistencies within the data files. The software detects errors by analyzing questionnaire items according to the expected parameter for each variable.
----> Harmonized Data:
The Iraq Household Socio Economic Survey (IHSES) reached a total of 25488 households. A total of 305 households refused to respond, and the response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%), while the lowest rate was in Sulaimaniya (92%).
The World Bank, in collaboration with the Kenya National Bureau of Statistics and the University of California, Berkeley, is conducting the Kenya COVID-19 Rapid Response Phone Survey to track the socioeconomic impacts of the COVID-19 pandemic, the recovery from it, and other shocks, in order to provide timely data to inform policy. This dataset contains information from eight waves of the COVID-19 RRPS, which is part of a panel survey that targets Kenyan nationals and started in May 2020. The same households were interviewed every two months for five survey rounds in the first year of data collection and every four months thereafter, with interviews conducted using Computer Assisted Telephone Interviewing (CATI) techniques.
The data set contains information from two samples of Kenyan households. The first sample is a randomly drawn subset of all households that were part of the 2015/16 Kenya Integrated Household Budget Survey (KIHBS) Computer-Assisted Personal Interviewing (CAPI) pilot and provided a phone number. The second was obtained through the Random Digit Dialing method, by which active phone numbers created from the 2020 Numbering Frame produced by the Kenya Communications Authority are randomly selected. The samples cover urban and rural areas and are designed to be representative of the population of Kenya using cell phones. Waves 1-7 of this survey include information on household background, service access, employment, food security, income loss, transfers, health, and COVID-19 knowledge and vaccinations. Wave 8 focused on how households were exposed to shocks, in particular adverse weather shocks and the increase in the price of food and fuel, but also included parts of the previous modules on household background, service access, employment, food security, income loss, and subjective wellbeing.
The data are uploaded in three files. The first is the hh file, which contains household level information. The ‘hhid’ uniquely identifies each household. The second is the adult level file, which contains data at the level of adult household members. Each adult in a household is uniquely identified by the ‘adult_id’. The third file is the child level file, available only for waves 3-7, which contains information for every child in the household. Each child in a household is uniquely identified by the ‘child_id’.
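Because ‘adult_id’ and ‘child_id’ identify members within a household, linking the member-level files to the household file relies on ‘hhid’, and a member is identified by the pair of ‘hhid’ and the member identifier. The sketch below illustrates this with made-up rows; the columns "urban" and "employed" and the function names are hypothetical, and only ‘hhid’ and ‘adult_id’ come from the description.

```python
# Hypothetical rows standing in for the hh file and the adult level file.
hh_rows = [
    {"hhid": 101, "urban": 1},
    {"hhid": 102, "urban": 0},
]
adult_rows = [
    {"hhid": 101, "adult_id": 1, "employed": 1},
    {"hhid": 101, "adult_id": 2, "employed": 0},
    {"hhid": 102, "adult_id": 1, "employed": 1},
]


def is_unique(rows, *keys):
    """Check that the combination of key columns identifies each row."""
    seen = set()
    for row in rows:
        key = tuple(row[k] for k in keys)
        if key in seen:
            return False
        seen.add(key)
    return True


def attach_household(members, households):
    """Add household-level columns to every member row via hhid."""
    by_hhid = {h["hhid"]: h for h in households}
    return [{**by_hhid[m["hhid"]], **m} for m in members]


merged = attach_household(adult_rows, hh_rows)
print(is_unique(adult_rows, "hhid", "adult_id"), is_unique(adult_rows, "adult_id"))
print(merged)
```

The same pattern would apply to the child level file with ‘child_id’ in place of ‘adult_id’.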
The duration of data collection and sample size for each completed wave was:
Wave 1: May 14 to July 7, 2020; 4,061 Kenyan households
Wave 2: July 16 to September 18, 2020; 4,492 Kenyan households
Wave 3: September 28 to December 2, 2020; 4,979 Kenyan households
Wave 4: January 15 to March 25, 2021; 4,892 Kenyan households
Wave 5: March 29 to June 13, 2021; 5,854 Kenyan households
Wave 6: July 14 to November 3, 2021; 5,765 Kenyan households
Wave 7: November 15, 2021, to March 31, 2022; 5,633 Kenyan households
Wave 8: May 31 to July 8, 2022; 4,550 Kenyan households
The same questionnaire is also administered to refugees in Kenya, with the data available in the UNHCR microdata library: https://microdata.unhcr.org/index.php/catalog/296/
National coverage covering rural and urban areas
Household, Individual
The COVID-19 RRPS with Kenyan households has two samples. The first sample consists of households that were part of the 2015/16 KIHBS CAPI pilot and provided a phone number. The 2015/16 KIHBS CAPI pilot is representative at the national level stratified by county and place of residence (urban and rural areas). At least one valid phone number was obtained for 9,007 households and all of them were included in the COVID-19 RRPS sample. The target respondent was the primary male or female household member from the 2015/16 KIHBS CAPI pilot. The second sample consists of households selected using the Random Digit Dialing method. A list of random mobile phone numbers was created using a random number generator from the 2020 Numbering Frame produced by the Kenya Communications Authority. The initial sampling frame therefore consisted of 92,999,970 randomly ordered phone numbers assigned to three networks: Safaricom, Airtel and Telkom. An introductory text message was sent to 5,000 randomly selected numbers to determine if numbers were in operation. Out of these, 4,075 were found to be active and formed the final sampling frame. There was no stratification, and individuals who were called were asked about the households they live in. Until wave 7, sampled households that were not reached in earlier waves were also contacted along with households that were interviewed before. In wave 8, only households that had previously participated in the survey were contacted for interview. The “wave” variable indicates the wave in which the households were interviewed.
Computer Assisted Personal Interview [capi]
The questionnaire was administered in English and is provided as a resource in pdf format. Additionally, questionnaires for each wave are also provided in Excel format coded for SCTO. The same questionnaire is also administered to refugees in Kenya, with the data available in the UNHCR microdata library: https://microdata.unhcr.org/index.php/catalog/296/
These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed. This study reports the findings of a randomized controlled trial (RCT) involving more than 400 police officers and the use of body-worn cameras (BWC) in the Las Vegas Metropolitan Police Department (LVMPD). Officers were surveyed before and after the trial, and a random sample was interviewed to assess their level of comfort with technology, perceptions of self, civilians, other officers, and the use of BWCs. Information was gathered during ride-alongs with BWC officers and from a review of BWC videos. The collection includes 2 SPSS data files, 4 Excel data files, and 2 files containing aggregated treatment groups and rank-and-treatment groups, in Stata, Excel, and CSV format:
SPSS: officer-survey---pretest.sav (n=422; 30 variables)
SPSS: officer-survey---posttest2.sav (n=95; 33 variables)
Excel: officer-interviews---form-a.xlsx (n=23; 52 variables)
Excel: officer-interviews---form-b.xlsx (n=27; 52 variables)
Excel: ride-along-observations.xlsx (n=72; 20 variables)
Excel: video-review-data.xlsx (n=53; 21 variables)
Stata: hours-and-compensation-rollup-to-treatment-group.dta (n=4; 42 variables)
Excel: hours-and-compensation-rollup-to-treatment-group.xls (n=4; 42 variables)
CSV: hours-and-compensation-rollup-to-treatment-group.csv (n=4; 42 variables)
Stata: hours-and-compensation-rollup-to-rank-and-treatment-group.dta (n=12; 43 variables)
Excel: hours-and-compensation-rollup-to-rank-and-treatment-group.xls (n=12; 43 variables)
CSV: hours-and-compensation-rollup-to-rank-and-treatment-group.csv (n=12; 43 variables)
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Social Security Administration released the Earnings Public-Use File (EPUF) for 2006. The file contains earnings information for individuals drawn from a systematic random 1-percent sample of all Social Security numbers (SSNs) issued before January 2007. The EPUF consists of two linkable subfiles. One contains selected demographic and aggregate earnings information for all 4,348,254 individuals in the file, and the second contains annual earnings records for the 3,131,424 individuals who had positive earnings in at least 1 year from 1951 through 2006. Please Note: This data set is very large and will not work properly in Microsoft Excel. Data software capable of handling large files should be used.
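One way to work with a file too large for a spreadsheet is to stream it row by row rather than load it whole. The sketch below is a minimal illustration using only the Python standard library; the column names `ID`, `YEAR` and `ANNUAL_EARNINGS` are assumptions made for the example and are not taken from the EPUF documentation.

```python
import csv
import io

def count_positive_earners(lines, id_col="ID", earn_col="ANNUAL_EARNINGS"):
    """Stream rows one at a time and count distinct IDs with any positive earnings.

    Column names are hypothetical placeholders; check the file's own codebook.
    """
    seen = set()
    for row in csv.DictReader(lines):
        if float(row[earn_col]) > 0:
            seen.add(row[id_col])
    return len(seen)

# Tiny in-memory stand-in for the annual earnings subfile.
sample = io.StringIO(
    "ID,YEAR,ANNUAL_EARNINGS\n"
    "1,1990,1200\n"
    "1,1991,0\n"
    "2,1990,0\n"
    "3,1991,50\n"
)
n_positive = count_positive_earners(sample)
print(n_positive)
```

With a real file, the `io.StringIO` object would be replaced by `open(path, newline="")`, so only one row is held in memory at a time.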
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Disability Analysis File (DAF) - Public Use File (PUF), which contains a random 10 percent sample of beneficiaries included in the full DAF, is a set of files containing data related to program participation and benefits for Social Security Disability Insurance (SSDI) and Supplemental Security Income (SSI) beneficiaries between the ages of 18 and Full Retirement Age (FRA) who have received disability benefits in any month since 1996. It also includes data on selected beneficiaries aged 10 to 17 who have received benefits since 2005. The DAF contains data as early as January 1994 and many longitudinal variables are available for the full time period. Some variables are only available for a portion of the time range and information in the DAF documentation provides users with the range available for each variable. The PUF is sectioned into two core components, the Demographic component and the Annuals components, linkable using the unique identifier variable in each file. Please Note: the CSV files will not open completely in Excel due to Excel’s row limit.
The Disability Analysis File (DAF) - Demographic Public Use File (PUF), which contains a random 10 percent sample of beneficiaries included in the full DAF, contains demographic and other one-time information, such as date of birth, date of death, and information collected at the time of disability application. The Demographic file contains one record for each beneficiary who has ever met the DAF selection criteria since 1996 and is not limited to those still receiving benefits. This file contains a snapshot of what each beneficiary's administrative record looks like as of December 2016. Please Note: the CSV files will not open completely in Excel due to Excel’s row limit.
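Linking the Demographic component to an Annual component on the shared identifier could look roughly like the following. This is a sketch under stated assumptions: the key name `BENE_ID` and the other column names are hypothetical, and the real variable names should be taken from the DAF documentation.

```python
import csv
import io

def link_components(demo_lines, annual_lines, key="BENE_ID"):
    """Attach the one-per-beneficiary demographic record to each annual record.

    `key` is a hypothetical name for the unique identifier variable.
    """
    demo_by_id = {row[key]: row for row in csv.DictReader(demo_lines)}
    linked = []
    for row in csv.DictReader(annual_lines):
        match = demo_by_id.get(row[key])
        if match is not None:          # keep only annual rows with a demographic match
            linked.append({**match, **row})
    return linked

demo = io.StringIO("BENE_ID,BIRTH_YEAR\nA1,1970\nB2,1985\n")
annual = io.StringIO("BENE_ID,YEAR,MONTHS_PAID\nA1,2015,12\nA1,2016,6\nC3,2015,3\n")
linked = link_components(demo, annual)
print(linked)
```

Indexing the Demographic file first works because it holds one record per beneficiary, whereas an annual file may be streamed because each row is only looked up once.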
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This data belongs to a manuscript submitted to Data in Brief, in which the content and layout of the data are described in detail. Data overviews (including figures and tables for age and gender groups) can be found at OSF | Normative 3D gait data of healthy subjects walking at three different speeds on an instrumented treadmill in virtual reality.
A normative gait dataset of 246 healthy adults (122 men / 124 women, age range 18-91 years, body weight 46.80-116.10 kg, height 1.53-1.97 m and BMI 18.25-35.63 kg/m2) is presented and publicly shared for three walking speed conditions (comfortable, slow and fast speed).
Three-dimensional gait analysis was performed at the Computer Assisted Rehabilitation Environment (CAREN) at the Maastricht University Medical Centre (MUMC+). Subjects walked on the instrumented treadmill surrounded by twelve 3D cameras, three 2D cameras and a virtual environment projected on a 180° screen, using the Human Body Lower Limb Model with trunk markers (HBM-II) as the biomechanical model.
Subjects walked at comfortable walking speed, 30% slower and 30% faster. These walking speed conditions were applied in a random sequence. Comfortable walking speed was determined using a RAMP protocol: subjects started to walk at 0.5 m/s and every second the speed was increased by 0.01 m/s until comfortable speed was reached. The average of three repetitions was considered the comfortable speed. For each walking speed condition, 250 steps were recorded.
The 3D gait data was collected using the D-flow CAREN software. Raw data were processed in Matlab (Mathworks 2016), including quality check, step determination and export of the data to xls.
Processed data includes spatiotemporal parameters, medio-lateral (ML) and back-forward (BF) margins of stability (MoS), 3D joint angles, anterior-posterior (AP) and vertical GRFs, 3D joint moments and 3D joint power of both legs. The attached files include the processed data for each adult walking at slow (comfortable -30%) speed, containing spatiotemporal parameters, MoS, joint angles, GRF, joint moments and joint power for every valid step of both legs. The title of this file (27_individual excel files) corresponds to the associated manuscript (submitted to Data in Brief).
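The RAMP protocol and the three speed conditions described above reduce to simple arithmetic. The sketch below illustrates that arithmetic only; the function names and the example stop times (70, 75 and 80 seconds) are hypothetical and do not come from the dataset.

```python
def ramp_speed(seconds_elapsed, start=0.5, step=0.01):
    """Treadmill speed (m/s) after a given time: 0.5 m/s plus 0.01 m/s per second."""
    return start + step * seconds_elapsed

def speed_conditions(repetition_speeds):
    """Average the repetitions, then derive the slow (-30%) and fast (+30%) speeds."""
    comfortable = sum(repetition_speeds) / len(repetition_speeds)
    return {
        "slow": 0.7 * comfortable,
        "comfortable": comfortable,
        "fast": 1.3 * comfortable,
    }

# Hypothetical subject who stopped the ramp after 70, 75 and 80 seconds.
repetitions = [ramp_speed(t) for t in (70, 75, 80)]
conditions = speed_conditions(repetitions)
print({name: round(value, 3) for name, value in conditions.items()})
```

For this made-up subject the comfortable speed is about 1.25 m/s, which gives slow and fast conditions of about 0.875 m/s and 1.625 m/s.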
The annual Disability Analysis File (DAF) - Public Use File (PUF) for calendar years 2018-2020, which contains a random 10 percent sample of beneficiaries included in the full DAF, contains SSA administrative monthly and yearly longitudinal data related to program participation and benefits for Social Security Disability Insurance (SSDI) and Supplemental Security Income (SSI) beneficiaries between the ages of 18 and Full Retirement Age (FRA) who received disability benefits during any month of the year. Please Note: the CSV files will not open completely in Excel due to Excel’s row limit.
The annual Disability Analysis File (DAF) - Public Use File (PUF) for calendar years 2009-2011, which contains a random 10 percent sample of beneficiaries included in the full DAF, contains SSA administrative monthly and yearly longitudinal data related to program participation and benefits for Social Security Disability Insurance (SSDI) and Supplemental Security Income (SSI) beneficiaries between the ages of 18 and Full Retirement Age (FRA) who received disability benefits during any month of the year. Please Note: the CSV files will not open completely in Excel due to Excel’s row limit.
The annual Disability Analysis File (DAF) - Public Use File (PUF) for calendar years 2015-2017, which contains a random 10 percent sample of beneficiaries included in the full DAF, contains SSA administrative monthly and yearly longitudinal data related to program participation and benefits for Social Security Disability Insurance (SSDI) and Supplemental Security Income (SSI) beneficiaries between the ages of 18 and Full Retirement Age (FRA) who received disability benefits during any month of the year. Please Note: the CSV files will not open completely in Excel due to Excel’s row limit.
Microsoft 365 is used by over two million companies worldwide, with over one million customers in the United States alone using the office suite software. Office 365 is the brand name previously used by Microsoft for a group of software applications providing productivity related services to its subscribers. Office 365 applications include Outlook, OneDrive, Word, Excel, PowerPoint, OneNote, SharePoint and Microsoft Teams. The consumer and small business plans of Office 365 were renamed as Microsoft 365 on 21 April, 2020.
Global office suite market share
An office suite is a collection of software applications (word processing, spreadsheets, database etc.) designed to be used for tasks within an organization. Worldwide market share of office suite technologies is split between Google’s G Suite and Microsoft’s Office 365, with G Suite controlling around 45 percent of the global market and Office 365 holding around 26 percent. This trend is similar across most worldwide regions.
This information covers fires, false alarms and other incidents attended by fire crews, and the statistics include the numbers of incidents, fires, fatalities and casualties as well as information on response times to fires. The Home Office also collect information on the workforce, fire prevention work, health and safety and firefighter pensions. All data tables on fire statistics are below.
The Home Office has responsibility for fire services in England. The vast majority of data tables produced by the Home Office are for England, but some tables (0101, 0103, 0201, 0501, 1401) are for Great Britain split by nation. In the past the Department for Communities and Local Government (which previously had responsibility for fire services in England) produced data tables for Great Britain and at times the UK. Similar information for the devolved administrations is available at Scotland: Fire and Rescue Statistics (https://www.firescotland.gov.uk/about/statistics/), Wales: Community safety (https://statswales.gov.wales/Catalogue/Community-Safety-and-Social-Inclusion/Community-Safety) and Northern Ireland: Fire and Rescue Statistics (http://www.nifrs.org/).
If you use assistive technology (for example, a screen reader) and need a version of any of these documents in a more accessible format, please email alternativeformats@homeoffice.gov.uk. Please tell us what format you need. It will help us if you say what assistive technology you use.
Fire statistics guidance
Fire statistics incident level datasets
FIRE0101: Incidents attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 94 KB), available at https://assets.publishing.service.gov.uk/media/6787aa6c2cca34bdaf58a257/fire-statistics-data-tables-fire0101-230125.xlsx Previous FIRE0101 tables
FIRE0102: Incidents attended by fire and rescue services in England, by incident type and fire and rescue authority (MS Excel Spreadsheet, 1.51 MB), available at https://assets.publishing.service.gov.uk/media/6787ace93f1182a1e258a25c/fire-statistics-data-tables-fire0102-230125.xlsx Previous FIRE0102 tables
FIRE0103: Fires attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 123 KB), available at https://assets.publishing.service.gov.uk/media/6787b036868b2b1923b64648/fire-statistics-data-tables-fire0103-230125.xlsx Previous FIRE0103 tables
FIRE0104: Fire false alarms by reason for false alarm, England (MS Excel Spreadsheet, 295 KB), available at https://assets.publishing.service.gov.uk/media/6787b3ac868b2b1923b6464d/fire-statistics-data-tables-fire0104-230125.xlsx Previous FIRE0104 tables
FIRE0201: Dwelling fires attended by fire and rescue services by motive, population and nation (MS Excel Spreadsheet, 111 KB), available at https://assets.publishing.service.gov.uk/media/6787b4323f1182a1e258a26a/fire-statistics-data-tables-fire0201-230125.xlsx <a href="https://www.gov.uk/government/statistical-data-sets/fire0201-previous-data-t
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These datasets contain many economic variables related to agriculture like crop output value, profit and several others. These datasets can be used for testing several hypotheses related to agricultural economics, both at plot level and household level.
Users can also reproduce these datasets using the Stata 14 do-file ‘VDSA data management for agricultural performance’. This Stata program file uses the Village Dynamics in South Asia (VDSA) raw data files in Excel format. The resulting output is two data files in Stata format, one at plot level and the other at household level.
These plot-level and household-level datasets are also included in this repository. The Word file ‘guidelines’ contains instructions for extracting VDSA raw data from the VDSA knowledge bank and using them as inputs to run the Stata do-file ‘VDSA data management for agricultural performance’.
The VDSA raw data files in Excel format needed to run the Stata do-file are also available in this repository for users' convenience.
The raw VDSA data were generated by the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) in partnership with Indian Council of Agricultural Research (ICAR) Institutes and the International Rice Research Institute (IRRI) and funded by the Bill & Melinda Gates Foundation (BMGF) (Grant ID: 51937). The data were acquired in surveys by resident field investigators. Data collection was mostly through paper-based questionnaires; Samsung tablets have also been used since 2012. The survey instruments used for different modules are available at http://vdsa.icrisat.ac.in/vdsa-questionaires.aspx
Study sites were selected using a stepwise purposive sampling covering agro-ecological diversity of the region. Three districts within each zone were selected based on soil, climate parameters as well as the share of agricultural land under ICRISAT mandate crops. On similar lines, one typical sub-district within each district and two villages within each sub-district were selected. Within each village, ten random households from four landholding groups were selected.
Selected farmers were visited once every three weeks by well-trained resident field investigators, who were agriculture graduates, to collect information on various socioeconomic indicators. Some data modules, such as details on crop cultivation activities including plot-wise inputs and outputs, were collected every three weeks, while others, such as general endowments, were collected once at the beginning of every agricultural year.
The compiled data, source data, data descriptions and data management code are all published in a public repository at http://dataverse.icrisat.org/dataverse/socialscience (https://doi.org/10.21421/D2/HDEUKU).
Some of the benefits of these data are:
Scientists, students and development practitioners can use these data to track changes in the livelihood options of the rural poor, as the data provide a long-term, multi-generational perspective on agricultural, social and economic change in rural livelihoods.
The survey sites provide a socio-economic field laboratory for teaching and training students and researchers.
This dataset can be used for diverse agricultural, development and socio-economic analysis and to better understand the dynamics of Indian agriculture.
The data help to provide feedback for designing policy interventions, setting research priorities and refining technologies.
The data shed light on the pathways through which new technologies, policies, and programs impact poverty, village economies, and societies.
The documentation covers Enterprise Survey panel datasets that were collected in Slovenia in 2009, 2013 and 2019.
The Slovenia ES 2009 was conducted between 2008 and 2009. The Slovenia ES 2013 was conducted between March 2013 and September 2013. Finally, the Slovenia ES 2019 was conducted between December 2018 and November 2019. The objective of the Enterprise Survey is to gain an understanding of what firms experience in the private sector.
As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.
National
The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must take its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
As it is standard for the ES, the Slovenia ES was based on the following size stratification: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).
Sample survey data [ssd]
The samples for Slovenia ES 2009, 2013 and 2019 were selected using stratified random sampling, following the methodology explained in the Sampling Manual for Slovenia 2009 ES and for Slovenia 2013 ES, and in the Sampling Note for 2019 Slovenia ES.
Three levels of stratification were used in this country: industry, establishment size, and oblast (region). The original sample designs with specific information of the industries and regions chosen are included in the attached Excel file (Sampling Report.xls.) for Slovenia 2009 ES. For Slovenia 2013 and 2019 ES, specific information of the industries and regions chosen is described in the "The Slovenia 2013 Enterprise Surveys Data Set" and "The Slovenia 2019 Enterprise Surveys Data Set" reports respectively, Appendix E.
For the Slovenia 2009 ES, industry stratification was designed as follows: the universe was stratified into manufacturing industries, services industries, and one residual (core) sector as defined in the sampling manual. Each industry had a target of 90 interviews. For the manufacturing industries, sample sizes were inflated by about 17% to account for potential non-response when requesting sensitive financial data and for likely attrition in future surveys that would affect the construction of a panel. For the other industries (residuals), sample sizes were inflated by about 12% to account for undersampling of firms in service industries.
For Slovenia 2013 ES, industry stratification was designed as follows: the universe was stratified into one manufacturing industry and two service industries (retail and other services).
Finally, for Slovenia 2019 ES, three levels of stratification were used in this country: industry, establishment size, and region. The original sample design with specific information of the industries and regions chosen is described in "The Slovenia 2019 Enterprise Surveys Data Set" report, Appendix C. Industry stratification was done as follows: Manufacturing – combining all the relevant activities (ISIC Rev. 4.0 codes 10-33), Retail (ISIC 47), and Other Services (ISIC 41-43, 45, 46, 49-53, 55, 56, 58, 61, 62, 79, 95).
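The 2019 industry strata listed above can be expressed as a lookup on the two-digit ISIC Rev. 4 division. This is a minimal sketch: the function name is hypothetical, and returning `None` for divisions outside the listed codes is an assumption made for the example.

```python
# Two-digit ISIC Rev. 4 divisions assigned to "Other Services" in the 2019 design.
OTHER_SERVICES = {41, 42, 43, 45, 46, 49, 50, 51, 52, 53, 55, 56, 58, 61, 62, 79, 95}

def industry_stratum_2019(isic_division):
    """Map a two-digit ISIC Rev. 4 division to its 2019 industry stratum."""
    if 10 <= isic_division <= 33:
        return "Manufacturing"
    if isic_division == 47:
        return "Retail"
    if isic_division in OTHER_SERVICES:
        return "Other Services"
    return None  # assumption: divisions not listed are treated as out of scope

for division in (10, 33, 47, 62, 64):
    print(division, industry_stratum_2019(division))
```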
For Slovenia 2009 and 2013 ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. This seems to be an appropriate definition of the labor force since seasonal/casual/part-time employment is not a common practice, except in the sectors of construction and agriculture.
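The size strata can be written as a small classification rule on the number of permanent full-time workers. The function name is hypothetical, and returning `None` for establishments with fewer than 5 employees is an assumption, since the text defines no stratum for them.

```python
def size_stratum(permanent_full_time_workers):
    """Classify an establishment as small (5-19), medium (20-99) or large (100+)."""
    n = permanent_full_time_workers
    if n < 5:
        return None  # assumption: below the smallest stratum defined in the text
    if n <= 19:
        return "small"
    if n <= 99:
        return "medium"
    return "large"

for workers in (4, 5, 19, 20, 99, 100):
    print(workers, size_stratum(workers))
```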
For Slovenia 2009 ES, regional stratification was defined in 2 regions. These regions are Vzhodna Slovenija and Zahodna Slovenija. The Slovenia sample contains panel data. The wave 1 panel “Investment Climate Private Enterprise Survey implemented in Slovenia” consisted of 223 establishments interviewed in 2005. A total of 57 establishments have been re-interviewed in the 2008 Business Environment and Enterprise Performance Survey.
For Slovenia 2013 ES, regional stratification was defined in 2 regions (city and the surrounding business area) throughout Slovenia.
Finally, for Slovenia 2019 ES, regional stratification was done across two regions: Eastern Slovenia (NUTS code SI03) and Western Slovenia (SI04).
Computer Assisted Personal Interview [capi]
Questionnaires have common questions (core module) and, respectively, additional manufacturing- and services-specific questions. The eligible manufacturing industries have been surveyed using the Manufacturing questionnaire (includes the core module, plus manufacturing specific questions). Retail firms have been interviewed using the Services questionnaire (includes the core module plus retail specific questions) and the residual eligible services have been covered using the Services questionnaire (includes the core module). Each variation of the questionnaire is identified by the index variable, a0.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond as (-8). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary. However, there were clear cases of low response.
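When analysing a sensitive item, the refusal code (-8) has to be separated from substantive answers before any statistic is computed. The sketch below shows one way to do that; the function name and the example values are hypothetical.

```python
REFUSAL_CODE = -8  # code recorded by enumerators when a respondent refuses to answer

def summarize_item(responses):
    """Return the mean of substantive answers and the share of refusals."""
    answered = [r for r in responses if r != REFUSAL_CODE]
    refusal_share = (len(responses) - len(answered)) / len(responses)
    mean_answer = sum(answered) / len(answered) if answered else None
    return mean_answer, refusal_share

# Hypothetical answers to one sensitive question from five establishments.
mean_answer, refusal_share = summarize_item([10, -8, 30, -8, 20])
print(mean_answer, refusal_share)
```

Leaving the code in place would pull the mean toward a meaningless negative value, which is why it is filtered out first.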
For 2009 and 2013 Slovenia ES, the survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Up to 4 attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.
For 2009, the number of contacted establishments per realized interview was 6.18. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The relatively low ratio of contacted establishments per realized interview (6.18) suggests that the main source of error in estimates for Slovenia may be selection bias and not frame inaccuracy.
For 2013, the share of realized interviews per contacted establishment was 25%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 44%.
Finally, for 2019, the share of interviews per contacted establishment was 9.7%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 75.2%.
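The 2009 figure is reported as contacts per interview, while the 2013 and 2019 figures are reported as interviews per contact, so comparing the waves requires taking a reciprocal. The sketch below puts the three reported numbers on a common scale; the function names are hypothetical and the conversion assumes the two measures are exact inverses of each other.

```python
def interviews_per_contact(contacts_per_interview):
    """Convert contacts per realized interview into the share of contacts interviewed."""
    return 1 / contacts_per_interview

def contacts_per_interview(interviews_per_contact_share):
    """Convert the share of contacts interviewed into contacts per realized interview."""
    return 1 / interviews_per_contact_share

# Reported figures: 6.18 contacts per interview (2009); 25% and 9.7% yields (2013, 2019).
yield_by_wave = {
    2009: interviews_per_contact(6.18),
    2013: 0.25,
    2019: 0.097,
}
for year, share in yield_by_wave.items():
    print(f"{year}: {share:.1%} of contacts interviewed, "
          f"{contacts_per_interview(share):.2f} contacts per interview")
```

On this common scale the 2009 yield is roughly 16%, which lies between the 2013 and 2019 values.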