https://creativecommons.org/publicdomain/zero/1.0/
INTRODUCTION. The General Social Survey (GSS) has run since 1972 and surveys a representative sample of the adult population of the United States. It gathers data in order to monitor and explain trends and constants in attitudes, behaviors, and attributes, and it is widely used by politicians, policy makers, and researchers. The GSS asks a standard core of demographic, behavioral, and attitudinal questions, along with questions about beliefs on social and political issues and topics of special interest. Among the topics covered are civil liberties, crime and violence, intergroup tolerance, morality, national spending priorities, psychological well-being, social mobility, and stress and traumatic events. Altogether the GSS is the single best source for sociological and attitudinal trend data covering the United States. It allows researchers to examine the structure and functioning of society in general as well as the role played by relevant subgroups and to compare the United States to other nations. Source: http://gss.norc.org/About-The-GSS
About the Data. The survey is conducted as a face-to-face, in-person interview by the National Opinion Research Center (NORC) at the University of Chicago. Participation in the study is strictly voluntary. Therefore, a study based on the GSS sample data is:
• generalizable to the target population only if we ignore non-response bias;
• not causal, because the study is observational and does not use random assignment.
Research Question. I am interested in exploring the relationship between people's job preferences and their education level, using the latest data. More specifically, are people's preferences in a job (such as job security, high income, or short working hours) associated with the highest degree they have received? Motivation: Aside from sleeping, working is the activity that takes up the most hours of our lives, and it has a huge impact on people's well-being and happiness. I am interested in the factors that determine people's attitudes toward their jobs.
Reading the data. The data we'll use is from the General Social Survey (GSS). Using the GSS Data Explorer, I selected a subset of the variables in the GSS and made it available along with this notebook. The survey contains more than 5,000 variables covering a wide range of subjects; I have selected just a few.
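One common way to examine an association between two categorical variables, such as job preference and highest degree, is a Pearson chi-square test of independence. The sketch below is only an illustration of that idea: the counts are made up for the example and are not GSS results, the two-by-two layout is an assumption, and the notebook's actual analysis method is not stated here.

```python
# Hypothetical counts for illustration only; these are NOT GSS results.
# Rows: two made-up education groups; columns: two made-up job preferences.
observed = [
    [10, 20],  # group A: prefers job security, prefers high income
    [20, 10],  # group B: prefers job security, prefers high income
]

def chi_square_statistic(table):
    """Return the Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

stat, dof = chi_square_statistic(observed)
print(f"chi-square = {stat:.3f}, degrees of freedom = {dof}")
```

A large statistic relative to the chi-square distribution with the given degrees of freedom would suggest an association, though, as noted above, not a causal one.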
This file contains all of the cases and variables that are in the original 2016 General Social Survey, but is prepared for easier use in the classroom. Changes have been made in two areas. First, to avoid confusion when constructing tables or interpreting basic analysis, all missing data codes have been set to system missing. Second, many of the continuous variables have been categorized into fewer categories, and added as additional variables to the file.
The General Social Surveys (GSS) have been conducted by the National Opinion Research Center (NORC) annually since 1972, except for the years 1979, 1981, and 1992 (a supplement was added in 1992), and biennially beginning in 1994. The GSS are designed to be part of a program of social indicator research, replicating questionnaire items and wording in order to facilitate time-trend studies.
The 2016 General Social Survey Instructional Dataset has been updated as of October 2024. This release includes additional interview-specific variables and respondent demographic information. Please check the NORC website for any future updates on this file.
To download syntax files for the GSS that reproduce well-known religious group recodes, including RELTRAD, please visit ARDA's Syntax Repository (/research/syntax-repository-list).
This file contains all of the cases and variables that are in the original 2012 General Social Survey, but is prepared for easier use in the classroom. Changes have been made in two areas. First, to avoid confusion when constructing tables or interpreting basic analysis, all missing data codes have been set to system missing. Second, many of the continuous variables have been categorized into fewer categories, and added as additional variables to the file.
The General Social Surveys (GSS) have been conducted by the National Opinion Research Center (NORC) annually since 1972, except for the years 1979, 1981, and 1992 (a supplement was added in 1992), and biennially beginning in 1994. The GSS are designed to be part of a program of social indicator research, replicating questionnaire items and wording in order to facilitate time-trend studies. This data file has all cases and variables asked on the 2012 GSS. There are a total of 4,820 cases in the data set but their initial sampling years vary because the GSS now contains panel cases. Sampling years can be identified with the variable SAMPTYPE.
The 2012 GSS featured special modules on religious scriptures, the environment, dance and theater performances, health care system, government involvement, health concerns, emotional health, financial independence and income inequality.
The GSS has switched from a repeating, cross-section design to a combined repeating cross-section and panel-component design. This file has a rolling panel design, with the 2008 GSS as the base year for the first panel. A sub-sample of 2,000 GSS cases from 2008 was selected for reinterview in 2010 and again in 2012 as part of the GSSs in those years. The 2010 GSS consisted of a new cross-section plus the reinterviews from 2008. The 2012 GSS consists of a new cross-section of 1,974, the first reinterview wave of the 2010 panel cases with 1,551 completed cases, and the second and final reinterview of the 2008 panel with 1,295 completed cases. Altogether, the 2012 GSS had 4,820 cases (1,974 in the new 2012 panel, 1,551 in the 2010 panel, and 1,295 in the 2008 panel).
To download syntax files for the GSS that reproduce well-known religious group recodes, including RELTRAD, please visit ARDA's Syntax Repository (/research/syntax-repository-list).
https://creativecommons.org/publicdomain/zero/1.0/
The GSS gathers data on contemporary American society in order to monitor and explain trends and constants in attitudes, behaviors, and attributes. Hundreds of trends have been tracked since 1972. In addition, since the GSS adopted questions from earlier surveys, trends can be followed for up to 70 years.
The GSS contains a standard core of demographic, behavioral, and attitudinal questions, plus topics of special interest. Among the topics covered are civil liberties, crime and violence, intergroup tolerance, morality, national spending priorities, psychological well-being, social mobility, and stress and traumatic events.
Altogether the GSS is the single best source for sociological and attitudinal trend data covering the United States. It allows researchers to examine the structure and functioning of society in general as well as the role played by relevant subgroups and to compare the United States to other nations. (Source)
This dataset is a csv version of the Cumulative Data File, a cross-sectional sample of the GSS from 1972-current.
The General Social Surveys (GSS) have been conducted by the National Opinion Research Center (NORC, https://www.norc.org/Pages/default.aspx) annually since 1972, except for the years 1979, 1981, and 1992 (a supplement was added in 1992), and biennially beginning in 1994. The GSS are designed to be part of a program of social indicator research, replicating questionnaire items and wording in order to facilitate time-trend studies. This GSS panel dataset has three waves of interviews: respondents were originally sampled and interviewed in 2006, interviewed for the second time in 2008, and interviewed for the third wave in 2010. This file contains the 2,000 respondents who were pre-selected from the 2006 sample and the variables that were asked at least twice across the three waves. Survey items on religion include the following: religious preference, religion raised in, spouse's religious preference, frequency of religious service attendance, religious experiences, and religious salience.
The General Social Surveys (GSS) have been conducted by the National Opinion Research Center (NORC) annually since 1972, except for the years 1979, 1981, and 1992 (a supplement was added in 1992), and biennially beginning in 1994. The GSS are designed to be part of a program of social indicator research, replicating questionnaire items and wording in order to facilitate time-trend studies. This data file has all cases and variables asked on the 2018 GSS.
The 2018 cross-sectional General Social Survey has been updated as of June 2024. This release includes additional interview-specific variables and survey weights. Please check the NORC website (https://gss.norc.org/) for any future updates on this file.
To download syntax files for the GSS that reproduce well-known religious group recodes, including RELTRAD, please visit ARDA's Syntax Repository (/research/syntax-repository-list).
https://borealisdata.ca/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.5683/SP3/1LFX0F
The 2020 GSS on Social Identity interviewed individuals 15 years and over in Canada's ten provinces and was conducted from August 2020 to February 2021. The interviews were conducted via self-assisted electronic questionnaire (respondent EQ, or rEQ) and by telephone via interviewer-assisted electronic questionnaire (interviewer EQ, or iEQ, formerly known as Computer Assisted Telephone Interviewing (CATI)). Data are subject to both sampling and non-sampling errors. These topics are discussed in detail in this guide. The 2020 SI survey is the fourth cycle of the GSS to collect data on social identity, social engagement, and social networks. The previous iteration of the survey (Cycle 27 - Social Identity) was collected in 2013, the second was Cycle 22 - Social Networks in 2008, and the first was Cycle 17 - Social Engagement in 2003.
Survey based Harmonized Indicators (SHIP) files are harmonized data files from household surveys that are conducted by countries in Africa. To ensure the quality and transparency of the data, it is critical to document the procedures of compiling consumption aggregation and other indicators so that the results can be duplicated with ease. This process enables consistency and continuity that make temporal and cross-country comparisons consistent and more reliable.
Four harmonized data files are prepared for each survey to generate a set of harmonized variables that have the same variable names. Invariably, in each survey, questions are asked in a slightly different way, which poses challenges for the consistent definition of harmonized variables. The harmonized household survey data present the best available variables with harmonized definitions, but not identical variables. The four harmonized data files are:
a) Individual level file (labor force indicators are in a separate file): This file has information on basic characteristics of individuals such as age and sex, literacy, education, health, anthropometry and child survival.
b) Labor force file: This file has information on the labor force, including employment/unemployment, earnings, sectors of employment, etc.
c) Household level file: This file has information on household expenditure, household head characteristics (age and sex, level of education, employment), housing amenities, assets, and access to infrastructure and services.
d) Household Expenditure file: This file has consumption/expenditure aggregates by consumption groups according to the UN Classification of Individual Consumption According to Purpose (COICOP).
National
The survey covered all de jure household members (usual residents).
Sample survey data [ssd]
Sampling Frame and Units. As in all probability sample surveys, it is important that each sampling unit in the surveyed population has a known, non-zero probability of selection. To achieve this, there has to be an appropriate list, or sampling frame, of the primary sampling units (PSUs). The universe defined for the GLSS 5 is the population living within private households in Ghana. The institutional population (such as schools, hospitals, etc.), which represents a very small percentage in the 2000 Population and Housing Census (PHC), is excluded from the frame for the GLSS 5.
The Ghana Statistical Service (GSS) maintains a complete list of census enumeration areas (EAs), together with their respective population and number of households, as well as maps of the EAs with well-defined boundaries. This information was used as the sampling frame for the GLSS 5. Specifically, the EAs were defined as the primary sampling units (PSUs), while the households within each EA constituted the secondary sampling units (SSUs).
Stratification In order to take advantage of possible gains in precision and reliability of the survey estimates from stratification, the EAs were first stratified into the ten administrative regions. Within each region, the EAs were further sub-divided according to their rural and urban areas of location. The EAs were also classified according to ecological zones and inclusion of Accra (GAMA) so that the survey results could be presented according to the three ecological zones, namely 1) Coastal, 2) Forest, and 3) Northern Savannah, and for Accra.
Sample size and allocation The number and allocation of sample EAs for the GLSS 5 depend on the type of estimates to be obtained from the survey and the corresponding precision required. It was decided to select a total sample of around 8000 households nationwide.
To ensure adequate numbers of complete interviews that will allow for reliable estimates at the various domains of interest, the GLSS 5 sample was designed to ensure that at least 400 households were selected from each region.
A two-stage stratified random sampling design was adopted. Initially, a total sample of 550 EAs was considered at the first stage of sampling, followed by a fixed take of 15 households per EA. The distribution of the selected EAs into the ten regions or strata was based on proportionate allocation using the population.
For example, the number of selected EAs allocated to the Western Region was obtained as: 1,924,577 / 18,912,079 × 550 ≈ 56
Under this sampling scheme, it was observed that the 400 households minimum requirement per region could be achieved in all the regions but not the Upper West Region. The proportionate allocation formula assigned only 17 EAs out of the 550 EAs nationwide and selecting 15 households per EA would have yielded only 255 households for the region. In order to surmount this problem, two options were considered: retaining the 17 EAs in the Upper West Region and increasing the number of selected households per EA from 15 to about 25, or increasing the number of selected EAs in the region from 17 to 27 and retaining the second stage sample of 15 households per EA.
The second option was adopted in view of the fact that it was more likely to provide smaller sampling errors for the separate domains of analysis. Based on this, the number of EAs in Upper East and the Upper West were adjusted from 27 and 17 to 40 and 34 respectively, bringing the total number of EAs to 580 and the number of households to 8,700.
A complete household listing exercise was carried out between May and June 2005 in all the selected EAs to provide the sampling frame for the second stage selection of households. At the second stage of sampling, a fixed number of 15 households per EA was selected in all the regions. In addition, five households per EA were selected as replacement samples. The overall sample size therefore came to 8,700 households nationwide.
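The allocation arithmetic described above can be checked with a few lines of Python. This is a minimal sketch using only the figures quoted in the text; the function name `proportional_eas` is a hypothetical helper, and rounding to the nearest whole EA is an assumption about how the fractional result was handled.

```python
TOTAL_POPULATION = 18_912_079    # national population used in the text's example
WESTERN_POPULATION = 1_924_577   # Western Region population from the same example
PLANNED_EAS = 550                # first-stage sample initially considered
HOUSEHOLDS_PER_EA = 15           # fixed take at the second stage

def proportional_eas(region_population, total_population, total_eas):
    """Allocate EAs to a region in proportion to its population (rounded; an assumption)."""
    return round(region_population / total_population * total_eas)

western_eas = proportional_eas(WESTERN_POPULATION, TOTAL_POPULATION, PLANNED_EAS)

# Upper West under proportionate allocation: 17 EAs at 15 households each.
upper_west_households = 17 * HOUSEHOLDS_PER_EA

# After adjusting Upper East and Upper West from 27 and 17 EAs to 40 and 34.
adjusted_eas = PLANNED_EAS - (27 + 17) + (40 + 34)
adjusted_households = adjusted_eas * HOUSEHOLDS_PER_EA

print(western_eas, upper_west_households, adjusted_eas, adjusted_households)
```

The Upper West figure falls short of the 400-household minimum, which is why its allocation was raised, and the adjusted totals match the 580 EAs and 8,700 households reported.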
Face-to-face [f2f]
The efficient development, maintenance and administration of transport infrastructure and services are critical to the socio-economic development of any country. Scarce government resources and support from donor funds are required to provide these essential services to all sectors for the economic development of the country and for attaining equity and the participation of the populace in the creation of wealth and reduction of poverty.
To ascertain the effectiveness of implementation of policies and development programs for transport-related infrastructure and services, key performance indicators are required. The data for developing these performance indicators must be collected on a sustainable basis by the various sectors for collation and analysis. Although most of the relevant basic data exist in many establishments, these are often scattered and are neither collated nor disseminated in any structured manner. The transportation sector is no exception. A recent study of the Ghana Road Sub-sector Programme finds that there is an urgent need to reinforce the monitoring system of MRT, as performance indicators have only partially been collected and used; the road condition mix is monitored on an annual basis while other basic performance indicators are lacking. A good monitoring system will help improve policy formulation within the sub-sector, while its absence may result in a major funding reduction because the contribution to national development objectives, such as poverty alleviation, cannot be substantiated and demonstrated.
Objectives of survey. The development objective of the TSPS-II, as defined in the Ghana Poverty Reduction Strategy (GPRS), is to sustain economic growth through the provision of safe, reliable, efficient and affordable services for all transport users. The focus of the transport sector under the GPRS is to provide access through better distribution of the transport network, with special emphasis on high poverty areas, in order to reduce transport disparities between the urban and rural communities. The household survey is a component of a bigger programme which will serve as a reliable and sustainable one-stop shop for all the data and performance indicators for the transport sector. The immediate objective of the sub-component is to improve the effectiveness of implementation of policies and development programmes for the transport sector, including related infrastructure and services. The direct aim of the sub-component will be the collection, processing, analysis, documentation and dissemination of transport related data, which will be useful for:
National level Region Level
Household and Individual
The survey covered all household members (Usual residents)
Sample survey data [ssd]
The sample was representative of all households in Ghana. To achieve the study objectives, the sample size chosen was based on the type of variables under consideration, the required precision of the survey estimates and available resources. Taking all of these into consideration, a sample size of 6,000 households was deemed sufficient to achieve the survey objectives. This was enough to yield reliable estimates of all the important survey variables as well as being manageable to control and minimize non-sampling errors.
Stratification and Sample Selection Procedures The total list of the Enumeration Areas (EAs) from the demarcation for the 2010 Population and Housing Census formed the sampling frame for the Phase II of the Transport Indicators Survey. The sampling frame was stratified into urban/rural residence and the 10 administrative regions of the country for the selection of the sample. The sample was selected in two stages.
The first stage selection involved the systematic selection of 400 EAs with probability proportional to size, the measure of size being the number of households in each EA. The second stage selection involved the systematic selection of 15 households from each EA. See Appendix A for more details on the sample design.
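The two-stage selection described above can be sketched as follows. This is an illustration under stated assumptions, not the survey's actual procedure: the frame of 2,000 EAs and their household counts are randomly generated placeholders, stratification is omitted, and the helper names `pps_systematic` and `systematic_sample` are hypothetical.

```python
import random

def pps_systematic(sizes, n_select, rng):
    """Systematic selection with probability proportional to size.

    Returns the indices of selected units. A unit larger than the
    sampling interval could be selected more than once.
    """
    interval = sum(sizes) / n_select
    start = rng.random() * interval          # random start in [0, interval)
    selected = []
    cumulative = 0.0                         # total size of units before idx
    idx = 0
    for k in range(n_select):
        target = start + k * interval
        # Advance until the target falls inside unit idx's size range.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected

def systematic_sample(n_units, n_select, rng):
    """Equal-probability systematic selection of n_select positions from a listing."""
    step = n_units / n_select
    start = rng.random() * step
    return [int(start + k * step) for k in range(n_select)]

rng = random.Random(2010)
# Placeholder frame: number of households in each of 2,000 hypothetical EAs.
ea_sizes = [rng.randint(50, 300) for _ in range(2000)]

selected_eas = pps_systematic(ea_sizes, 400, rng)           # first stage
household_samples = {
    ea: systematic_sample(ea_sizes[ea], 15, rng)            # second stage
    for ea in selected_eas
}
print(len(selected_eas), sum(len(h) for h in household_samples.values()))
```

With 400 EAs and 15 households per EA, the sketch yields the 6,000-household sample size stated in the text.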
No deviations
Face-to-face [f2f]
The questionnaire had the following sections:
Section A: a household roster which collected basic information on all household members and household characteristics to determine eligible household members
Section B: an education section which was administered to household members aged 3 years and older on the use of transport services to school
Section C: a health section that was used to collect information on all household members on access and the use of transport services to health facilities
Section D: an economic activity section administered to household members 7 years and older to collect information on their economic activities and the use of transport services, together with a market access section administered to household members engaged in agricultural activities to collect information on access to transport services for sale of farm produce
Section E: a general transport services section administered to all household members on the access and use of various modes of transport.
Section F: a general transport services section administered to all households on the access and use of various modes of transport.
Control mechanisms were built into the data capturing application, including range checks and skip patterns. Partial double entry was done in order to compare and correct errors. After data capture, secondary editing was done in the form of consistency checks. CSPro 4.1 was used to capture the data.
National: (5996/6000)*100=99.93%
By region: Western = 99.8%, Central = 100.0%, Greater Accra = 100.0%, Volta = 99.5%, Eastern = 100.0%, Ashanti = 100.0%, Brong Ahafo = 100.0%, Northern = 100.0%, Upper East = 100.0%, Upper West = 100.0%

Region           Households completed   Households expected   Response rate (%)
Western                   569                    570                 99.8
Central                   510                    510                100.0
Greater Accra             855                    855                100.0
Volta                     567                    570                 99.5
Eastern                   705                    705                100.0
Ashanti                 1,125                  1,125                100.0
Brong Ahafo               585                    585                100.0
Northern                  615                    615                100.0
Upper East                285                    285                100.0
Upper West                180                    180                100.0
Total                   5,996                  6,000                 99.9
Causes of non-response (by region)

Result of interview             Western   Volta   Total
Refused                            1        0       1
No household member at home        0        2       2
Other                              0        1       1
Total                              1        3       4
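The response rates reported above can be reproduced from the completed and expected household counts. The sketch below uses only the figures in the tables; the variable names are illustrative.

```python
# (households completed, households expected) per region, from the table above.
counts = {
    "Western": (569, 570),
    "Central": (510, 510),
    "Greater Accra": (855, 855),
    "Volta": (567, 570),
    "Eastern": (705, 705),
    "Ashanti": (1125, 1125),
    "Brong Ahafo": (585, 585),
    "Northern": (615, 615),
    "Upper East": (285, 285),
    "Upper West": (180, 180),
}

# Response rate per region, in percent, rounded to one decimal place.
rates = {
    region: round(100 * done / expected, 1)
    for region, (done, expected) in counts.items()
}

total_completed = sum(done for done, _ in counts.values())
total_expected = sum(expected for _, expected in counts.values())
national_rate = 100 * total_completed / total_expected

# Non-response by cause (refused, nobody at home, other), from the second table.
non_response = 1 + 2 + 1

for region, rate in rates.items():
    print(f"{region}: {rate}%")
print(f"National: {national_rate:.2f}%")
```

The four non-responding households in the second table account for the gap between the 6,000 expected and 5,996 completed interviews.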
Sampling errors were calculated but are not included in the report.
No other forms of data appraisal
The Integrated Business Establishment Survey (IBES) was an establishment census conducted by the Ghana Statistical Service (GSS) in 2014. IBES 2014 phase 1 collected data on 638 000 establishments in Ghana across all sectors. Phase 2 was a roughly 5% stratified sample of the phase 1 firms, undertaken in 2015 (GSS, 2017). Phase 2 asked the sampled firms many more questions, for example about their costs, revenues and assets. GSS allowed DataFirst to release a 40% sample of the phase 2 data, through a project funded by the Project for Enterprise Development in Low Income Countries (PEDL). Thus the observations we have represent around 2% of the total census.
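The "around 2%" figure follows from multiplying the two sampling fractions. The short sketch below works through that arithmetic; because the text describes phase 2 as "roughly 5%", the resulting counts are approximations and not the actual number of records in the released file.

```python
from fractions import Fraction

phase1_establishments = 638_000          # establishments covered in phase 1
phase2_fraction = Fraction(5, 100)       # "roughly 5%" sample, so approximate
released_fraction = Fraction(40, 100)    # share of phase 2 data released

# Share of the phase 1 census represented in the released data.
overall_fraction = phase2_fraction * released_fraction

# Approximate counts implied by the stated fractions (not actual record counts).
approx_phase2 = phase1_establishments * phase2_fraction
approx_released = phase1_establishments * overall_fraction

print(f"overall share: {float(overall_fraction):.0%}")
print(f"approx. phase 2 firms: {int(approx_phase2)}, approx. released: {int(approx_released)}")
```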
The survey was designed to be representative at the regional level.
Establishments
All non-household establishments with a fixed site and any household-based business with a sign indicating its presence within a household.
Census/enumeration data
IBES 2014 consisted of two phases. Phase 1 was a census of non-household establishments with a fixed site and any household-based business with a sign indicating its presence within a household. Phase 2 was a stratified roughly 5% sample of the phase I firms.
Face-to-face [f2f]
For Phase II there were nine different questionnaires (or "forms") used, depending on the sector of the firm's main business activity. Manufacturing was split further, with small firms (<30 persons engaged) getting form 3A and larger firms getting form 3B.
The data are released with a detailed description of how the final dataset was prepared; please see the documentation.
78% of the firms selected for re-interview in phase II were successfully interviewed.
The data are released with a detailed description of how the final dataset was prepared and of the data quality issues; please see the documentation.
The 2022 Ghana Demographic and Health Survey (2022 GDHS) is the seventh in the series of DHS surveys conducted by the Ghana Statistical Service (GSS) in collaboration with the Ministry of Health/Ghana Health Service (MoH/GHS) and other stakeholders, with funding from the United States Agency for International Development (USAID) and other partners.
The primary objective of the 2022 GDHS is to provide up-to-date estimates of basic demographic and health indicators. Specifically, the GDHS collected information on:
- Fertility levels and preferences, contraceptive use, antenatal and delivery care, maternal and child health, childhood mortality, childhood immunisation, breastfeeding and young child feeding practices, women’s dietary diversity, violence against women, gender, nutritional status of adults and children, awareness regarding HIV/AIDS and other sexually transmitted infections, tobacco use, and other indicators relevant for the Sustainable Development Goals
- Haemoglobin levels of women and children
- Prevalence of malaria parasitaemia (rapid diagnostic testing and thick slides for malaria parasitaemia in the field and microscopy in the lab) among children age 6–59 months
- Use of treated mosquito nets
- Use of antimalarial drugs for treatment of fever among children under age 5
The information collected through the 2022 GDHS is intended to assist policymakers and programme managers in designing and evaluating programmes and strategies for improving the health of the country’s population.
National coverage
The survey covered all de jure household members (usual residents), all women aged 15-49, men aged 15-59, and all children aged 0-4 resident in the household.
Sample survey data [ssd]
To achieve the objectives of the 2022 GDHS, a stratified representative sample of 18,540 households (30 households in each of 618 clusters) was selected, which resulted in 15,014 interviewed women age 15–49 and 7,044 interviewed men age 15–59 (men were interviewed in one of every two households selected).
The sampling frame used for the 2022 GDHS is the updated frame prepared by the GSS based on the 2021 Population and Housing Census. The sampling procedure used in the 2022 GDHS was stratified two-stage cluster sampling, designed to yield representative results at the national level, for urban and rural areas, and for each of the country’s 16 regions for most DHS indicators. In the first stage, 618 target clusters were selected from the sampling frame using a probability proportional to size strategy for urban and rural areas in each region. Then the number of targeted clusters were selected with equal probability systematic random sampling of the clusters selected in the first phase for urban and rural areas.
In the second stage, after selection of the clusters, a household listing and map updating operation was carried out in all of the selected clusters to develop a list of households for each cluster. This list served as a sampling frame for selection of the household sample. The GSS organized a 5-day training course on listing procedures for listers and mappers with support from ICF. The listers and mappers were organized into 25 teams consisting of one lister and one mapper per team. The teams spent 2 months completing the listing operation. In addition to listing the households, the listers collected the geographical coordinates of each household using GPS dongles provided by ICF and in accordance with the instructions in the DHS listing manual. The household listing was carried out using tablet computers, with software provided by The DHS Program. A fixed number of 30 households in each cluster were randomly selected from the list for interviews.
For further details on sample design, see APPENDIX A of the final report.
Face-to-face computer-assisted interviews [capi]
Four questionnaires were used in the 2022 GDHS: the Household Questionnaire, the Woman’s Questionnaire, the Man’s Questionnaire, and the Biomarker Questionnaire. The questionnaires, based on The DHS Program’s model questionnaires, were adapted to reflect the population and health issues relevant to Ghana. In addition, a self-administered Fieldworker Questionnaire collected information about the survey’s fieldworkers.
The GSS organized a questionnaire design workshop with support from ICF and obtained input from government and development partners expected to use the resulting data. The DHS Program optional modules on domestic violence, malaria, and social and behavior change communication were incorporated into the Woman’s Questionnaire. ICF provided technical assistance in adapting the modules to the questionnaires.
DHS staff installed all central office programmes, data structure checks, secondary editing, and field check tables from 17–20 October 2022. Central office training was implemented using the practice data to test the central office system and field check tables. Seven GSS staff members (four male and three female) were trained on the functionality of the central office menu, including accepting clusters from the field, data editing procedures, and producing reports to monitor fieldwork.
From 27 February to 17 March, DHS staff visited the Ghana Statistical Service office in Accra to work with the GSS central office staff on finishing the secondary editing and to clean and finalize all data received from the 618 clusters.
A total of 18,540 households were selected for the GDHS sample, of which 18,065 were found to be occupied. Of the occupied households, 17,933 were successfully interviewed, yielding a response rate of 99%. In the interviewed households, 15,317 women age 15–49 were identified as eligible for individual interviews. Interviews were completed with 15,014 women, yielding a response rate of 98%. In the subsample of households selected for the male survey, 7,263 men age 15–59 were identified as eligible for individual interviews and 7,044 were successfully interviewed.
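The response rates quoted above can be recomputed from the counts in the paragraph. The sketch below is a simple unweighted calculation; the men's rate is not stated in the text and is derived here from the given counts, and the assumption that the official rates are plain ratios of these counts is mine.

```python
CLUSTERS = 618
HOUSEHOLDS_PER_CLUSTER = 30

households_selected = CLUSTERS * HOUSEHOLDS_PER_CLUSTER
households_occupied = 18_065
households_interviewed = 17_933

women_eligible, women_interviewed = 15_317, 15_014
men_eligible, men_interviewed = 7_263, 7_044

def response_rate(interviewed, eligible):
    """Unweighted response rate in percent."""
    return 100 * interviewed / eligible

household_rate = response_rate(households_interviewed, households_occupied)
women_rate = response_rate(women_interviewed, women_eligible)
men_rate = response_rate(men_interviewed, men_eligible)  # derived, not quoted in the text

print(households_selected)
print(f"households: {household_rate:.1f}%, women: {women_rate:.1f}%, men: {men_rate:.1f}%")
```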
The estimates from a sample survey are affected by two types of errors: (1) nonsampling errors and (2) sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2022 Ghana Demographic and Health Survey (2022 GDHS) to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2022 GDHS is only one of many samples that could have been selected from the same population, using the same design and identical size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results. A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
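The "plus or minus two standard errors" rule can be written as a short helper. This is a minimal sketch: the estimate and variance below are hypothetical values chosen for illustration, not figures from the 2022 GDHS, and the function name is an assumption.

```python
import math

def approx_95_interval(estimate, standard_error, multiplier=2.0):
    """Approximate 95% confidence interval: estimate +/- multiplier * standard error."""
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin

# Hypothetical survey estimate (a proportion) and its estimated variance.
estimate = 0.40
variance = 0.0001
standard_error = math.sqrt(variance)   # standard error is the square root of the variance

lower, upper = approx_95_interval(estimate, standard_error)
print(f"estimate = {estimate:.2f}, SE = {standard_error:.3f}, interval = ({lower:.2f}, {upper:.2f})")
```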
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2022 GDHS sample was the result of a multistage stratified design, and, consequently, it was necessary to use more complex formulas. The computer software used to calculate sampling errors for the GDHS 2022 is an SAS program. This program used the Taylor linearization method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
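To illustrate the idea behind jackknife repeated replication, the sketch below estimates the standard error of a ratio by dropping one cluster at a time. It is a simplified illustration, not the SAS program used for the survey: it ignores stratification and sampling weights, the cluster totals are hypothetical, and the function name is an assumption.

```python
import math

def jackknife_ratio_se(clusters):
    """Delete-one-cluster jackknife for a ratio r = sum(y) / sum(x).

    clusters: list of (y, x) totals, one pair per cluster.
    Returns (r, standard_error).
    """
    k = len(clusters)
    y_total = sum(y for y, _ in clusters)
    x_total = sum(x for _, x in clusters)
    r = y_total / x_total

    pseudo_values = []
    for y, x in clusters:
        # Ratio recomputed with this cluster removed.
        r_dropped = (y_total - y) / (x_total - x)
        pseudo_values.append(k * r - (k - 1) * r_dropped)

    variance = sum((p - r) ** 2 for p in pseudo_values) / (k * (k - 1))
    return r, math.sqrt(variance)

# Hypothetical cluster totals: (number with the attribute, number of respondents).
example_clusters = [(12, 30), (9, 30), (15, 30), (10, 30)]
ratio, se = jackknife_ratio_se(example_clusters)
print(f"ratio = {ratio:.4f}, jackknife SE = {se:.4f}")
```

When every cluster has the same number of respondents, as in this toy example, the result coincides with the usual standard error of the mean of the cluster-level ratios.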
A more detailed description of the estimates of sampling errors is presented in Appendix B of the survey report.
Data Quality Tables
The survey was conducted in Ghana between December 2012 and July 2014 as part of the Africa Enterprise Survey 2013 roll-out, an initiative of the World Bank. The objective of the survey is to obtain feedback from enterprises on the state of the private sector as well as to help in building a panel of enterprise data that will make it possible to track changes in the business environment over time, thus allowing, for example, impact assessments of reforms. Through interviews with firms in the manufacturing and services sectors, the survey assesses the constraints to private sector growth and creates statistically significant business environment indicators that are comparable across countries.
Data from 720 establishments was analyzed. Stratified random sampling was used to select the surveyed businesses. The data was collected using face-to-face interviews.
The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs and labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures. Over 90 percent of the questions objectively ascertain characteristics of a country’s business environment. The remaining questions assess the survey respondents’ opinions on what are the obstacles to firm growth and performance.
National
The primary sampling unit of the study is an establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population, or the universe, covered in the Enterprise Surveys is the non-agricultural private economy. It comprises: all manufacturing sectors according to the ISIC Revision 3.1 group classification (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this population definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities sectors. Companies with 100% government ownership are not eligible to participate in the Enterprise Surveys.
Sample survey data [ssd]
The sample for Ghana was selected using stratified random sampling. Three levels of stratification were used in this country: firm sector, firm size, and geographic region.
Industry stratification was designed in the way that follows: the universe was stratified into four manufacturing industries (food, textiles and garments, chemicals and plastics, other manufacturing) and two service sectors (retail and other services).
Size stratification was defined following the standardized definition for the Enterprise Surveys: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees).
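The size bands above map directly onto a small classification function. This is an illustrative sketch only: the function name is hypothetical, and returning None for establishments with fewer than 5 employees is an assumption, since the standard definition quoted here starts at 5 employees.

```python
def size_stratum(employees):
    """Classify an establishment by employee count using the bands in the text."""
    if employees < 5:
        return None  # assumption: below the smallest band, outside the three strata
    if employees <= 19:
        return "small"
    if employees <= 99:
        return "medium"
    return "large"


for n in (4, 5, 19, 20, 99, 100):
    print(n, size_stratum(n))
```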
Regional stratification for the Ghana ES was defined in four regions: Accra, North (Kumasi and Tamale), Takoradi, and Tema.
For the Ghana ES, several sample frames were used. The first was supplied by the World Bank and consists of enterprises interviewed in Ghana 2007. The World Bank required that attempts be made to re-interview establishments that responded to the Ghana 2007 survey where they were within the selected geographical regions and met the eligibility criteria. Because the previous round of surveys appeared to have used different stratification criteria (or no stratification at all), and because small firms and firms located in the capital city were prevalent in the 2007 sample, the following convention was used: the presence of panel firms was limited to a maximum of 50% of the achieved interviews in each cell. That sample is referred to as the Panel.
The second frame was constructed using different lists acquired from relevant institutions in Ghana. The main lists used were obtained from the Ghana Statistical Service (GSS). These include: 1) The 2012 Firm Registry. The registry lacked information on firm employee size. 2) The list of firms paying VAT. The VAT dataset included a variable on firms' turnover. The VAT dataset and Firm Registry were merged using the firms' identification number (TIN). VAT information was not available for all firms in the Firm Registry. 3) The list of Large Tax Payers. The Large Tax Payers file also lacked information on firm employee size.
Since firm size was missing from all the lists mentioned above, the following rules were used to predict firm size, after discussion with GSS and with the local contractor:
- All firms that were in the Firm Registry but not in the VAT dataset were considered to be micro firms and were therefore not used in the current survey.
- Firms that were in both the Firm Registry and the VAT dataset were considered to be small firms.
- Firms in the Large Tax Payers dataset were considered medium or large firms.
The original design was divided into two size groups: small firms, and medium and large firms.
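The list-membership rules above can be written as a short decision function. This is a sketch under stated assumptions: the function and label names are hypothetical, the Large Tax Payers list is checked first (the text does not state a precedence), and firms that match none of the three rules are labeled "unknown".

```python
def predicted_size(in_registry, in_vat, in_large_taxpayers):
    """Predict a firm's size group from the lists it appears in."""
    if in_large_taxpayers:
        return "medium_or_large"
    if in_registry and in_vat:
        return "small"
    if in_registry:
        return "micro"  # excluded from the survey according to the text
    return "unknown"  # assumption: not covered by the stated rules


print(predicted_size(in_registry=True, in_vat=False, in_large_taxpayers=False))
print(predicted_size(in_registry=True, in_vat=True, in_large_taxpayers=False))
print(predicted_size(in_registry=False, in_vat=False, in_large_taxpayers=True))
```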
During fieldwork the GSS lists proved to be very inaccurate and insufficient to reach the target sample design. As such, they were complemented with additional lists of firms from the Ghana Chamber of Commerce and Industry and Business Associations. The list from the Ghana Chamber of Commerce lacked information on firm employee size or firm turnover. Given the impact that non-eligible units included in the sample universe may have on the results, adjustments may be needed when computing the appropriate weights for individual observations. The percentage of confirmed non-eligible units as a proportion of the total number of sampled establishments contacted for the survey was 1.3% (26 out of 1,990 establishments).
Finally, a block enumeration was also undertaken in order to build an additional list. The block enumeration made it possible to physically create a list of establishments from which to sample. A total of 41 blocks were enumerated in the four locations included in the project, out of the 804 blocks identified. The enumeration was conducted without major problems in the time planned. The list of enumerated firms contained 958 records eligible for the main Enterprise Survey.
Note: Unlike the standard ES, the universe for the Ghana ES is characterized by the presence of 5 size categories. The category medium&large was added as a stratum in order to sample from the GSS large payers list, while the category "unknow size" was included in order to sample the firms in the Chamber of Commerce and Industry list.
Face-to-face [f2f]
The following survey instruments are available: - Manufacturing Module Questionnaire - Services Module Questionnaire
The survey is fielded via manufacturing or services questionnaires in order not to ask questions that are irrelevant to specific types of firms, e.g. a question that relates to production and nonproduction workers should not be asked of a retail firm. In addition to questions that are asked across countries, all surveys are customized and contain country-specific questions. An example of customization would be including tourism-related questions that are asked in certain countries when tourism is an existing or potential sector of economic growth.
There is a skip pattern in the Services Module Questionnaire for questions that apply only to retail firms.
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect "Refusal to respond" (-8) as a different option from "Don't know" (-9). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary.
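Because refusals (-8) and "Don't know" answers (-9) are stored as distinct codes, item non-response can be tallied separately for each question. The sketch below shows one way to do that; the function name and the sample answers are hypothetical, and it assumes every other value is a valid response.

```python
REFUSED = -8      # "Refusal to respond" code from the text
DONT_KNOW = -9    # "Don't know" code from the text


def item_nonresponse(values):
    """Count valid answers, refusals, and don't-knows for one question."""
    counts = {"valid": 0, "refused": 0, "dont_know": 0}
    for value in values:
        if value == REFUSED:
            counts["refused"] += 1
        elif value == DONT_KNOW:
            counts["dont_know"] += 1
        else:
            counts["valid"] += 1  # assumption: anything else is a real answer
    return counts


# Hypothetical answers to a single sensitive question.
print(item_nonresponse([3, -8, 1, -9, -9, 2]))
```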
Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve
The primary objective of the 2014 GDHS was to generate recent reliable information on fertility, family planning, infant and child mortality, maternal and child health, and nutrition. In addition, the survey collected specialised data on malaria treatment, prevention, and prevalence among children age 6-59 months; blood pressure among adults; anaemia among women and children; and HIV prevalence among adults. This information is essential for making informed policy decisions and for planning, monitoring, and evaluating programmes related to health in general, and reproductive health in particular, at both the national and regional levels. Analysis of data collected in the 2014 GDHS provides updated estimates of basic demographic and health indicators covered in the earlier rounds of the 1988, 1993, 1998, 2003, and 2008 surveys.
The GDHS will assist policymakers and programme managers in evaluating and designing programmes and strategies for improving the health of Ghana’s population. The 2014 GDHS also provides comparable data for long-term trend analysis in Ghana, since the surveys were implemented by the same organisation, using similar data collection procedures. Furthermore, the survey adds to the international database on demographic and health–related information for research purposes.
National
Sample survey data [ssd]
The sampling frame used for the 2014 GDHS is an updated frame from the 2010 Ghana Population and Housing Census provided by the Ghana Statistical Service (GSS 2013b). The sampling frame excluded nomadic and institutional populations such as persons in hotels, barracks, and prisons.
The 2014 GDHS followed a two-stage sample design and was intended to allow estimates of key indicators at the national level as well as for urban and rural areas and each of Ghana's 10 administrative regions. The first stage involved selecting sample points (clusters) consisting of enumeration areas (EAs) delineated for the 2010 PHC. A total of 427 clusters were selected, 216 in urban areas and 211 in rural areas.
The second stage involved the systematic sampling of households. A household listing operation was undertaken in all the selected EAs in January-March 2014, and households to be included in the survey were randomly selected from the list. About 30 households were selected from each cluster to constitute the total sample size of 12,831 households. Because of the approximately equal sample sizes in each region, the sample is not self-weighting at the national level, and weighting factors have been added to the data file so that the results will be proportional at the national level.
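The second-stage selection described above can be illustrated with a generic systematic sampler: pick a random start, then step through the household list at a fixed interval. This is a sketch, not the procedure actually run for the 2014 GDHS; the function name, the fractional-interval approach, and the example list of 150 households are assumptions.

```python
import random


def systematic_sample(units, n, rng=None):
    """Select n units from a list using a random start and a fixed interval."""
    rng = rng or random.Random()
    total = len(units)
    if not 0 < n <= total:
        raise ValueError("n must be between 1 and the number of listed units")
    interval = total / n
    start = rng.random() * interval  # random start in [0, interval)
    picks = []
    for i in range(n):
        index = min(int(start + i * interval), total - 1)  # guard against rounding
        picks.append(units[index])
    return picks


# Hypothetical cluster with 150 listed households, from which 30 are drawn.
households = list(range(150))
selected = systematic_sample(households, 30, random.Random(2014))
print(selected)
```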
All women age 15-49 who were either permanent residents of the selected households or visitors who stayed in the household the night before the survey were eligible to be interviewed and have their blood pressure measured.
In half of the households, all men age 15-59 who were either permanent residents of the selected households or visitors who stayed in the households the night before the survey were eligible to be interviewed. In addition, in the subsample of households selected for the male survey:
• blood pressure measurements were performed among eligible men who consented to being tested;
• children age 6-59 months were tested for anaemia and malaria with the parent's or guardian's consent;
• eligible women who consented were tested for anaemia;
• blood samples were collected for laboratory testing of HIV from eligible women and men who consented; and
• height and weight information was collected from eligible women, men, and children age 0-59 months.
For further details on sample selection, see Appendix A of the final report.
Face-to-face [f2f]
Three questionnaires were used for the 2014 GDHS: the Household Questionnaire, the Woman’s Questionnaire, and the Man’s Questionnaire. These questionnaires, which were based on standard Demographic and Health Survey (DHS) questionnaires, were adapted to reflect the population and health issues relevant to Ghana. Comments on the questionnaires were solicited from various stakeholders representing government ministries and agencies, nongovernmental organisations, and international donors. The definitive questionnaires were first prepared in English; they were then translated into the major local languages, namely Akan, Ga, and Ewe.
The Household Questionnaire was used to list all the members of and visitors to the selected households. Basic demographic information was collected on the characteristics of each person listed, including his or her age, sex, marital status, education, and relationship to the head of the household. For children under age 18, parents’ survival status was determined. The data on age and sex of household members obtained in the Household Questionnaire were used to identify women and men who were eligible for individual interviews. The Household Questionnaire also included questions on child education as well as the characteristics of the household’s dwelling unit, such as source of water, type of toilet facilities, materials used for the floor of the dwelling unit, and ownership of various durable goods.
The Woman’s Questionnaire was used to collect information from all eligible women age 15-49.
In half of the selected households, the Man’s Questionnaire was administered to all men age 15-59. The Man’s Questionnaire collected much of the same information found in the Woman’s Questionnaire but was shorter because it did not contain a detailed reproductive history or questions on maternal and child health.
The data processing operation included 100 percent verification (also called second data entry) and secondary editing, which involved resolution of computer-identified inconsistencies. The data processing activities at the central office were led by one key GSS officer who took part in the main fieldwork training. Data processing was accomplished using CSPro software. Data entry and editing were initiated in September 2014 and completed in February 2015.
A total of 12,831 households were selected for the sample, of which 12,010 were occupied. Of the occupied households, 11,835 were successfully interviewed, yielding a response rate of 99 percent, the same as the 2008 GDHS household response rate (GSS, GHS, and ICF Macro 2009).
In the interviewed households, 9,656 eligible women were identified for individual interviews; interviews were completed with 9,396 women, yielding a response rate of 97 percent. In the subsample of households selected for the male survey, 4,609 eligible men were identified and 4,388 were successfully interviewed, yielding a response rate of 95 percent. The lower response rate for men was likely due to their more frequent and longer absences from the household.
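The response rates quoted above follow from dividing completed interviews by eligible units and rounding to the nearest percent. The short check below reproduces them from the counts given in the text; the function name is hypothetical.

```python
def response_rate(completed, eligible):
    """Return the response rate as a whole-number percentage."""
    return round(100 * completed / eligible)


# Counts taken from the text above.
print("households:", response_rate(11835, 12010))
print("women:", response_rate(9396, 9656))
print("men:", response_rate(4388, 4609))
```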
The estimates from a sample survey are affected by two types of errors: non-sampling errors and sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2014 Ghana DHS (GDHS) to minimize this type of error, non-sampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2014 GDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2014 GDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulae. Sampling errors are computed in either ISSA or SAS, using programs developed by ICF International. These programs use the Taylor linearization method of variance estimation for survey estimates that are means, proportions or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
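The jackknife repeated replication idea mentioned above can be sketched for a simple ratio: drop one cluster at a time, recompute the estimate, and measure how much the replicates vary. This is an illustration only, not the ISSA or SAS programs developed by ICF International. It uses the pseudo-value form of the delete-one-cluster jackknife, ignores stratification, and the function name and example data are hypothetical.

```python
from math import sqrt


def ratio_se_jackknife(clusters):
    """Delete-one-cluster jackknife standard error of r = y/x.

    clusters: list of (y_total, x_total), one tuple per cluster.
    """
    k = len(clusters)
    if k < 2:
        raise ValueError("at least two clusters are required")
    y = sum(c[0] for c in clusters)
    x = sum(c[1] for c in clusters)
    r = y / x

    pseudo_values = []
    for y_i, x_i in clusters:
        r_without_i = (y - y_i) / (x - x_i)  # estimate with cluster i removed
        pseudo_values.append(k * r - (k - 1) * r_without_i)

    variance = sum((p - r) ** 2 for p in pseudo_values) / (k * (k - 1))
    return r, sqrt(variance)


# Hypothetical cluster totals.
r, se = ratio_se_jackknife([(2, 10), (4, 10), (3, 10), (3, 10)])
print(r, se)
```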
The Taylor linearization method treats any percentage or average as a ratio estimate, r = y/x, where y represents the total sample value for variable y, and x represents the
The National Congregations Study (NCS) dataset "fills a void in the sociological study of congregations by providing, for the first time, data that can be used to draw a nationally aggregate picture of congregations" (Chaves et al. 1999, p.460). Thanks to innovations in sampling techniques, the NCS data is the first nationally representative sample of American congregations. In 2006-07, a panel component was added to the NCS. In addition to the new cross-section of congregations generated in conjunction with the 2006 General Social Survey (GSS), a stratified random sample was drawn from congregations that participated in the 1998 NCS. A full codebook, prepared by the primary investigator, is available for download at https://sites.duke.edu/ncsweb/. The codebook contains the original questionnaire, as well as detailed information on survey methodology, weights, coding, and more.
Variable names have been shortened to allow for downloading of the data set as an SPSS portable file. Original variable names are shown in parentheses at the beginning of each variable description.
The NCS Cumulative Dataset (/data-archive?fid=NCSIV) is also available from the ARDA.
The National Congregations Study (NCS) dataset fills a void in the sociological study of congregations by providing data that can be used to draw a nationally aggregate picture of congregations. Thanks to innovations in sampling techniques, the 1998 NCS data was the first nationally representative sample of American congregations. Subsequent NCS waves were conducted in 2006-07, 2012, and 2018-19.
Like Wave II, Wave IV again included a panel component. In addition to the new cross-section of congregations generated in conjunction with the 2018 GSS, the NCS-IV included all Wave III congregations that were nominated by GSS respondents who participated in the GSS for the first time in 2012. That is, the panel did not include Wave III congregations that had been nominated by GSS respondents who were in the 2012 GSS because they were part of the GSS's own panel of re-interviewees. The 2018-19 NCS, then, includes a subset of congregations that also were interviewed in 2012. A full codebook, prepared by the primary investigator and containing a section with details about the panel datasets, is available for download at https://sites.duke.edu/ncsweb/files/2020/09/NCS-I-IV-Cumulative-Codebook_FINAL_8Sept2020.pdf. The codebook contains the original questionnaire, as well as detailed information on survey methodology, weights, coding, and more.
The NCS Cumulative Dataset (/data-archive?fid=NCSIV) is also available from the ARDA.
The Ghana Living Standards Survey (GLSS7) primarily focused on consumption poverty and inequality in Ghana. It also examined some poverty-related issues such as asset ownership and access to services and human development. The GLSS7 survey analyzed macroeconomic developments in the country since 2005, focusing on growth in gross domestic product (GDP), trends in inflation, balance of payments, and public expenditures.
In the previous survey in 2012/13, a new consumption basket was derived, and this produced new poverty lines and a new set of items to be included in the welfare measurement. A review of this basket reveals that there is no drastic change in the consumption pattern, and therefore the basket was maintained for the current survey. GLSS7 examined the pattern of poverty in Ghana since 2005 based on the 2012/13 basket.
The data collection for the survey was carried out by the Ghana Statistical Service (GSS). A nationally representative sample of about 15,000 households, in 1,000 Enumeration Areas (EAs), was interviewed over a period of 12 months. The specific objectives of the GLSS7 survey were:
National
Sample survey data [ssd]
A nationally representative sample of households was selected in order to achieve the survey objectives. After the selection of EAs and before the main survey, a household listing operation was carried out in all the selected EAs. The household listing operation consists of visiting each of the 1,000 selected EAs to record all structures and households within the EAs with the addresses and the names of the heads of the households using Computer Assisted Personal Interviewing (CAPI). The listed households served as the sampling frame for the selection of 15 households in the second stage selection for the main survey using a systematic sampling method.
There was a two-stage sampling procedure. In the first stage enumeration areas (EAs) were selected based on the 2010 Population and Housing Census, with probability proportional to size (number of households). At the second stage a fixed number of households were selected by systematic sampling within each of the selected EAs.
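The first-stage selection with probability proportional to size can be sketched as systematic PPS: lay the EAs end to end by their household counts, then take equally spaced points along that line. This is a generic illustration rather than the procedure GSS actually ran; the function name and example counts are hypothetical. One caveat of this simple version is that an EA larger than the sampling interval can be selected more than once.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_systematic(sizes, n, rng=None):
    """Return the indices of n units selected with probability proportional to size."""
    rng = rng or random.Random()
    total = sum(sizes)
    if n <= 0 or total <= 0:
        raise ValueError("n and the total size must be positive")
    interval = total / n
    start = rng.random() * interval  # random start in [0, interval)
    cumulative = list(accumulate(sizes))
    picks = []
    for i in range(n):
        point = start + i * interval
        # First unit whose cumulative size exceeds the selection point.
        index = bisect_right(cumulative, point)
        picks.append(min(index, len(sizes) - 1))  # guard against rounding
    return picks


# Hypothetical household counts for eight EAs, selecting three of them.
ea_households = [120, 80, 200, 150, 90, 60, 180, 120]
print(pps_systematic(ea_households, 3, random.Random(7)))
```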
Face-to-face [f2f]
The GLSS comprised the following questionnaires:
1. Household Questionnaire Module A
2. Household Questionnaire Module B
3. Section 13: Governance, Peace, Security and Data protection
4. Price Data Questionnaire
5. Community Questionnaire
6. Non-farm Enterprise Questionnaire
https://creativecommons.org/publicdomain/zero/1.0/
INTRODUCTION. The GSS has been conducted since 1972. It surveys a representative sample of the adult population of the United States and asks a standard core of demographic, behavioral, and attitudinal questions, including beliefs about social and political issues, plus topics of special interest. It is widely used by politicians, policy makers, and researchers to monitor and explain trends and constants in attitudes, behaviors, and attributes. Among the topics covered are civil liberties, crime and violence, intergroup tolerance, morality, national spending priorities, psychological well-being, social mobility, and stress and traumatic events. Altogether the GSS is the single best source for sociological and attitudinal trend data covering the United States. It allows researchers to examine the structure and functioning of society in general as well as the role played by relevant subgroups and to compare the United States to other nations. Source: http://gss.norc.org/About-The-GSS
About the Data. The survey is conducted through face-to-face, in-person interviews by the National Opinion Research Center (NORC) at the University of Chicago. However, participation in the study is strictly voluntary. Therefore, a study based on the GSS sample data is:
• generalizable to the target population if we ignore the non-response bias;
• definitely not causal, because the study does not employ random assignment and is only observational.
Research Question. I'm interested in exploring the relationship between people's job preferences and their education status, using the latest data. More specifically, are people's preferences in a job (such as job security, high income, or short working hours) associated with the highest degree they have received? Motivation: Aside from sleeping, working is the activity that takes up the most of our lifetime hours, and it has a huge impact on people's well-being and happiness. I am interested in the factors that shape people's attitudes toward their jobs.
Reading the data. The data we’ll use is from the General Social Survey (GSS). Using the GSS Data Explorer, I selected a subset of the variables in the GSS and made it available along with this notebook. The survey contains more than 5,000 variables with data on a wide range of subjects; I have selected just a few.