The Project for Statistics on Living Standards and Development was a countrywide World Bank Living Standards Measurement Survey. It covered approximately 9000 households, drawn from a representative sample of South African households. The fieldwork was undertaken during the nine months leading up to the country's first democratic elections at the end of April 1994. The purpose of the survey was to collect statistical information about the conditions under which South Africans live in order to provide policymakers with the data necessary for planning strategies. This data would aid the implementation of goals such as those outlined in the Government of National Unity's Reconstruction and Development Programme.
National
Households
All Household members. Individuals in hospitals, old age homes, hotels and hostels of educational institutions were not included in the sample. Migrant labour hostels were included. In addition to those that turned up in the selected ESDs, a sample of three hostels was chosen from a national list provided by the Human Sciences Research Council and within each of these hostels a representative sample was drawn on a similar basis as described above for the households in ESDs.
Sample survey data [ssd]
(a) SAMPLING DESIGN
Sample size is 9,000 households. The sample design adopted for the study was a two-stage self-weighting design in which the first-stage units were Census Enumerator Subdistricts (ESDs, or their equivalent) and the second-stage units were households. The advantage of using such a design is that it provides a representative sample that need not be based on an accurate census population distribution. In the case of South Africa, the sample will automatically include many poor people, without the need to go beyond this and oversample the poor. Proportionate sampling, as in such a self-weighting sample design, offers the simplest possible data files for further analysis, as weights do not have to be added. However, in the end this advantage could not be retained, and weights had to be added.
(b) SAMPLE FRAME
The sampling frame was drawn up on the basis of small, clearly demarcated area units, each with a population estimate. The nature of the self-weighting procedure adopted ensured, however, that this population estimate was not important for determining the final sample. For most of the country, census ESDs were used. Where some ESDs comprised relatively large populations, as for instance in some black townships such as Soweto, aerial photographs were used to divide the areas into blocks of approximately equal population size. In other instances, particularly in some of the former homelands, the area units were not ESDs but villages or village groups. In the sample design chosen, the area stage units (generally ESDs) were selected with probability proportional to size, based on the census population. Systematic sampling was used throughout, that is, sampling at a fixed interval in a list of ESDs, starting at a randomly selected starting point. Given that sampling was self-weighting, the impact of stratification was expected to be modest. The main objective was to ensure that the racial and geographic breakdown approximated the national population distribution. This was done by listing the area stage units (ESDs) by statistical region and then within the statistical region by urban or rural. Within these sub-statistical regions, the ESDs were then listed in order of percentage African. The sampling interval for the selection of the ESDs was obtained by dividing the 1991 census population of 38,120,853 by the 300 clusters to be selected. This yielded 105,800. Starting at a randomly selected point, every 105,800th person down the cluster list was selected. This ensured both geographic and racial diversity (ESDs were ordered by statistical sub-region and proportion of the population African). In three or four instances, the ESD chosen was judged inaccessible and replaced with a similar one. In the second sampling stage the unit of analysis was the household.
In each selected ESD a listing or enumeration of households was carried out by means of a field operation. From the households listed in an ESD a sample of households was selected by systematic sampling. Even though the ultimate enumeration unit was the household, in most cases "stands" were used as enumeration units. However, when a stand was chosen as the enumeration unit all households on that stand had to be interviewed.
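The first-stage selection described above (a fixed interval applied to a cumulative population list, from a random start) can be sketched as follows. This is a minimal illustration on made-up area sizes, not the survey team's actual program; the function name `pps_systematic` and its parameters are assumptions introduced here.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n_clusters, seed=None):
    """Systematic selection with probability proportional to size.

    sizes      -- population of each area unit, in list order
    n_clusters -- number of selections to make
    Returns (indices of the selected units, sampling interval).
    A unit larger than the interval can be selected more than once.
    """
    total = sum(sizes)
    interval = total / n_clusters           # e.g. population / clusters
    rng = random.Random(seed)
    start = rng.uniform(0, interval)        # random starting point
    cumulative = list(accumulate(sizes))    # running population totals

    chosen = []
    i = 0
    for k in range(n_clusters):
        target = start + k * interval       # every interval-th person
        # Advance to the unit whose cumulative range contains the target.
        while i < len(cumulative) - 1 and cumulative[i] <= target:
            i += 1
        chosen.append(i)
    return chosen, interval


# Toy list of ten equally sized areas (hypothetical figures).
selected, step = pps_systematic([100] * 10, n_clusters=5, seed=1)
print(step, selected)
```

Because the list is sorted before selection (by region, urban or rural, and percentage African in the survey), a fixed interval spreads the selections across those groups without separate strata.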
Face-to-face [f2f]
All the questionnaires were checked when received. Where information was incomplete or appeared contradictory, the questionnaire was sent back to the relevant survey organization. As soon as the data was available, it was captured using the local development platform ADE. This was completed in February 1994. Following this, a series of exploratory programs was written to highlight inconsistencies and outliers. For example, all person-level files were linked together to ensure that the same person code reported in different sections of the questionnaire corresponded to the same person. The error reports from these programs were compared to the questionnaires and the necessary alterations made. This was a lengthy process, as several files were checked more than once, and it was completed at the beginning of August 1994. In some cases, questionnaires would contain missing values, or comments that the respondent did not know, or refused to answer a question.
These responses are coded in the data files with the following values:
VALUE   MEANING
-1      The data was not available on the questionnaire or form
-2      The field is not applicable
-3      Respondent refused to answer
-4      Respondent did not know answer to question
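When analysing the files, these codes need to be separated from real values before any arithmetic is done. The sketch below shows one way to do that. The helper names are hypothetical, and it assumes the variable being processed cannot legitimately take the values -1 to -4.

```python
# Missing-value codes as documented above.
MISSING_CODES = {
    -1: "not available on the questionnaire or form",
    -2: "not applicable",
    -3: "refused to answer",
    -4: "did not know",
}


def split_value(raw):
    """Return (value, reason); value is None when raw is a missing code."""
    if raw in MISSING_CODES:
        return None, MISSING_CODES[raw]
    return raw, None


def mean_valid(values):
    """Mean of the entries that are not missing codes (None if none remain)."""
    valid = [v for v in values if v not in MISSING_CODES]
    return sum(valid) / len(valid) if valid else None


print(split_value(-3))               # (None, 'refused to answer')
print(mean_valid([10, -1, 20, -4]))  # 15.0
```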
The data collected in clusters 217 and 218 should be viewed as highly unreliable and have therefore been removed from the data set. The data currently available on the web site has been revised to remove the data from these clusters. Researchers who have downloaded the data in the past should revise their data sets. For information on the data in those clusters, contact SALDRU http://www.saldru.uct.ac.za/.
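For an older download that still contains these clusters, the revision amounts to a simple filter. The sketch below assumes each record is a dictionary with a cluster identifier stored under a key named `cluster`. That key name is hypothetical and should be replaced by the actual variable name in the file being used.

```python
# Clusters flagged as highly unreliable in the documentation.
UNRELIABLE_CLUSTERS = {217, 218}


def drop_unreliable(records, cluster_key="cluster"):
    """Return only the records that are not in the flagged clusters."""
    return [r for r in records if int(r[cluster_key]) not in UNRELIABLE_CLUSTERS]


rows = [
    {"cluster": 216, "hhid": 1},
    {"cluster": 217, "hhid": 2},
    {"cluster": "218", "hhid": 3},   # identifiers read from text files are strings
    {"cluster": 219, "hhid": 4},
]
print([r["hhid"] for r in drop_unreliable(rows)])  # [1, 4]
```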
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Pen-and-paper homework and project-based learning are both commonly used instructional methods in introductory statistics courses. However, there have been few studies comparing these two methods exclusively. In this case study, each was used in two different sections of the same introductory statistics course at a regional state university. Students’ statistical literacy was measured by exam scores across the course, including the final. The comparison of the two instructional methods includes descriptive statistics and two-sample t-tests, as well as the authors’ reflections on the instructional methods. Results indicated that there is no statistically discernible difference between the two instructional methods in the introductory statistics course.
BACKGROUND
The data contained in the compressed file has been extracted from the Marketing Carrier On-Time Performance (Beginning January 2018) data table of the "On-Time" database from the TranStats data library. The time period is indicated in the name of the compressed file; for example, XXX_XXXXX_2001_1 contains data of the first month of the year 2001.
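The naming convention can be decoded programmatically. The sketch below assumes only what the example shows: the name ends in `_<year>_<month>`, optionally followed by a file extension. The function name `parse_period` is introduced here for illustration.

```python
import re

# Trailing "_YYYY_M" or "_YYYY_MM", optionally followed by an extension.
_PERIOD = re.compile(r"_(\d{4})_(\d{1,2})(?:\.\w+)?$")


def parse_period(filename):
    """Return (year, month) encoded at the end of a compressed-file name."""
    match = _PERIOD.search(filename)
    if match is None:
        raise ValueError(f"no year/month suffix in {filename!r}")
    year, month = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {filename!r}")
    return year, month


print(parse_period("XXX_XXXXX_2001_1"))  # (2001, 1)
```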
RECORD LAYOUT
Below are fields in the order that they appear on the records:
Year: Year
Quarter: Quarter (1-4)
Month: Month
DayofMonth: Day of Month
DayOfWeek: Day of Week
FlightDate: Flight Date (yyyymmdd)
Marketing_Airline_Network: Unique Marketing Carrier Code. When the same code has been used by multiple carriers, a numeric suffix is used for earlier users, for example, PA, PA(1), PA(2). Use this field for analysis across a range of years.
Operated_or_Branded_Code_Share_Partners: Reporting Carrier Operated or Branded Code Share Partners
DOT_ID_Marketing_Airline: An identification number assigned by US DOT to identify a unique airline (carrier). A unique airline (carrier) is defined as one holding and reporting under the same DOT certificate regardless of its Code, Name, or holding company/corporation.
IATA_Code_Marketing_Airline: Code assigned by IATA and commonly used to identify a carrier. As the same code may have been assigned to different carriers over time, the code is not always unique. For analysis, use the Unique Carrier Code.
Flight_Number_Marketing_Airline: Flight Number
Originally_Scheduled_Code_Share_Airline: Unique Scheduled Operating Carrier Code. When the same code has been used by multiple carriers, a numeric suffix is used for earlier users, for example, PA, PA(1), PA(2). Use this field for analysis across a range of years.
DOT_ID_Originally_Scheduled_Code_Share_Airline: An identification number assigned by US DOT to identify a unique airline (carrier). A unique airline (carrier) is defined as one holding and reporting under the same DOT certificate regardless of its Code, Name, or holding company/corporation.
IATA_Code_Originally_Scheduled_Code_Share_Airline: Code assigned by IATA and commonly used to identify a carrier. As the same code may have been assigned to different carriers over time, the code is not always unique. For analysis, use the Unique Carrier Code.
Flight_Num_Originally_Scheduled_Code_Share_Airline: Flight Number
Operating_Airline: Unique Carrier Code. When the same code has been used by multiple carriers, a numeric suffix is used for earlier users, for example, PA, PA(1), PA(2). Use this field for analysis across a range of years.
DOT_ID_Operating_Airline: An identification number assigned by US DOT to identify a unique airline (carrier). A unique airline (carrier) is defined as one holding and reporting under the same DOT certificate regardless of its Code, Name, or holding company/corporation.
IATA_Code_Operating_Airline: Code assigned by IATA and commonly used to identify a carrier. As the same code may have been assigned to different carriers over time, the code is not always unique. For analysis, use the Unique Carrier Code.
Tail_Number: Tail Number
Flight_Number_Operating_Airline: Flight Number
OriginAirportID: Origin Airport, Airport ID. An identification number assigned by US DOT to identify a unique airport. Use this field for airport analysis across a range of years because an airport can change its airport code and airport codes can be reused.
OriginAirportSeqID: Origin Airport, Airport Sequence ID. An identification number assigned by US DOT to identify a unique airport at a given point of time. Airport attributes, such as airport name or coordinates, may change over time.
OriginCityMarketID: Origin Airport, City Market ID. City Market ID is an identification number assigned by US DOT to identify a city market. Use this field to consolidate airports serving the same city market.
Origin: Origin Airport
OriginCityName: Origin Airport, City Name
OriginState: Origin Airport, State Code
OriginStateFips: Origin Airport, State Fips
OriginStateName: Origin Airport, State Name
OriginWac: Origin Airport, World Area Code
DestAirportID: Destination Airport, Airport ID. An identification number assigned by US DOT to identify a unique airport. Use this field for airport analysis across a range of years because an airport can change its airport code and airport codes can be reused.
DestAirportSeqID: Destination Airport, Airport Sequence ID. An identification number assigned by US DOT to identify a unique airport at a given point of time. Airport attributes, such as airport name or coordinates, may change over time.
DestCityMarketID: Destination Airport, City Market ID. City Market ID is an identification number assigned by US DOT to identify a city market. Use this field to consolidate airports serving the same city market.
Dest: Destination Airport
DestCityName: Destination Airport, City Name
DestState: Destination Airport, State Code
DestStateFips: D...
Financial overview and grant giving statistics of Metro Ideas Project
Comprehensive YouTube channel statistics for 5-Minute Projects and Design Ideas, featuring 313,000 subscribers and 48,555,113 total views. This dataset includes detailed performance metrics such as subscriber growth, video views, engagement rates, and estimated revenue. The channel operates in the Lifestyle category and is based in US. Track 1,212 videos with daily and monthly performance data, including view counts, subscriber changes, and earnings estimates. Analyze growth trends, engagement patterns, and compare performance against similar channels in the same category.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This text aims to foster reflection and critical thinking in the process of developing research projects in clinical nutrition. We present aspects regarding the evidence, validity, and reliability of results of studies in this field. Appropriate study planning is critical, from defining the design and type of experiment, going through the ethical aspects, population choice, and calculation of sample size, to the assessment of the feasibility of the risks involved in study execution. Once the information is collected, the next stages correspond to the description of the results, statistical analyses, verification of the consistency of these results, and ultimately their correct interpretation.
This is a collection of statistical projects in which I used Microsoft Excel. The definition of each project was given by ProfessionAI, while the statistical analysis was done by me. More specifically:
- customer_complaints_assignment is an example from Introduction to Data Analytics where, given a dataset of complaints from customers of financial companies, filtering, counting and basic analytics tasks were carried out;
- trades_on_exchanges is a project for Advanced Data Analytics in which a statistical analysis of trading operations was done;
- progetto_finale_inferenza is a project on Statistical Inference in which, starting from a toy dataset about the population of a city, an inferential analysis was made.
The Community Survey (CS) is a nationally representative, large-scale household survey which was conducted from February to March 2007. The Community Survey is designed to provide information on the trends and levels of demographic and socio-economic data, such as population size and distribution; the extent of poor households; access to facilities and services, and the levels of employment/unemployment at national, provincial and municipality level. The data can be used to assist government and the private sector in the planning, evaluation and monitoring of programmes and policies. The information collected can also be used to assess the impact of socio-economic policies and provide an indication as to how far the country has gone in its strides to eradicate poverty.
Censuses 1996 and 2001 are the only all-inclusive censuses that Statistics South Africa has thus far conducted under the new democratic dispensation. Demographic and socio-economic data were collected and the results have enabled government and all other users of this information to make informed decisions. When cabinet took a decision that Stats SA should not conduct a census in 2006, it created a gap in information or data between Census 2001 and the next Census scheduled to be carried out in 2011. A decision was therefore taken to carry out the Community Survey in 2007.
The main objectives of the survey were:
· To provide estimates at lower geographical levels than existing household surveys;
· To build human, management and logistical capacities for Census 2011; and
· To provide inputs into the preparation of the mid-year population projections.
The wider project strategic theme is to provide relevant statistical information that meets user needs and aspirations. Some of the main topics that are covered by the survey include demography, migration, disability and social grants, educational levels, employment and economic activities.
The survey covered the whole of South Africa, including all nine provinces as well as the four settlement types - urban-formal, urban-informal, rural-formal (commercial farms) and rural-informal (tribal areas).
Households
The Community Survey covered all de jure household members (usual residents) in South Africa. The survey excluded collective living quarters (institutions) and some households in EAs classified as recreational areas or institutions. However, an approximation of the out-of-scope population was made from the 2001 Census and added to the final estimates of the CS 2007 results.
Sample survey data [ssd]
Sample Design
The sampling procedure that was adopted for the CS was a two-stage stratified random sampling process. Stage one involved the selection of enumeration areas, and stage two was the selection of dwelling units.
Since the data are required for each local municipality, each municipality was considered as an explicit stratum. The stratification is done for those municipalities classified as category B municipalities (local municipalities) and category A municipalities (metropolitan areas) as proclaimed at the time of Census 2001. However, the newly proclaimed boundaries as well as any other higher level of geography such as province or district municipality, were considered as any other domain variable based on their link to the smallest geographic unit - the enumeration area.
The Frame
The Census 2001 enumeration areas were used because they give a full geographic coverage of the country without any overlap. Although changes in settlement type, growth or movement of people have occurred, the enumeration areas assisted in getting a spatial comparison over time. Out of 80 787 enumeration areas countrywide, 79 466 were considered in the frame. A total of 1 321 enumeration areas were excluded (919 covering institutions and 402 recreational areas).
On the second level, the listing exercise yielded the dwelling frame which facilitated the selection of dwellings to be visited. The dwelling unit is a structure or part of a structure or group of structures occupied or meant to be occupied by one or more households. Some of these structures may be vacant and/or under construction, but can be lived in at the time of the survey. A dwelling unit may also be within collective living quarters where applicable (examples of each are a house, a group of huts, a flat, hostels, etc.).
The Community Survey universe at the second-level frame is dependent on whether the different structures are classified as dwelling units (DUs) or not. Structures where people stay/live were listed and classified as dwelling units. However, there are special cases of collective living quarters that were also included in the CS frame. These are religious institutions such as convents or monasteries, and guesthouses where people stay for an extended period (more than a month). Student residences - based on how long people have stayed (more than a month) - and old-age homes not similar to hospitals (where people are living in a communal set-up) were treated the same as hostels, thereby listing either the bed or room. In addition, any other family staying in separate quarters within the premises of an institution (like wardens' quarters, military family quarters, teachers' quarters and medical staff quarters) were considered as part of the CS frame. The inclusion of such group quarters in the frame is based on the living circumstances within these structures. Members are independent of each other with the exception that they sleep under one roof.
The remaining group quarters were excluded from the CS frame because they are difficult to access and have no stable composition. Excluded dwelling types were prisons, hotels, hospitals, military barracks, etc. This is in addition to the exclusion on first level of the enumeration areas (EAs) classified as institutions (military bases) or recreational areas (national parks).
The Selection of Enumeration Areas (EAs)
The EAs within each municipality were ordered by geographic type and EA type. The selection was done by using systematic random sampling. The criteria used were as follows: In municipalities with fewer than 30 EAs, all EAs were automatically selected. In municipalities with 30 or more EAs, the sample selection used a fixed proportion of 19% of all sampled EAs. However, if fewer than 30 EAs were selected in a municipality, the sample in the municipality was increased to 30 EAs.
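The allocation rule above can be written as a small function. This is a sketch under two assumptions that the text does not state: the 19% is applied to the number of EAs in the municipality, and fractional results are rounded up. The function name is introduced here for illustration.

```python
import math


def ea_sample_size(n_eas, percent=19, minimum=30):
    """Number of EAs to select in a municipality with n_eas EAs.

    Fewer than `minimum` EAs: take all of them.
    Otherwise: take `percent` of the EAs (rounded up, an assumption),
    but never fewer than `minimum`.
    """
    if n_eas < minimum:
        return n_eas
    return max(minimum, math.ceil(n_eas * percent / 100))


for n in (12, 100, 1000):
    print(n, ea_sample_size(n))
```

Under these assumptions the minimum of 30 applies in every municipality with fewer than about 158 EAs, so small municipalities are sampled at a much higher rate than large ones.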
The Selection of Dwelling Units
The second level of the frame required a full re-listing of dwelling units. The listing exercise was undertaken before the selection of DUs. The adopted listing methodology ensured that the listing route was determined by the lister. This approach facilitated the serpentine selection of dwelling units. The listing exercise provided a complete list of dwelling units in the selected EAs. Only those structures that were classified as dwelling units were considered for selection, whether vacant or occupied. This exercise yielded a total of 2 511 314 dwelling units.
The selection of the dwelling units was also based on a fixed proportion of 10% of the total listed dwellings in an EA. A constraint was imposed on small-size EAs where, if the listed dwelling units were fewer than 10, the selection was increased to 10 dwelling units. All households within the selected dwelling units were covered. There was no replacement of refusals, vacant dwellings or non-contacts owing to their impact on the probability of selection.
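The dwelling-unit rule can be sketched in the same way. The text does not say how fractions were rounded or what happened when an EA had fewer than 10 listed dwellings, so this sketch rounds up and caps the sample at the number listed. Both choices, and the function name, are assumptions made here.

```python
import math


def du_sample_size(n_listed, percent=10, minimum=10):
    """Number of dwelling units to select from an EA's listing.

    Takes `percent` of the listed dwellings (rounded up, an assumption),
    raises the result to `minimum` for small EAs, and caps it at the
    number of dwellings actually listed (also an assumption).
    """
    target = max(minimum, math.ceil(n_listed * percent / 100))
    return min(n_listed, target)


for n in (250, 60, 7):
    print(n, du_sample_size(n))
```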
Face-to-face [f2f]
Consultation on Questionnaire Design
Ten stakeholder workshops were held across the country during August and September 2004. Approximately 367 stakeholders, predominantly from national, provincial and local government departments, as well as from research and educational institutions, attended. The workshops aimed to achieve two objectives, namely to better understand the type of information stakeholders need to meet their objectives, and to consider the proposed data items to be included in future household surveys. The output from this process was a set of data items relating to a specific, defined focus area and outcomes that culminated with the data collection instrument (see Annexure B for all the data items).
Questionnaire Design
The design of the CS questionnaire was household-based and intended to collect information on 10 people. It was developed in line with the household-based survey questionnaires conducted by Stats SA. The questions were based on the data items generated out of the consultation process described above. Both the design and questionnaire layout were pre-tested in October 2005 and adjustments were made for the pilot in February 2006. Further adjustments were done after the pilot results had been finalised.
Editing
The automated cleaning was implemented based on an editing rules specification defined with reference to the approved questionnaire. Most of the editing rules were categorised into structural edits, which look into the relationship between different record types; minimum processability rules, which remove false positive readings or noise; logical edits, which determine the inconsistency between fields of the same statistical unit; and inferential edits, which search for similarities across the domain. The edit specifications document for the structural, population, mortality and housing edits was developed by a team of Stats SA subject-matter specialists, demographers, and programmers. The process was successfully
Comprehensive YouTube channel statistics for Cement Craft Ideas - DIY Projects, featuring 758,000 subscribers and 209,665,371 total views. This dataset includes detailed performance metrics such as subscriber growth, video views, engagement rates, and estimated revenue. The channel operates in the Lifestyle category and is based in US. Track 253 videos with daily and monthly performance data, including view counts, subscriber changes, and earnings estimates. Analyze growth trends, engagement patterns, and compare performance against similar channels in the same category.
This project provides a freely accessible three-dimensional statistical shape model (SSM) of the tibia, the MATLAB scripts for generating a SSM and the segmented surface models of the cortical and trabecular bone. Information on the use of code and data can be found in the read-me file contained within the download.
Further, this dataset and associated statistical shape models can be used in several ways to assist with skeletal focused research of the tibia-fibula. We do not have the scope to highlight each and every potential application, but we have provided a series of example cases of where and how the shape models may be used. Our hope is that these examples can be used directly, or can assist in guiding other uses.
Case 1: Generating Surface Samples — this example case demonstrates how to use the shape model data to reconstruct a randomly sampled 'population' of surfaces.
Case 2: Predicting and Generating Trabecular Volumes — this example case demonstrates how to combine the tibia and trabecular shape models to predict and generate the trabecular volume from a tibial surface.
Case 3: Generating Tibia-Fibula Surfaces from Landmarks — this example case demonstrates how to use the tibia-fibula shape model to estimate and reconstruct surfaces from palpable landmarks on the tibia and fibula.
Please cite our work if you use this code or data.
https://widgets.figshare.com/articles/20454462/embed?show_title=1
This project includes the following software/data packages:
The Viet Nam Multiple Indicator Cluster Survey (MICS) was carried out by the General Statistics Office of Viet Nam (GSO) in collaboration with the Viet Nam Committee for Population, Family and Children (VCPFC). Financial and technical support was provided by the United Nations Children's Fund (UNICEF).
In the World Summit for children held in New York in 1990, the Government of Vietnam committed itself to the implementation of the World Declaration and Plan of Action for children.
In implementing Directive 34/1999/CT-TTg of 27 December 1999 on promoting the implementation of the end-decade goals for children, reviewing the National Plan of Action for Children, 1991-2000, and designing the National Plan of Action for Children, 2001-2010, and within the framework of the “Development of Social Indicators” project, the General Statistical Office (GSO) chaired and coordinated with the Viet Nam Committee for the Protection and Care for Children (CPCC) to conduct the survey evaluating the end-decade goals for children, 1991-2000 (MICS). MICS covered a sample of 7628 households in 240 communes and wards, representing the whole country, the urban area, the rural area and the 8 geographical areas in 61 towns/provinces. Field activities to collect data lasted 2 months, May-June 2000. The survey was technically supported by statisticians from EAPRO, UNICEF regional offices and UNICEF Hanoi on sample and questionnaire design, data input software and, not least, the software for analysing the data and calculating the estimates that generalise the survey results.
Survey Objectives: The end-decade survey on children is aimed at:
· Providing up-to-date and reliable data to analyse the situation of children and women in 2000.
· Providing data to assess the implementation of the World Summit goals for children and of the National Plan of Action for Vietnamese Children, 1991-2000.
· Serving as a basis (with baseline data and information) for development of the National Plan of Action for Children, 2001-2010.
· Building professional capacity in monitoring, managing and evaluating all the goals of child protection, care and education at all levels.
The 2000 MICS of Vietnam was a nationally representative sample survey.
Households, Women, Child.
Sample survey data [ssd]
The sample for the Viet Nam Multiple Indicator Cluster Survey (MICSII) was designed to provide reliable estimates on a large number of indicators on the situation of children and women at the national level, for urban and rural areas, and for 8 regions: Red River Delta, North West, North East, North Central Coast, South Central Coast, Central Highlands, South East, and Mekong River Delta. Regions were identified as the main sampling domains and the sample was selected in two stages. At the first stage, 240 EAs were selected. After a household listing was carried out within the selected enumeration areas, a systematic sample of 1/3 of households in each EA was drawn. The survey managed to visit all 240 selected EAs during the fieldwork period. The sample was stratified by region and is not self-weighting. For reporting national level results, sample weights are used.
No major deviations from the original sample design were made. All sample enumeration areas were accessed and successfully interviewed with good response rates.
Face-to-face [f2f]
The questionnaires for MICS in Vietnam are based on the New York UNICEF module questionnaires with some modifications and additions to fit in with Vietnam's context and to evaluate the goals set out in the National Plan of Action. The questionnaires have been arranged in such a way as to prevent the loss of questionnaire sheets and to facilitate the logic control between the items in the modules. Questionnaires include 3 sections. Section 1: general questions to be administered to families and family members. Section 2: questions for child bearing-age women (aged 15-49). Section 3: for children under 5.
Section 1: Household questionnaire
Part A: Household information panel
Part B: Household listing form
Part C: Education
Part D: Child labour
Part E: Maternal mortality
Part F: Water and sanitation
Part G: Salt iodization
Section 2: Questionnaire for child bearing-age women
Part A: Child mortality
Part B: Tetanus toxoid (TT)
Part C: Maternal and newborn health
Part D: Contraceptive use
Part E: HIV/AIDS
Section 3: Questionnaire for children under five
Part A: Birth registration and early learning
Part B: Vitamin A
Part C: Breastfeeding
Part D: Care of illness
Part E: Malaria
Part F: Immunization
Part G: Anthropometry
Apart from the questionnaires to collect information at family level, questionnaires are also designed to gather information at community level supplementary to some indicators that can not have data collected at family level. The information garnered includes local population, socio-economic and physical conditions, education, health and progress of projects/plans of actions for children.
To minimize the errors made by data entry staff members, all the records were double-entered by two different members. Any discrepancy detected between the two entries was re-checked to find out which one was wrong. Data cleaning started in early September. This process was closely observed to ensure the accuracy, quality and practicality of all the data collected.
To minimize the errors due to wrong statements of respondents or wrong registration by interviewers, a cleaning programme was used to check the consistency and logic in the items of questionnaires and between the questionnaires. The cleaning programme printed out all the errors, then questionnaires were checked by qualified officials.
8356 households were selected for the sample. Of these all were found to be occupied households and 8355 were successfully interviewed for a response rate of 100%. Within these households, 10063 eligible women aged 15-49 were identified for interview, of which 9473 were successfully interviewed (response rate 94.1%), and 2707 children aged 0-4 were identified for whom the mother or caretaker was successfully interviewed for 2680 children (response rate 99%).
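The response rates quoted above follow directly from the counts. The short sketch below recomputes them, rounding to one decimal place, which is a presentation choice made here.

```python
def response_rate(interviewed, eligible):
    """Response rate as a percentage, rounded to one decimal place."""
    return round(100 * interviewed / eligible, 1)


# Counts taken from the paragraph above.
print(response_rate(8355, 8356))    # households: 100.0
print(response_rate(9473, 10063))   # women aged 15-49: 94.1
print(response_rate(2680, 2707))    # children aged 0-4: 99.0
```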
Estimates from a sample survey are affected by two types of errors: 1) non-sampling errors and 2) sampling errors. Non-sampling errors are the results of mistakes made in the implementation of data collection and data processing. Numerous efforts were made during implementation of the MICS - 3 to minimize this type of error; however, non-sampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors can be evaluated statistically. The sample of respondents to the MICS - 3 is only one of many possible samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability in the results of the survey between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results. The sampling errors are measured in terms of the standard error for a particular statistic (mean or percentage), which is the square root of the variance. Confidence intervals are calculated for each statistic within which the true value for the population can be assumed to fall. Plus or minus two standard errors of the statistic is used for key statistics presented in MICS, equivalent to a 95 percent confidence interval.
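As a small worked illustration of the "plus or minus two standard errors" rule, the sketch below computes an approximate 95 percent interval for a proportion. The function name and the clipping of the interval to the range 0 to 1 are assumptions made for this example, not part of the survey documentation.

```python
from typing import Tuple


def two_se_interval(estimate: float, standard_error: float) -> Tuple[float, float]:
    """Approximate 95% confidence interval: estimate +/- 2 standard errors."""
    lower = estimate - 2.0 * standard_error
    upper = estimate + 2.0 * standard_error
    # Assumption for this sketch: the statistic is a proportion, so clip to [0, 1].
    return max(0.0, lower), min(1.0, upper)
```

For a hypothetical estimate of 0.50 with a standard error of 0.01, the interval runs from about 0.48 to 0.52.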
If the sample of respondents had been a simple random sample, it would have been possible to use straightforward formulae for calculating sampling errors. However, the MICS - 3 sample is the result of a two-stage stratified design, and consequently needs to use more complex formulae. The SPSS complex samples module has been used to calculate sampling errors for the MICS - 3. This module uses the Taylor linearization method of variance estimation for survey estimates that are means or proportions. This method is documented in the SPSS file CSDescriptives.pdf found under the Help, Algorithms options in SPSS.
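To give a sense of what Taylor linearization does, the sketch below estimates the standard error of a ratio r = y/x (a proportion is the special case where x counts cases and y counts cases with the attribute) from cluster totals. It is a simplified illustration under stated assumptions: a single stratum, no finite population correction, and cluster totals that are already weighted. It is not the SPSS Complex Samples algorithm, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def linearized_ratio_se(clusters: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (ratio, standard error) from per-cluster (y_total, x_total) pairs.

    Single-stratum Taylor linearization: z_i = y_i - r * x_i, and
    var(r) = m / (m - 1) * (sum(z_i^2) - (sum z_i)^2 / m) / x^2.
    """
    m = len(clusters)
    if m < 2:
        raise ValueError("at least two clusters are needed")
    y_total = sum(y for y, _ in clusters)
    x_total = sum(x for _, x in clusters)
    r = y_total / x_total
    # Linearized residual for each cluster.
    z = [y - r * x for y, x in clusters]
    z_sum = sum(z)
    spread = sum(zi * zi for zi in z) - z_sum * z_sum / m
    variance = (m / (m - 1)) * max(0.0, spread) / (x_total * x_total)
    return r, math.sqrt(variance)
```

A stratified design would apply the same between-cluster sum within each stratum and add the results before dividing by the squared denominator.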
Sampling errors have been calculated for a select set of statistics (all of which are proportions due to the limitations of the Taylor linearization method) for the national sample, urban and rural areas, and for each of the five regions. For each statistic, the following are given: the estimate, its standard error, the coefficient of variation (or relative error, the ratio between the standard error and the estimate), the design effect, the square root of the design effect (DEFT, the ratio between the standard error using the given sample design and the standard error that would result if a simple random sample had been used), and the 95 percent confidence interval (+/-2 standard errors).
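The quantities listed above are related by simple formulas, sketched here for a proportion. The simple-random-sample standard error is taken as sqrt(p(1 - p)/n), which is an assumption of this example (some software divides by n - 1), and the function and key names are hypothetical.

```python
import math
from typing import Dict


def precision_summary(p: float, se_design: float, n: int) -> Dict[str, float]:
    """Summarise the precision of a proportion p estimated from n cases."""
    # Standard error that a simple random sample of the same size would give.
    se_srs = math.sqrt(p * (1.0 - p) / n)
    deft = se_design / se_srs
    return {
        "cv": se_design / p,             # coefficient of variation (relative error)
        "deft": deft,                    # square root of the design effect
        "deff": deft * deft,             # design effect
        "ci_lower": p - 2.0 * se_design,
        "ci_upper": p + 2.0 * se_design,
    }
```

With hypothetical inputs p = 0.5, n = 2500 and a design-based standard error of 0.015, the simple-random-sample standard error is 0.01, so DEFT is 1.5 and the design effect is 2.25.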
A series of data quality tables and graphs are available to review the quality of the data and include the following:
Age distribution of the household population - Age distribution of eligible women and interviewed women - Age distribution of eligible children and children for whom the mother or caretaker was interviewed - Age distribution of children under age 5 by 3 month groups - Age and period ratios at boundaries of eligibility - Percent of observations with missing information on selected variables - Presence of mother in
The programme for the World Census of Agriculture 2000 is the eighth in the series for promoting a global approach to agricultural census taking. The first and second programmes were sponsored by the International Institute of Agriculture (IIA) in 1930 and 1940. Subsequent ones up to 1990 were promoted by the Food and Agriculture Organization of the United Nations (FAO). FAO recommends that each country conduct at least one agricultural census in each census programme decade; its programme for the World Census of Agriculture 2000, for instance, corresponds to agricultural censuses to be undertaken during the decade 1996 to 2005. Many countries do not have sufficient resources for conducting an agricultural census. Since 1960 it has therefore been accepted practice for countries lacking the resources required for a complete enumeration to conduct the agricultural census on a sample basis.
In Nigeria's case, a combination of complete enumeration and sample enumeration is adopted, whereby the rural (peasant) holdings are covered on a sample basis while the modern holdings are covered by complete enumeration. The project name "National Agricultural Sample Census" derives from this practice. Through the National Agricultural Sample Census (NASC), Nigeria participated in the 1970s, 1980s and 1990s programmes of the World Census of Agriculture. Nigeria failed to conduct the Agricultural Census in 2003/2004 because of lack of funding. The NBS regular annual agriculture surveys have been erratic since 1996, and many years of backlogged data sets are still unprocessed. The baseline agricultural data is yet to be updated, while the regular annual surveys have suffered setbacks. There is an urgent need for the governments (Federal, State, LGA), sector agencies, FAO and other international organizations to come together to undertake the agricultural census exercise, which is long overdue. The conduct of the 2006/2008 National Agricultural Sample Census Survey is now on course, with the pilot exercise carried out in the third quarter of 2007.
The National Agricultural Sample Census (NASC) 2006/08 is imperative for strengthening the weak agricultural data in Nigeria. The project is phased into three sub-projects for ease of implementation: the Pilot Survey, the Modern Agricultural Holding survey and the Main Census. It commenced in the third quarter of 2006 and was to end in the first quarter of 2008. The pilot survey was implemented collaboratively by the National Bureau of Statistics.
The main objective of the pilot survey was to test the adequacy of the survey instruments, equipment and administration of questionnaires, the data processing arrangements and report writing. The pilot survey conducted in July 2007 covered the two NBS survey systems: the National Integrated Survey of Households (NISH) and the National Integrated Survey of Establishments (NISE). The survey instruments were designed to be applied using the two survey systems, while the use of the Global Positioning System (GPS) was introduced as an additional new tool for implementing the project.
The stakeholders' workshop held at Kaduna on 21st-23rd May 2007 was one of the initial benchmarks for the take-off of the pilot survey. The pilot survey implementation started with the first level training (training of trainers) at the NBS headquarters on 13th-15th June 2007. The second level training for all levels of field personnel was held at the headquarters of the twelve (12) states concerned on 2nd-6th July 2007. The field work of the pilot survey commenced on 9th July and ended on 13th July 2007. IMPS and SPSS were the statistical packages used to develop the data entry programme.
State
Households of fish farmers
The survey covered all de jure household members (usual residents) who were engaged in fish production.
Census/enumeration data [cen]
The survey was carried out in 12 states across the 6 geo-political zones, with 2 states covered in each zone. In each selected state, 2 local government areas were studied. In each local government area, 2 rural enumeration areas were covered, and 3 fish-farming housing units were systematically selected and canvassed.
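The planned sample sizes implied by this design can be worked out by multiplying through the stages. The sketch below assumes that the 3 housing units were selected per enumeration area, which the description implies but does not state outright; the function name is hypothetical, and the figures are planned counts, not the numbers actually achieved in the field.

```python
from typing import Dict


def planned_sample_sizes(
    zones: int = 6,
    states_per_zone: int = 2,
    lgas_per_state: int = 2,
    eas_per_lga: int = 2,
    units_per_ea: int = 3,  # assumption: 3 housing units per enumeration area
) -> Dict[str, int]:
    """Multiply the sampling stages to get the planned count at each level."""
    states = zones * states_per_zone
    lgas = states * lgas_per_state
    eas = lgas * eas_per_lga
    units = eas * units_per_ea
    return {"states": states, "lgas": lgas, "eas": eas, "housing_units": units}
```

Under these assumptions the design gives 12 states, 24 local government areas, 48 enumeration areas and 144 housing units.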
There were deviations from the original sample design.
Face-to-face [f2f]
The NASC fishery questionnaire was divided into the following sections: - Holding identification: This is to identify the holder through HU serial number, HH serial number, and demographic characteristics. - Type of fishing sites used by holder. - Sources and quantities of fishing inputs. - Quantity of aquatic production by type. - Quantity sold and value of sale of aquatic products. - Funds committed to fishing by source and others
The data processing and analysis plan involved five main stages: training of data processing staff; manual editing and coding; development of the data entry programme; data entry and editing; and tabulation. Census and Survey Processing System (CSPro) software was used for data entry, Statistical Package for Social Sciences (SPSS) and CSPro for editing, and a combination of SPSS, Statistical Analysis System (SAS) and Excel for table generation. Subject-matter specialists and computer personnel from the NBS and CBN carried out the data processing work. These officers also developed tabulation plans for their areas and for the topics covered in the three-survey system used for the exercise. Data editing was done in two phases. The first was manual editing before data entry, in which editors at the various zones checked the questionnaires by hand to ensure that the information was consistent. The second was computer editing, that is, cleaning of the data already entered. The completed questionnaires were collated and edited manually: (a) office editing and coding were done by the editor using visual control of the questionnaire before data entry; (b) CSPro was used to design the data entry template, which is provided as an external resource; (c) ten operators, two supervisors and two programmers were used; (d) ten machines were used for data entry; (e) after data entry, the data entry supervisor ran frequencies on each section to confirm that all the questionnaires had been entered.
Both Enumeration Area (EA) and Fish holders' level Response Rate was 100 per cent.
No computation of sampling error
The Quality Control measures were carried out during the survey, essentially to ensure quality of data
Since the beginning of the 1960s, Statistics Sweden, in collaboration with various research institutions, has carried out follow-up surveys in the school system. These surveys have taken place within the framework of the IS project (Individual Statistics Project) at the University of Gothenburg and the UGU project (Evaluation through follow-up of students) at the University of Teacher Education in Stockholm, which since 1990 have been merged into a research project called 'Evaluation through Follow-up'. The follow-up surveys are part of the central evaluation of the school and are based on large nationally representative samples from different cohorts of students.
Evaluation through follow-up (UGU) is one of the country's largest research databases in the field of education. UGU is part of the central evaluation of the school and is based on large nationally representative samples from different cohorts of students. The longitudinal database contains information on nationally representative samples of school pupils from ten cohorts, born between 1948 and 2004. The sampling process was based on the student's birthday for the first two and on the school class for the other cohorts.
For each cohort, data of mainly two types are collected. School administrative data is collected annually by Statistics Sweden during the time that pupils are in the general school system (primary and secondary school), for most cohorts starting in compulsory school year 3. This information is provided by the school offices and, among other things, includes characteristics of school, class, special support, study choices and grades. Information obtained has varied somewhat, e.g. due to changes in curricula. A more detailed description of this data collection can be found in reports published by Statistics Sweden and linked to datasets for each cohort.
Survey data from the pupils are collected for the first time in compulsory school year 6 (for most cohorts). The year 6 questionnaire includes questions on self-perception and interest in learning, attitudes to school, hobbies, school motivation and future plans. For some cohorts, questionnaire data are also collected in year 3 and year 9 of compulsory school and in upper secondary school.
Furthermore, results from various intelligence tests and standardized knowledge tests are included in the year 6 data collection. The intelligence tests have been identical for all cohorts (except the cohort born in 1987, from which questionnaire data were first collected in year 9). The intelligence test consists of a verbal, a spatial and an inductive test, each containing 40 tasks and specially designed for the UGU project. The verbal test is a vocabulary test of the opposites type. The spatial test is a so-called 'sheet metal folding test', and the inductive test is made up of number series. The reliability of the test, intercorrelations and connection with school grades are reported by Svensson (1971).
For the first three cohorts (1948, 1953 and 1967), the standardized knowledge tests in year 6 consist of the standard tests in Swedish, mathematics and English that were offered to all pupils in compulsory school year 6 up to and including the beginning of the 1980s. For the 1972 cohort, specially prepared tests in reading and mathematics were used. The reading test consists of 27 tasks and aimed to identify students with reading difficulties. The mathematics test, which was also offered to the fifth cohort (1977), includes 19 assignments. A revised version of the test was used for the cohort born in 1982, because the previously used test was judged to be somewhat too simple. Results on the mathematics test are not available for the 1987 cohort. The mathematics test was not offered to the students in the 1992 cohort, as the test did not seem to fully correspond with current curriculum intentions in mathematics. For further information, see the description of the dataset for each cohort.
For several of the samples, questionnaires were also collected from the students' parents and teachers in year 6. The teacher questionnaire contains questions about the teacher, class size and composition, the teacher's assessments of the class's knowledge level, etc., school resources, working methods and parental involvement, and questions about the existence of evaluations. The questionnaire for the guardians includes questions about the child's upbringing conditions, ambitions and wishes regarding the child's education, views on the school's objectives and the parents' own educational and professional situation.
The students are followed up even after they have left primary school. Among other things, data are collected while they are in upper secondary school, in the form of school administrative data such as choice of upper secondary school line/programme and grades after completing studies. For some of the cohorts, questionnaire data were also collected from the students in addition to the school administrative data.
The sample consisted of students born on the 5th, 15th and 25th of any month in 1953, a total of 10,723 students.
The data obtained in 1966 were: 1. School administrative data (school form, class type, year and grades). 2. Information about the parents' profession and education, number of siblings, the distance between home and school, etc.
This information was collected for 93% of all students born on these days. The shortfall is due to reduced resources at Statistics Sweden for follow-up work (reminders etc.). Annual data for the 1953 cohort were collected by Statistics Sweden up to and including academic year 1972/73.
The response rate for test and questionnaire data is 88%. Standard test results were received for just over 85% of those who took the tests.
The sample included a total of 9955 students, for whom some form of information was obtained.
Part of the "Individual Statistics Project" together with cohort 1953.
In 1992, Bosnia-Herzegovina, one of the six republics in former Yugoslavia, became an independent nation. A civil war started soon thereafter, lasting until 1995 and causing widespread destruction and loss of life. Following the Dayton accord, Bosnia-Herzegovina (BiH) emerged as an independent state comprised of two entities, namely, the Federation of Bosnia-Herzegovina (FBiH) and the Republika Srpska (RS), and the district of Brcko. In addition to the destruction caused to the physical infrastructure, there was considerable social disruption and decline in living standards for a large section of the population. Alongside these events, a period of economic transition to a market economy was occurring. The distributive impacts of this transition, both positive and negative, are unknown. In short, while it is clear that welfare levels have changed, there is very little information on poverty and social indicators on which to base policies and programs. In the post-war process of rebuilding the economic and social base of the country, the government has faced the problems created by having little relevant data at the household level. The three statistical organizations in the country (State Agency for Statistics for BiH - BHAS, the RS Institute of Statistics - RSIS, and the FBiH Institute of Statistics - FIS) have been active in working to improve the data available to policy makers, both at the macro and the household level. One facet of their activities is to design and implement a series of household surveys. The first of these surveys is the Living Standards Measurement Study survey (LSMS). Later surveys will include the Household Budget Survey (an Income and Expenditure Survey) and a Labour Force Survey. A subset of the LSMS households will be re-interviewed in the two years following the LSMS to create a panel data set.
The three statistical organizations began work on the design of the Living Standards Measurement Study Survey (LSMS) in 1999. The purpose of the survey was to collect data needed for assessing the living standards of the population and for providing the key indicators needed for social and economic policy formulation. The survey was to provide data at the country and the entity level and to allow valid comparisons between entities to be made. The LSMS survey was carried out in the fall of 2001 by the three statistical organizations with financial and technical support from the Department for International Development of the British Government (DfID), United Nations Development Program (UNDP), the Japanese Government, and the World Bank (WB). The creation of a Master Sample for the survey was supported by the Swedish Government through SIDA, the European Commission, the Department for International Development of the British Government and the World Bank. The overall management of the project was carried out by the Steering Board, comprised of the Directors of the RS and FBiH Statistical Institutes, the Management Board of the State Agency for Statistics and representatives from DfID, UNDP and the WB. The day-to-day project activities were carried out by the Survey Management Team, made up of two professionals from each of the three statistical organizations.
The Living Standards Measurement Survey (LSMS), in addition to collecting the information necessary to obtain as comprehensive a measure as possible of the basic dimensions of household living standards, has three basic objectives, as follows: 1. To provide the public sector, government, the business community, scientific institutions, international donor organizations and social organizations with information on different indicators of the population's living conditions, as well as on available resources for satisfying basic needs. 2. To provide information for the evaluation of the results of different forms of government policy and programs developed with the aim of improving the population's living standard. The survey will enable the analysis of the relations between and among different aspects of living standards (housing, consumption, education, health, labour) at a given time, as well as within a household. 3. To provide key contributions for development of the government's Poverty Reduction Strategy Paper, based on analysed data.
National coverage
Households
Sample survey data [ssd]
(a) SAMPLE SIZE A total sample of 5,400 households was determined to be adequate for the needs of the survey: with 2,400 in the Republika Srpska and 3,000 in the Federation of BiH. The difficulty was in selecting a probability sample that would be representative of the country's population. The sample design for any survey depends upon the availability of information on the universe of households and individuals in the country. Usually this comes from a census or administrative records. In the case of BiH the most recent census was done in 1991. The data from this census were rendered obsolete due to both the simple passage of time but, more importantly, due to the massive population displacements that occurred during the war. At the initial stages of this project it was decided that a master sample should be constructed. Experts from Statistics Sweden developed the plan for the master sample and provided the procedures for its construction. From this master sample, the households for the LSMS were selected. Master Sample [This section is based on Peter Lynn's note "LSMS Sample Design and Weighting - Summary". April, 2002. Essex University, commissioned by DfID.] The master sample is based on a selection of municipalities and a full enumeration of the selected municipalities. Optimally, one would prefer smaller units (geographic or administrative) than municipalities. However, while it was considered that the population estimates of municipalities were reasonably accurate, this was not the case for smaller geographic or administrative areas. To avoid the error involved in sampling smaller areas with very uncertain population estimates, municipalities were used as the base unit for the master sample. The Statistics Sweden team proposed two options based on this same method, with the only difference being in the number of municipalities included and enumerated.
(b) SAMPLE DESIGN For reasons of funding, the smaller option proposed by the team was used, or Option B. Stratification of Municipalities The first step in creating the Master Sample was to group the 146 municipalities in the country into three strata- Urban, Rural and Mixed - within each of the two entities. Urban municipalities are those where 65 percent or more of the households are considered to be urban, and rural municipalities are those where the proportion of urban households is below 35 percent. The remaining municipalities were classified as Mixed (Urban and Rural) Municipalities. Brcko was excluded from the sampling frame. Urban, Rural and Mixed Municipalities: It is worth noting that the urban-rural definitions used in BiH are unusual with such large administrative units as municipalities classified as if they were completely homogeneous. Their classification into urban, rural, mixed comes from the 1991 Census which used the predominant type of income of households in the municipality to define the municipality. This definition is imperfect in two ways. First, the distribution of income sources may have changed dramatically from the pre-war times: populations have shifted, large industries have closed, and much agricultural land remains unusable due to the presence of land mines. Second, the definition is not comparable to other countries' where villages, towns and cities are classified by population size into rural or urban or by types of services and infrastructure available. Clearly, the types of communities within a municipality vary substantially in terms of both population and infrastructure. However, these imperfections are not detrimental to the sample design (the urban/rural definition may not be very useful for analysis purposes, but that is a separate issue).
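The three-way stratification rule can be written down compactly. The sketch below applies the 65 percent and 35 percent thresholds quoted above to a municipality's share of urban households; the function name and the use of a share between 0 and 1 as input are assumptions for illustration, since in practice the classification was taken from the 1991 Census.

```python
def classify_municipality(urban_share: float) -> str:
    """Assign a municipality to a stratum from its share of urban households."""
    if not 0.0 <= urban_share <= 1.0:
        raise ValueError("urban_share must be between 0 and 1")
    if urban_share >= 0.65:
        return "Urban"   # 65 percent or more of households are urban
    if urban_share < 0.35:
        return "Rural"   # fewer than 35 percent of households are urban
    return "Mixed"       # everything in between
```

The rule is applied separately within each of the two entities, giving six strata in total.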
Face-to-face [f2f]
(a) DATA ENTRY
An integrated approach to data entry and fieldwork was adopted in Bosnia and Herzegovina. Data entry proceeded side by side with data gathering to ensure verification and correction in the field. Data entry stations were located in the regional offices of the entity institutes and were equipped with computers, modem and a dedicated telephone line. The completed questionnaires were delivered to these stations each day for data entry. Twenty data entry operators (10 from Federation and 10 from RS) were trained in two training sessions held for a week each in Sarajevo and Banja Luka. The trainers were the staff of the two entity institutes who had undergone training in the CSPro software earlier and had participated in the workshops of the Pilot survey. Prior to the training, laptop computers were provided to the entity institutes, and the CSPro software was installed in them. The training for the data entry operators covered the following elements:
The 1997 Jordan Population and Family Health Survey (JPFHS) is a national sample survey carried out by the Department of Statistics (DOS) as part of its National Household Surveys Program (NHSP). The JPFHS was specifically aimed at providing information on fertility, family planning, and infant and child mortality. Information was also gathered on breastfeeding, on maternal and child health care and nutritional status, and on the characteristics of households and household members. The survey will provide policymakers and planners with important information for use in formulating informed programs and policies on reproductive behavior and health.
National
Sample survey data
SAMPLE DESIGN AND IMPLEMENTATION
The 1997 JPFHS sample was designed to produce reliable estimates of major survey variables for the country as a whole, for urban and rural areas, for the three regions (each composed of a group of governorates), and for the three major governorates, Amman, Irbid, and Zarqa.
The 1997 JPFHS sample is a subsample of the master sample that was designed using the frame obtained from the 1994 Population and Housing Census. A two-stage sampling procedure was employed. First, primary sampling units (PSUs) were selected with probability proportional to the number of housing units in the PSU. A total of 300 PSUs were selected at this stage. In the second stage, in each selected PSU, occupied housing units were selected with probability inversely proportional to the number of housing units in the PSU. This design maintains a self-weighted sampling fraction within each governorate.
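Why this two-stage scheme is self-weighting can be seen by multiplying the two selection probabilities: the PSU size cancels out. The sketch below illustrates this with exact fractions. The parameter names and the numbers in the usage note are hypothetical, and the sketch ignores the practical details (frame updating, certainty selections, differing sampling fractions across governorates) covered in the survey report.

```python
from fractions import Fraction


def overall_selection_probability(
    psu_size: int, total_size: int, n_psus: int, take_per_psu: int
) -> Fraction:
    """Probability that a housing unit is selected under two-stage PPS sampling.

    Stage 1: the PSU is selected with probability proportional to its size.
    Stage 2: a fixed number of units is taken, so the within-PSU probability
    is inversely proportional to the PSU size.
    """
    first_stage = Fraction(n_psus * psu_size, total_size)
    second_stage = Fraction(take_per_psu, psu_size)
    return first_stage * second_stage  # psu_size cancels
```

For example, with a hypothetical frame of 10,000 housing units, 20 PSUs selected and 25 units taken per PSU, a unit in a PSU of 50 and a unit in a PSU of 200 both have an overall selection probability of 1/20.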
UPDATING OF SAMPLING FRAME
Prior to the main fieldwork, mapping operations were carried out and the sample units/blocks were selected and then identified and located in the field. The selected blocks were delineated and the outer boundaries were demarcated with special signs. During this process, the numbers on buildings and housing units were updated, listed and documented, along with the name of the owner/tenant of the unit or household and the name of the household head. These activities took place between January 7 and February 28, 1997.
Note: See detailed description of sample design in APPENDIX A of the survey report.
Face-to-face
The 1997 JPFHS used two questionnaires, one for the household interview and the other for eligible women. Both questionnaires were developed in English and then translated into Arabic. The household questionnaire was used to list all members of the sampled households, including usual residents as well as visitors. For each member of the household, basic demographic and social characteristics were recorded and women eligible for the individual interview were identified. The individual questionnaire was developed utilizing the experience gained from previous surveys, in particular the 1983 and 1990 Jordan Fertility and Family Health Surveys (JFFHS).
The 1997 JPFHS individual questionnaire consists of 10 sections: - Respondent’s background - Marriage - Reproduction (birth history) - Contraception - Pregnancy, breastfeeding, health and immunization - Fertility preferences - Husband’s background, woman’s work and residence - Knowledge of AIDS - Maternal mortality - Height and weight of children and mothers.
Fieldwork and data processing activities overlapped. After a week of data collection, and after field editing of questionnaires for completeness and consistency, the questionnaires for each cluster were packaged together and sent to the central office in Amman where they were registered and stored. Special teams were formed to carry out office editing and coding.
Data entry started after a week of office data processing. The process of data entry, editing, and cleaning was done by means of the ISSA (Integrated System for Survey Analysis) program DHS has developed especially for such surveys. The ISSA program allows data to be edited while being entered. Data entry was completed on November 14, 1997. A data processing specialist from Macro made a trip to Jordan in November and December 1997 to identify problems in data entry, editing, and cleaning, and to work on tabulations for both the preliminary and final report.
A total of 7,924 occupied housing units were selected for the survey; from among those, 7,592 households were found. Of the occupied households, 7,335 (97 percent) were successfully interviewed. In those households, 5,765 eligible women were identified, and complete interviews were obtained with 5,548 of them (96 percent of all eligible women). Thus, the overall response rate of the 1997 JPFHS was 93 percent. The principal reason for nonresponse among the women was the failure of interviewers to find them at home despite repeated callbacks.
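The response rates quoted above can be reproduced from the counts in the text. The sketch below assumes that the overall rate is the product of the household and individual rates, which is consistent with the figures given; the function name is hypothetical.

```python
from typing import Tuple


def response_rates(
    households_found: int,
    households_interviewed: int,
    women_eligible: int,
    women_interviewed: int,
) -> Tuple[float, float, float]:
    """Return (household rate, women's rate, overall rate) as fractions."""
    household_rate = households_interviewed / households_found
    women_rate = women_interviewed / women_eligible
    # Assumption: the overall rate is the product of the two stage rates.
    return household_rate, women_rate, household_rate * women_rate


hh, women, overall = response_rates(7592, 7335, 5765, 5548)
print(round(hh * 100), round(women * 100), round(overall * 100))  # prints: 97 96 93
```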
Note: See summarized response rates by place of residence in Table 1.1 of the survey report.
The estimates from a sample survey are subject to two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the result of mistakes made in implementing data collection and data processing (such as failure to locate and interview the correct household, misunderstanding questions either by the interviewer or the respondent, and data entry errors). Although during the implementation of the 1997 JPFHS numerous efforts were made to minimize this type of error, nonsampling errors are not only impossible to avoid but also difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The respondents selected in the 1997 JPFHS constitute only one of many samples that could have been selected from the same population, given the same design and expected size. Each of those samples would have yielded results differing somewhat from the results of the sample actually selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, since the 1997 JPFHS sample resulted from a multistage stratified design, more complex formulas had to be used. The computer software used to calculate sampling errors for the 1997 JPFHS was the ISSA Sampling Error Module, which uses the Taylor linearization method of variance estimation for survey estimates that are means or proportions. The Jackknife repeated replication method is used for variance estimation of more complex statistics, such as fertility and mortality rates.
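The idea behind Jackknife repeated replication can be sketched for a simple ratio statistic: the estimate is recomputed with each cluster left out in turn, and the spread of the resulting pseudo-values gives the variance. The code below is a simplified illustration assuming unstratified clusters and pre-weighted cluster totals. It is not the ISSA Sampling Error Module, and for fertility or mortality rates the statistic recomputed on each replicate would be far more involved than a ratio of totals.

```python
import math
from typing import Sequence, Tuple


def jackknife_ratio_se(clusters: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (ratio, standard error) using a delete-one-cluster jackknife.

    clusters holds per-cluster (y_total, x_total) pairs. With k clusters,
    r_i = k * r - (k - 1) * r_(i), where r_(i) omits cluster i, and
    var(r) = sum((r_i - r)^2) / (k * (k - 1)).
    """
    k = len(clusters)
    if k < 2:
        raise ValueError("at least two clusters are needed")
    y_total = sum(y for y, _ in clusters)
    x_total = sum(x for _, x in clusters)
    r = y_total / x_total
    pseudo_values = []
    for y, x in clusters:
        r_without = (y_total - y) / (x_total - x)  # estimate without this cluster
        pseudo_values.append(k * r - (k - 1) * r_without)
    variance = sum((ri - r) ** 2 for ri in pseudo_values) / (k * (k - 1))
    return r, math.sqrt(variance)
```

When every cluster has the same ratio, all replicates agree and the estimated standard error is zero; the more the clusters differ, the larger it becomes.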
Note: See detailed estimate of sampling error calculation in APPENDIX B of the survey report.
Data Quality Tables - Household age distribution - Age distribution of eligible and interviewed women - Completeness of reporting - Births by calendar years - Reporting of age at death in days - Reporting of age at death in months
Note: See detailed tables in APPENDIX C of the survey report.
Created a multi-tab Excel statistical project where I generated synthetic normally-distributed data, built random sample extraction logic, calculated descriptive and inferential statistics, analysed variable correlations and performed linear regression with visualisation.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Transcriptome statistics from samples obtained on LMG1411.
Diatom isolates were obtained from the Western Antarctic Peninsula surface waters.
https://www.icpsr.umich.edu/web/ICPSR/studies/3565/terms
This survey, the sixth in the Bureau of Justice Statistics' program on Law Enforcement and Administrative Statistics (LEMAS), presents information on law enforcement agencies in the United States: state police, county police, special police (state and local), municipal police, and sheriff's departments. Variables include size of the population served by the police or sheriff's department, levels of employment and spending, various functions of the department, average salary levels for uniformed officers, policies and programs, and other matters related to management and personnel.
This data set records the statistical data of GDP of Qinghai Province from 1978 to 2000, which is divided by project and year. The data are collected from the statistical yearbook of Qinghai Province issued by the Bureau of Statistics of Qinghai Province. The data set consists of three data tables: GDP of sub projects and sub industries 1978-1998.xls, GDP of sub projects and sub industries 1978-1999.xls, GDP of sub projects and sub industries 1978-2000.xls. The data tables share the same structure. For example, there are 13 fields in the data table from 1978 to 1998:
Field 1: Project
Field 2: 1978
Field 3: 1980
Field 4: 1985
Field 5: 1990
Field 6: 1991
Field 7: 1992
Field 8: 1993
Field 9: 1994
Field 10: 1995
Field 11: 1996
Field 12: 1997
Field 13: 1998
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistically underpowered studies can result in experimental failure even when all other experimental considerations have been addressed impeccably. In fMRI the combination of a large number of dependent variables, a relatively small number of observations (subjects), and a need to correct for multiple comparisons can decrease statistical power dramatically. This problem has been clearly addressed yet remains controversial—especially in regards to the expected effect sizes in fMRI, and especially for between-subjects effects such as group comparisons and brain-behavior correlations. We aimed to clarify the power problem by considering and contrasting two simulated scenarios of such possible brain-behavior correlations: weak diffuse effects and strong localized effects. Sampling from these scenarios shows that, particularly in the weak diffuse scenario, common sample sizes (n = 20–30) display extremely low statistical power, poorly represent the actual effects in the full sample, and show large variation on subsequent replications. Empirical data from the Human Connectome Project resembles the weak diffuse scenario much more than the localized strong scenario, which underscores the extent of the power problem for many studies. Possible solutions to the power problem include increasing the sample size, using less stringent thresholds, or focusing on a region-of-interest. However, these approaches are not always feasible and some have major drawbacks. The most prominent solutions that may help address the power problem include model-based (multivariate) prediction methods and meta-analyses with related synthesis-oriented approaches.
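The low power at common sample sizes can be illustrated with a much simpler setup than the simulations described above. The sketch below estimates power for a single brain-behavior correlation using a Fisher z-test, with no correction for multiple comparisons; the effect sizes 0.1 ("weak") and 0.5 ("strong") are assumptions chosen for illustration and are not values from the study.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_power(rho, n, n_sims=1000, seed=0):
    """Share of simulated samples of size n in which a true correlation
    rho is detected at the two-sided 5% level (Fisher z approximation)."""
    rng = random.Random(seed)
    z_crit = 1.959964  # two-sided 5% critical value of the standard normal
    detections = 0
    for _ in range(n_sims):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        # Build y so that corr(x, y) equals rho in the population.
        ys = [rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for x in xs]
        z = math.atanh(pearson(xs, ys)) * math.sqrt(n - 3)
        if abs(z) > z_crit:
            detections += 1
    return detections / n_sims

for rho in (0.1, 0.5):  # assumed "weak" and "strong" effect sizes
    print(f"rho={rho}, n=25: estimated power {simulate_power(rho, 25):.2f}")
```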
The Project for Statistics on Living standards and Development was a countrywide World Bank Living Standards Measurement Survey. It covered approximately 9000 households, drawn from a representative sample of South African households. The fieldwork was undertaken during the nine months leading up to the country's first democratic elections at the end of April 1994. The purpose of the survey was to collect statistical information about the conditions under which South Africans live in order to provide policymakers with the data necessary for planning strategies. This data would aid the implementation of goals such as those outlined in the Government of National Unity's Reconstruction and Development Programme.
National
Households
All Household members. Individuals in hospitals, old age homes, hotels and hostels of educational institutions were not included in the sample. Migrant labour hostels were included. In addition to those that turned up in the selected ESDs, a sample of three hostels was chosen from a national list provided by the Human Sciences Research Council and within each of these hostels a representative sample was drawn on a similar basis as described above for the households in ESDs.
Sample survey data [ssd]
(a) SAMPLING DESIGN
Sample size is 9,000 households. The sample design adopted for the study was a two-stage self-weighting design in which the first-stage units were Census Enumerator Subdistricts (ESDs, or their equivalent) and the second-stage units were households. The advantage of using such a design is that it provides a representative sample that need not be based on an accurate census population distribution. In the case of South Africa, the sample will automatically include many poor people, without the need to go beyond this and oversample the poor. Proportionate sampling, as in such a self-weighting sample design, offers the simplest possible data files for further analysis, as weights do not have to be added. However, in the end this advantage could not be retained, and weights had to be added.
(b) SAMPLE FRAME
The sampling frame was drawn up on the basis of small, clearly demarcated area units, each with a population estimate. However, the nature of the self-weighting procedure adopted ensured that this population estimate was not important for determining the final sample. For most of the country, census ESDs were used. Where some ESDs comprised relatively large populations, as for instance in some black townships such as Soweto, aerial photographs were used to divide the areas into blocks of approximately equal population size. In other instances, particularly in some of the former homelands, the area units were not ESDs but villages or village groups.
In the sample design chosen, the area stage units (generally ESDs) were selected with probability proportional to size, based on the census population. Systematic sampling was used throughout, that is, sampling at a fixed interval in a list of ESDs, starting at a randomly selected starting point. Given that sampling was self-weighting, the impact of stratification was expected to be modest. The main objective was to ensure that the racial and geographic breakdown approximated the national population distribution. This was done by listing the area stage units (ESDs) by statistical region and then, within the statistical region, by urban or rural. Within these sub-statistical regions, the ESDs were then listed in order of percentage African.
The sampling interval for the selection of the ESDs was obtained by dividing the 1991 census population of 38,120,853 by the 300 clusters to be selected. This yielded 105,800. Starting at a randomly selected point, every 105,800th person down the cluster list was selected. This ensured both geographic and racial diversity (ESDs were ordered by statistical sub-region and proportion of the population African). In three or four instances, the ESD chosen was judged inaccessible and replaced with a similar one.
In the second sampling stage the unit of analysis was the household. In each selected ESD a listing or enumeration of households was carried out by means of a field operation. From the households listed in an ESD a sample of households was selected by systematic sampling. Even though the ultimate enumeration unit was the household, in most cases "stands" were used as enumeration units. However, when a stand was chosen as the enumeration unit, all households on that stand had to be interviewed.
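The first-stage selection described above (systematic sampling with probability proportional to size) can be sketched as follows. This is a minimal illustration, not the survey team's actual procedure: the function name and the toy ESD populations are hypothetical. The last lines simply redo the interval arithmetic with the figures quoted in the text; note that dividing the census population by 300 gives roughly 127,070, whereas an interval of 105,800 corresponds to roughly 360 clusters, so the quoted figures do not reconcile exactly.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def pps_systematic(sizes, n_clusters, rng=random):
    """Pick n_clusters unit indices with probability proportional to size.

    Units are assumed to be listed in the desired (stratified) order.
    A random start is drawn in [0, interval) and every interval-th
    person down the cumulative population list marks a selected unit.
    """
    total = sum(sizes)
    interval = total / n_clusters
    start = rng.random() * interval
    cumulative = list(accumulate(sizes))
    picks = []
    for j in range(n_clusters):
        point = start + j * interval
        index = bisect_right(cumulative, point)  # unit whose range holds point
        picks.append(min(index, len(sizes) - 1))  # guard against rounding
    return picks

# Toy frame: ten hypothetical ESDs with equal populations.
toy_sizes = [100] * 10
print(pps_systematic(toy_sizes, 5, random.Random(0)))

# Interval arithmetic using the figures quoted in the text.
census_population = 38_120_853
interval_for_300 = census_population / 300
clusters_for_105800 = census_population / 105_800
print(round(interval_for_300), round(clusters_for_105800, 1))
```

A unit larger than the sampling interval can be hit more than once, which is one reason very large ESDs were split into blocks of similar size before selection.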
Face-to-face [f2f]
All the questionnaires were checked when received. Where information was incomplete or appeared contradictory, the questionnaire was sent back to the relevant survey organization. As soon as the data was available, it was captured using the local development platform ADE. This was completed in February 1994. Following this, a series of exploratory programs were written to highlight inconsistencies and outliers. For example, all person-level files were linked together to ensure that the same person code reported in different sections of the questionnaire corresponded to the same person. The error reports from these programs were compared to the questionnaires and the necessary alterations made. This was a lengthy process, as several files were checked more than once, and it was completed at the beginning of August 1994. In some cases, questionnaires would contain missing values, or comments that the respondent did not know, or refused to answer a question.
These responses are coded in the data files with the following values:
VALUE   MEANING
-1      The data was not available on the questionnaire or form
-2      The field is not applicable
-3      Respondent refused to answer
-4      Respondent did not know answer to question
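When analysing the files, these codes need to be treated as missing rather than as numeric values. The sketch below shows one way to do that in plain Python; the helper names are hypothetical, and it assumes the variable being analysed cannot legitimately take the values -1 to -4.

```python
# Non-response codes as listed in the documentation above.
MISSING_CODES = {
    -1: "data not available on the questionnaire or form",
    -2: "field not applicable",
    -3: "respondent refused to answer",
    -4: "respondent did not know the answer",
}

def clean(value):
    """Return None for a coded non-response, otherwise the value unchanged."""
    return None if value in MISSING_CODES else value

def mean_ignoring_missing(values):
    """Mean of the valid responses, or None if there are none."""
    valid = [v for v in values if clean(v) is not None]
    return sum(valid) / len(valid) if valid else None

# Hypothetical responses for illustration only.
responses = [10, -1, 20, -4]
print(mean_ignoring_missing(responses))
```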
The data collected in clusters 217 and 218 should be viewed as highly unreliable and have therefore been removed from the data set. The data currently available on the web site has been revised to remove the data from these clusters. Researchers who have downloaded the data in the past should revise their data sets. For information on the data in those clusters, contact SALDRU http://www.saldru.uct.ac.za/.
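For researchers working from an older download, a minimal sketch of the revision might look like the following. The record layout and the field name "cluster" are assumptions made for illustration; the actual variable name in the data files may differ.

```python
UNRELIABLE_CLUSTERS = {217, 218}

def drop_unreliable(records, cluster_key="cluster"):
    """Return the records that do not belong to clusters 217 or 218."""
    return [r for r in records if r[cluster_key] not in UNRELIABLE_CLUSTERS]

# Hypothetical household records for illustration only.
households = [
    {"cluster": 216, "hhid": 1},
    {"cluster": 217, "hhid": 2},
    {"cluster": 218, "hhid": 3},
    {"cluster": 219, "hhid": 4},
]
print(drop_unreliable(households))
```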