https://spdx.org/licenses/CC0-1.0.html
Professional organizations in STEM (science, technology, engineering, and mathematics) can use demographic data to quantify recruitment and retention (R&R) of underrepresented groups within their memberships. However, variation in the types of demographic data collected can influence the targeting and perceived impacts of R&R efforts - e.g., giving false signals of R&R for some groups. We obtained demographic surveys from 73 U.S.-affiliated STEM organizations, collectively representing 712,000 members and conference attendees. We found large differences in the demographic categories surveyed (e.g., disability status, sexual orientation) and the available response options. These discrepancies indicate a lack of consensus regarding the demographic groups that should be recognized and, for groups that are omitted from surveys, an inability of organizations to prioritize and evaluate R&R initiatives. Aligning inclusive demographic surveys across organizations will provide baseline data that can be used to target and evaluate R&R initiatives to better serve underrepresented groups throughout STEM.

Methods

We surveyed 164 STEM organizations (73 responses, rate = 44.5%) between December 2020 and July 2021 with the goal of understanding what demographic data each organization collects from its constituents (i.e., members and conference attendees) and how the data are used. Organizations were sourced from a list of professional societies affiliated with the American Association for the Advancement of Science (AAAS; n = 156) or from social media (n = 8). The survey was sent to the elected leadership and management firms of each organization, and follow-up reminders were sent after one month.
The responding organizations represented a wide range of fields: 31 life science organizations (157,000 constituents), 5 mathematics organizations (93,000 constituents), 16 physical science organizations (207,000 constituents), 7 technology organizations (124,000 constituents), and 14 multi-disciplinary organizations spanning multiple branches of STEM (131,000 constituents). A list of the responding organizations is available in the Supplementary Materials. Based on the AAAS-affiliated recruitment of the organizations and the similar distribution of constituencies across STEM fields, we conclude that the responding organizations are a representative cross-section of the most prominent STEM organizations in the U.S. Each organization was asked about the demographic information it collects from its constituents, the response rates to its surveys, and how the data were used.

Survey description

The following questions are written as presented to the participating organizations.

Question 1: What is the name of your STEM organization?
Question 2: Does your organization collect demographic data from your membership and/or meeting attendees?
Question 3: When was your organization's most recent demographic survey (approximate year)?
Question 4: We would like to know the categories of demographic information collected by your organization. You may answer this question either by uploading a blank copy of your organization's survey (link provided in the online version of this survey) OR by completing a short series of questions.
Question 5: On the most recent demographic survey or questionnaire, what categories of information were collected? (Please select all that apply)
- Disability status
- Gender identity (e.g., male, female, non-binary)
- Marital/Family status
- Racial and ethnic group
- Religion
- Sex
- Sexual orientation
- Veteran status
- Other (please provide)
Question 6: For each of the categories selected in Question 5, what options were provided for survey participants to select?
Question 7: Did the most recent demographic survey provide a statement about data privacy and confidentiality? If yes, please provide the statement.
Question 8: Did the most recent demographic survey provide a statement about intended data use? If yes, please provide the statement.
Question 9: Who maintains the demographic data collected by your organization? (e.g., contracted third party, organization executives)
Question 10: How has your organization used members' demographic data in the last five years? Examples: monitoring temporal changes in demographic diversity, publishing diversity data products, planning conferences, contributing to third-party researchers.
Question 11: What is the size of your organization (number of members or number of attendees at recent meetings)?
Question 12: What was the response rate (%) for your organization's most recent demographic survey?

*Organizations were also able to upload a copy of their demographics survey instead of responding to Questions 5-8. In that case, the uploaded survey was used (by the study authors) to answer Questions 5-8.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Open Science in (Higher) Education – data of the February 2017 survey
This data set contains:
Survey structure
The survey includes 24 questions, and its structure can be separated into five major themes: material used in courses (5 questions), OER awareness, usage and development (6), collaborative tools used in courses (2), assessment and participation options (5), and demographics (4). The last two questions are an open text question about general issues on these topics and individual open education experiences, and a request to provide the respondent's e-mail address for follow-up questions. The online survey was created with Limesurvey[1]. Several questions include filters, i.e. these questions were only shown if a participant chose a specific answer beforehand ([n/a] in the Excel file, [.] in SPSS).
Demographic questions
Demographic questions asked about current position, discipline, year of birth, and gender. The classification of research disciplines was adapted to the general disciplines at German higher education institutions. As we wanted a broad classification, we summarised several disciplines and came up with the following list, including the option "other" for respondents who did not identify with the proposed classification:
The current job position classification was likewise chosen according to common positions in Germany, including positions with teaching responsibility at higher education institutions. Here, we also included the option "other" for respondents who did not identify with the proposed classification:
We chose a free (numerical) text field for the respondent's year of birth because we did not want to pre-classify respondents into age intervals. This leaves us the option of running different analyses and correlating answers with respondents' age. A question about country was omitted because the survey was designed for academics in Germany.
Remark on OER question
Data from earlier surveys revealed that academics are often confused about the proper definition of OER[2]. Some seem to understand OER as any free resources, or refer only to open source software (Allen & Seaman, 2016, p. 11). Allen and Seaman (2016) decided to give a broad explanation of OER, avoiding details so as not to tempt participants into claiming awareness. There is thus a danger of introducing bias when giving an explanation. We decided not to give an explanation, but to keep this question simple. We assume that someone either knows about OER or does not. Respondents who had not heard of the term before probably do not use OER (at least not consciously) or create them.
Data collection
The target group of the survey was academics at German institutions of higher education, mainly universities and universities of applied sciences. To reach them, we sent the survey to diverse internal and external institutional mailing lists and via personal contacts. The included lists were discipline-based lists, lists from higher education and higher education didactics communities, as well as lists from open science and OER communities. Additionally, personal e-mails were sent to presidents and contact persons from those communities, and Twitter was used to spread the survey.
The survey was online from February 6th to March 3rd, 2017; e-mails were mainly sent at the beginning and around mid-term.
Data clearance
We received 360 responses, of which Limesurvey counted 208 as complete and 152 as incomplete. Two responses were marked as incomplete but turned out on inspection to be complete, and we added them to the complete responses. Thus, this data set includes 210 complete responses. Of the remaining 150 incomplete responses, 58 respondents did not answer the first question and 40 discontinued after the first question. The data show a constant decline in responses over the course of the survey; we did not detect any particular survey question with a high dropout rate. Incomplete responses were deleted and are not included in this data set.
Due to data privacy reasons, we deleted seven variables automatically assigned by Limesurvey: submitdate, lastpage, startlanguage, startdate, datestamp, ipaddr, refurl. We also deleted answers to question No 24 (email address).
References
Allen, E., & Seaman, J. (2016). Opening the Textbook: Educational Resources in U.S. Higher Education, 2015-16.
First results of the survey are presented in the poster:
Heck, Tamara, Blümel, Ina, Heller, Lambert, Mazarakis, Athanasios, Peters, Isabella, Scherp, Ansgar, & Weisel, Luzian. (2017). Survey: Open Science in Higher Education. Zenodo. http://doi.org/10.5281/zenodo.400561
Contact:
Open Science in (Higher) Education working group, see http://www.leibniz-science20.de/forschung/projekte/laufende-projekte/open-science-in-higher-education/.
[1] https://www.limesurvey.org
[2] The survey question about the awareness of OER gave a broad explanation, avoiding details to not tempt the participant to claim “aware”.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Question list for the questionnaire – Demographics and basic work characteristics of survey respondents
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
The STAMINA study examined the nutritional risks of low-income peri-urban mothers, infants and young children (IYC), and households in Peru during the COVID-19 pandemic. The study was designed to capture information through three repeated cross-sectional surveys at approximately 6-month intervals over an 18-month period, starting in December 2020. The surveys were carried out by telephone in November-December 2020, July-August 2021, and February-April 2022. The third survey took place over a longer period to allow for a household visit after the telephone interview.

The study areas were Manchay (Lima) and Huánuco district in the Andean highlands (~1,900 m above sea level). In each study area, we purposively selected the principal health centre and one subsidiary health centre. Peri-urban communities under the jurisdiction of these health centres were then selected to participate. Systematic random sampling was employed with quotas for IYC age (6-11, 12-17 and 18-23 months) to recruit a target sample size of 250 mother-infant pairs for each survey.

Data collected included: household socio-demographic characteristics; infant and young child feeding (IYCF) practices; child and maternal qualitative 24-hour dietary recalls/7-day food frequency questionnaires; household food insecurity experience measured using the validated Food Insecurity Experience Scale (FIES) survey module (Cafiero, Viviani, & Nord, 2018); and maternal mental health. In addition, questions assessed the impact of COVID-19 on households, including changes in employment status, adaptations to finances, sources of financial support, household food insecurity experience, and access to, and uptake of, well-child clinics and vaccination health services.

This folder includes the questionnaire for survey 3 in both English and Spanish. The corresponding dataset and dictionary of variables for survey 3 are available at 10.17028/rd.lboro.21741014
https://borealisdata.ca/api/datasets/:persistentId/versions/7.1/customlicense?persistentId=doi:10.7939/DVN/10004
The Population Research Laboratory (PRL), a member of the Association of Academic Survey Research Organizations (AASRO), seeks to advance the research, education and service goals of the University of Alberta by helping academic researchers and policy makers design and implement applied social science research projects. The PRL specializes in the gathering, analysis, and presentation of data about demographic, social and public issues. The PRL research team provides expert consultation and implementation of quantitative and qualitative research methods, project design, sample design, web-based, paper-based and telephone surveys, field site testing, data analysis and report writing. The PRL follows scientifically rigorous and transparent methods in each phase of a research project. Research Coordinators are members of the American Association for Public Opinion Research (AAPOR) and use best practices when conducting all types of research. The PRL has particular expertise in conducting computer-assisted telephone interviews (referred to as CATI surveys). When conducting telephone surveys, all calls are displayed as being from the "U of A PRL", a procedure that assures recipients that the call is not from a telemarketer, and thus helps increase response rates. The PRL maintains a complement of highly skilled telephone interviewers and supervisors who are thoroughly trained in FOIPP requirements, respondent selection procedures, questionnaire instructions, and neutral probing. A subset of interviewers are specially trained to convince otherwise reluctant respondents to participate in the study, a practice that increases response rates and lowers selection bias. PRL staff monitors data collection on a daily basis to allow any necessary adjustments to the volume and timing of calls and respondent selection criteria. The Population Research Laboratory (PRL) administered the 2012 Alberta Survey B. 
This survey of households across the province of Alberta continues to enable academic researchers, government departments, and non-profit organizations to explore a wide range of topics in a structured research framework and environment. Sponsors' research questions are asked together with demographic questions in a telephone interview of Alberta households. The data consist of information from 1,207 Alberta residents interviewed between June 5, 2012 and June 27, 2012. The response rate, calculated as the number of people who participated in the survey divided by the number selected in the eligible sample, was 27.6% for Survey B. The subject areas included in the 2012 Alberta Survey B include socio-demographic and background variables such as household composition, age, gender, marital status, highest level of education, household income, religion, ethnic background, place of birth, employment status, home ownership, political party support, and perceptions of financial status. In addition, the survey covered public health and injury control, tobacco reduction, activity limitations and personal directives, unions, politics, and health.
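The response-rate arithmetic can be checked directly. Since the eligible sample size is not stated in this description, it can only be back-computed from the figures given; the value below is an illustration derived from the stated numbers, not a reported figure:

```python
completed = 1207          # interviews completed for Survey B
response_rate = 0.276     # reported response rate (27.6%)

# Implied number of selections in the eligible sample
implied_eligible = completed / response_rate
print(round(implied_eligible))   # → 4373
```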
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Description
The research employed a mixed-methods online survey to better understand the meaning, use, and development of academic research software at the University of Illinois Urbana-Champaign. Other objectives included understanding the academic research software support and training needs required to make projects successful at Illinois, as well as investigating the use of generative AI tools in using and creating research software.
At the beginning of the survey, all participants gave informed consent. The University of Illinois Urbana-Champaign Institutional Review Board (IRB Protocol no.: Project IRB24-0989) reviewed the study and gave it an exempt determination.
Data collection took place from August 2024 to October 2024. Prior to data analysis, identifiable respondent details were removed during data cleaning. "Not Applicable" and "Unsure" style responses were included in descriptive statistics but excluded from inferential statistics.
Survey design
At the beginning of the online survey, a consent form based on guidelines from the University of Illinois Institutional Review Board was provided to respondents, stating the aims of the study, its benefits and risks, ethical guidelines, the voluntary nature of participation and withdrawal, privacy and confidentiality, data security, the estimated time for survey completion, and the researchers' contact information for questions. Respondents clicked to indicate their consent. Survey questions were divided into four parts: demographic information, using software for research, creating software for research, and the protocol of citing software for research. The survey had two stop points, since not all questions applied to all respondents, which led to different sample sizes at the stop points. At the opening of the survey, the number of respondents was 251, with the funding demographic question answered by all respondents, while the other demographic questions had between 225 and 228 respondents. At the first stop question, on using research software in their research, the total number of respondents was 212, and at the last stop question, on considering themselves to be research software developers, the total number of respondents was 74. The last question of the survey was answered by 71 respondents. Respondents may also have left the survey for other reasons. The questions were primarily closed-type questions with single choice, multiple choice, or Likert scales, as well as a few open-ended questions. Likert scale responses were created using validated scales from Vagias' (2006) Likert-Type Scale Response Anchors.
Sampling
Survey Respondents’ Demographics
While the single largest group of respondents was Tenure Track Faculty (34.7%, f=227), other key categories included Principal Investigator (22.4%, f=227) and Research Scientist (12.1%, f=227). Computer Science, Information Science, Mathematics, and Engineering fields combined accounted for 16% (f=228) of the respondents surveyed, but it should be noted that the remaining respondents came from academic fields across campus, from the arts, humanities, and social sciences (25%, f=228) to agriculture (10%, f=228), education (5%, f=228), economics (3%, f=228), medical sciences (4%, f=228), and politics and policy/law (1%, f=228). Most respondents received funding from various government agencies. A more detailed breakdown of the demographic information can be found in the supplemental figures. Of the 74 respondents who answered whether they were a research software developer, most did not consider themselves one, answering Not at All (39%, n=74) or Slightly (22%, n=74). In addition, open-ended questions asked for further detail about research software titles used in research, research software developer challenges, how generative AI assisted in creating research software, and how research software is preserved (e.g., reproducibility).
Table 1: Survey Respondents' Demographics (Characteristics; Respondent %)

Age: 18-24: 3%; 25-34: 14%; 35-44: 33%; 45-54: 27%; 55-64: 14%; Over 64: 7%; Prefer not to answer: 2%

Gender: Woman: 49%; Man: 44%; Non-binary / non-conforming: 2%; Prefer not to answer: 4%

Race: Asian: 12%; Black or African American: 5%; Hispanic or Latino: 6%; Middle Eastern or North African (MENA; new): 1%; White: 67%; Prefer not to answer: 8%; Other: 1%

Highest Degree: Bachelors: 6%; Masters: 19%; Professional degree (e.g., J.D.): 5%; Doctorate: 70%

Professional Title: Tenure Track Faculty: 35%; Principal Investigator: 22%; Research Scientist: 12%; Staff: 8%; Research Faculty: 7%; Other: 4%; Teaching Faculty: 4%; Postdoc: 4%; Research Assistant: 2%; Research Software Engineer: 2%

Academic Field: Biological Sciences: 18%; Other: 10%; Agriculture: 10%; Engineering: 9%; Psychology: 8%; Earth Sciences: 6%; Physical Sciences: 6%; Education: 5%; Medical & Health Sciences: 4%; Computer Science: 3%; Library: 3%; Chemical Sciences: 3%; Human Society: 3%; Economics: 3%; Information Science: 2%; Environment: 2%; Veterinary: 2%; Mathematical Sciences: 2%; History: 1%; Architecture: 1%; Politics and Policy: 1%; Law: 0%

Years Since Last Degree: Less than 1 Year: 4%; 1-2 Years: 8%; 3-5 Years: 11%; 6-9 Years: 14%; 10-15 Years: 24%; More than 15 Years: 40%

Receive Funding: Yes: 73%; No: 27%

Funders for Research: Other: 22%; National Science Foundation (NSF): 18%; United States Department of Agriculture (USDA): 18%; National Institute of Health (NIH): 11%; Department of Energy (DOE): 9%; Department of Defense (DOD): 5%; Environmental Protection Agency (EPA): 4%; National Aeronautics and Space Administration (NASA): 4%; Bill and Melinda Gates Foundation: 2%; Advanced Research Projects Agency - Energy (ARPA-E): 2%; Institute of Education Sciences: 1%; Alfred P. Sloan Foundation: 1%; W.M. Keck Foundation: 1%; Simons Foundation: 1%; Gordon and Betty Moore Foundation: 1%; Department of Justice (DOJ): 1%; National Endowment for the Humanities (NEH): 0%; Congressionally Directed Medical Research Programs (CDMRP): 0%; Andrew W. Mellon Foundation: 0%
Table 2: Survey Codebook (QuestionID; Variable; Variable Label; Survey Item; Response Options)

Q1 – age – Respondent's Age
Section header: Demographics. "Thank you for your participation in this survey today! Before you begin to answer questions about academic research software, please answer a few demographic questions to better contextualize your responses to other survey questions."
Survey item: What is your age? Select one choice. (Years)
Response options: 1 = Under 18; 2 = 18-24; 3 = 25-34; 4 = 35-44; 5 = 45-54; 6 = 55-64; 7 = Over 64; 8 = Prefer not to answer

Q2 – gender – Respondent's Gender
Survey item: What is your gender? Select one choice.
Response options: 1 = Female; 2 = Male; 3 = Transgender; 4 = Non-binary / non-conforming; 5 = Prefer not to answer; 6 = Other:

Q3 – race – Respondent's Race
Survey item: What is your race? Select one choice.
Response options: 1 = American Indian or Alaska Native; 2 = Asian; 3 = Black or African American; 4 = Hispanic or Latino; 5 = Middle Eastern or North African (MENA; new); 6 = Native Hawaiian or Pacific Islander; 7 = White; 8 = Prefer not to answer; 9 = Other:

Q4 – highest_degree – Respondent's Highest Degree
Survey item: What is the highest degree you have completed? Select one choice.
Response options: 1 = None; 2 = High school; 3 = Associate; 4 = Bachelor's; 5 = Master's; 6 = Professional degree (e.g., J.D.); 7 = Doctorate; 8 = Other:

Q5 – professional_title – Respondent's Professional Title
Survey item: What is your professional title? Select all that apply.
Response options: 1 (professional_title_1) = Principal Investigator; 2 (professional_title_2) = Tenure Track Faculty; 3 (professional_title_3) = Teaching Faculty; 4 (professional_title_4) = Research Faculty; 5 (professional_title_5) = Research Scientist; 6 (professional_title_6) = Research Software Engineer; 7 (professional_title_7) = Staff; 8 (professional_title_8) = Postdoc; 9 (professional_title_9) = Research Assistant; 10 (professional_title_10) = Other:

Q6 – academic_field – Respondent's most strongly identified Academic Field
Survey item: What is the academic field or discipline you most strongly identify with (e.g., Psychology, Computer Science)? Select one choice.
Response options: 1 = Chemical sciences; 2 = Biological sciences; 3 = Medical & health sciences; 4 = Physical sciences; 5 = Mathematical sciences; 6 = Earth sciences; 7 = Agriculture; 8 = Veterinary; 9 = Environment; 10 = Psychology; 11 = Law; 12 = Philosophy; 13 = Economics; 14 = Human society; 15 = Journalism; 16 = Library; 17 = Education; 18 = Art & Design Management; 19 = Engineering; 20 = Language; 21 = History; 22 = Politics and policy; 23 = Architecture; 24 = Computer Science; 25 = Information science; 26 = Other:

Q7 – years_since_last_degree – Number of years since respondent's last degree
Survey item: How many years since the award of your last completed degree? Select one choice.
Response options: 1 = Less than 1 year; 2 = 1-2 years; 3 = 3-5 years; 4 = 6-9 years; 5 = 10-15 years; 6 = More than 15 years

Q8 – receive_funding_for_research – Whether respondent received funding for research
Survey item: Do you receive funding for your research?
Response options: 1 = Yes; 0 = No

Q9 – funders_for_research – Respondent's funding sources if they answered yes in Question 8
Survey item: Who funds your research or work (e.g., NIH, Gates Foundation)? Select all that apply.
Response options: 1 (funders_for_research_1) = United States Department of Agriculture (USDA); 2 (funders_for_research_2) = Department of Energy (DOE); 3 (funders_for_research_3) = National Science
The Afrobarometer is a comparative series of public attitude surveys that assess African citizens' attitudes to democracy and governance, markets, and civil society, among other topics. The surveys have been undertaken at periodic intervals since 1999, and the Afrobarometer's coverage has increased over time. Round 1 (1999-2001) initially covered 7 countries and was later extended to 12 countries. Round 2 (2002-2004) surveyed citizens in 16 countries, Round 3 (2005-2006) 18 countries, Round 4 (2008) 20 countries, Round 5 (2011-2013) 34 countries, Round 6 (2014-2015) 36 countries, and Round 7 (2016-2018) 34 countries. The survey covered 34 countries in Round 8 (2019-2021).
National coverage
Individual
Citizens aged 18 years and above excluding those living in institutionalized buildings.
Sample survey data [ssd]
Afrobarometer uses national probability samples designed to be representative cross-sections of all citizens of voting age in a given country. The goal is to give every adult citizen an equal and known chance of being selected for an interview. They achieve this by:
• using random selection methods at every stage of sampling; • sampling at all stages with probability proportionate to population size wherever possible to ensure that larger (i.e., more populated) geographic units have a proportionally greater probability of being chosen into the sample.
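The second point, sampling with probability proportionate to population size (PPS), is commonly implemented with the cumulative-size systematic method; the sketch below is illustrative only and is not Afrobarometer's actual field procedure:

```python
import random

def pps_systematic(sizes, n_select, seed=None):
    """Systematic PPS selection: units with larger population counts
    get a proportionally larger chance of entering the sample."""
    rng = random.Random(seed)
    interval = sum(sizes) / n_select
    start = rng.uniform(0, interval)
    points = [start + k * interval for k in range(n_select)]
    chosen, cum, i = [], 0, 0
    for p in points:                   # points are ascending
        while cum + sizes[i] <= p:     # advance to the unit covering p
            cum += sizes[i]
            i += 1
        chosen.append(i)
    return chosen

# Example: 6 hypothetical geographic units with unequal populations, draw 3
print(pps_systematic([500, 120, 80, 900, 300, 100], 3, seed=42))
```

A unit whose size exceeds the sampling interval is selected with certainty, which is exactly the property that keeps heavily populated areas from being underrepresented.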
The sampling universe normally includes all citizens age 18 and older. As a standard practice, we exclude people living in institutionalized settings, such as students in dormitories, patients in hospitals, and persons in prisons or nursing homes. Occasionally, we must also exclude people living in areas determined to be inaccessible due to conflict or insecurity. Any such exclusion is noted in the technical information report (TIR) that accompanies each data set.
Sample size and design

Samples usually include either 1,200 or 2,400 cases. A randomly selected sample of n=1200 cases allows inferences to national adult populations with a margin of sampling error of no more than +/-2.8% with a confidence level of 95 percent. With a sample size of n=2400, the margin of error decreases to +/-2.0% at the 95 percent confidence level.
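These quoted margins follow from the standard worst-case formula for a simple random sample, MOE = z * sqrt(p(1-p)/n) with p = 0.5 and z ≈ 1.96; a quick check:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Worst-case sampling margin of error at ~95% confidence."""
    return z * math.sqrt(p * (1 - p) / n)

print(f"n=1200: +/-{100 * margin_of_error(1200):.1f}%")  # → n=1200: +/-2.8%
print(f"n=2400: +/-{100 * margin_of_error(2400):.1f}%")  # → n=2400: +/-2.0%
```

Note that these figures assume simple random sampling; a clustered, stratified design of the kind described here would inflate the effective margin somewhat (the design effect).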
The sample design is a clustered, stratified, multi-stage, area probability sample. Specifically, we first stratify the sample according to the main sub-national unit of government (state, province, region, etc.) and by urban or rural location.
Area stratification reduces the likelihood that distinctive ethnic or language groups are left out of the sample. Afrobarometer occasionally purposely oversamples certain populations that are politically significant within a country to ensure that the size of the sub-sample is large enough to be analysed. Any oversampling is noted in the TIR.
Sample stages

Samples are drawn in either four or five stages:
Stage 1: In rural areas only, the first stage is to draw secondary sampling units (SSUs). SSUs are not used in urban areas, and in some countries they are not used in rural areas. See the TIR that accompanies each data set for specific details on the sample in any given country. Stage 2: We randomly select primary sampling units (PSU). Stage 3: We then randomly select sampling start points. Stage 4: Interviewers then randomly select households. Stage 5: Within the household, the interviewer randomly selects an individual respondent. Each interviewer alternates in each household between interviewing a man and interviewing a woman to ensure gender balance in the sample.
To keep the costs and logistics of fieldwork within manageable limits, eight interviews are clustered within each selected PSU.
Gabon
- Sample size: 1,200
- Sampling frame: the 2013 Recensement Général de la Population et des Logements (RGPL), conducted by the Direction Générale de la Statistique et des Etudes Economiques
- Sample design: representative, random, clustered, stratified, multi-stage area probability sample
- Stratification: province, department, and urban-rural location
- Stages: primary sampling units (PSUs), start points, households, respondents
- PSU selection: probability proportionate to population size (PPPS)
- Cluster size: 8 households per PSU
- Household selection: randomly selected start points, followed by a walk pattern using a 5/10 interval
- Respondent selection: gender quota achieved by alternating interviews between men and women; potential respondents (i.e. household members) of the appropriate gender are listed, then the computer chooses the individual at random
Face-to-face [f2f]
The Round 8 questionnaire has been developed by the Questionnaire Committee after reviewing the findings and feedback obtained in previous Rounds, and securing input on preferred new topics from a host of donors, analysts, and users of the data.
The questionnaire consists of three parts: 1. Part 1 captures the steps for selecting households and respondents, and includes the introduction to the respondent (pp. 1-4). This section is filled in by the Fieldworker. 2. Part 2 covers the core attitudinal and demographic questions that are asked by the Fieldworker and answered by the Respondent (Q1-Q100). 3. Part 3 includes contextual questions about the setting and atmosphere of the interview, and collects information on the Fieldworker. This section is completed by the Fieldworker (Q101-Q123).
Outcome rates: - Contact rate: 99% - Cooperation rate: 92% - Refusal rate: 3% - Response rate: 91%
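As a consistency check, under AAPOR-style definitions the response rate is approximately the product of the contact and cooperation rates, which matches the figures above (the exact AAPOR formulas depend on the full disposition-code breakdown, not shown here):

```python
contact_rate = 0.99        # reported contact rate
cooperation_rate = 0.92    # reported cooperation rate

# Response rate ≈ contact × cooperation (approximation)
response_rate = contact_rate * cooperation_rate
print(f"{response_rate:.0%}")  # → 91%
```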
+/- 3% at 95% confidence level
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Data are from responses to demographic questions in the questionnaire on randomization. Older participants (66 years ±16 vs. 61 years ±16, p = 0.02) and Maori (66% vs. 29%, p<0.001) were less likely to complete the questionnaire; however, there were no differences between randomized groups. The total completion rate was higher for the simplified ICF + booklet (75%) compared to the standard ICF (64%, p = 0.05) and the short ICF + booklet (62%, p = 0.04).
The primary objective of the 2018 ZDHS was to provide up-to-date estimates of basic demographic and health indicators. Specifically, the ZDHS collected information on: - Fertility levels and preferences; contraceptive use; maternal and child health; infant, child, and neonatal mortality levels; maternal mortality; and gender, nutrition, and awareness regarding HIV/AIDS and other health issues relevant to the achievement of the Sustainable Development Goals (SDGs) - Ownership and use of mosquito nets as part of the national malaria eradication programmes - Health-related matters such as breastfeeding, maternal and childcare (antenatal, delivery, and postnatal), children’s immunisations, and childhood diseases - Anaemia prevalence among women age 15-49 and children age 6-59 months - Nutritional status of children under age 5 (via weight and height measurements) - HIV prevalence among men age 15-59 and women age 15-49 and behavioural risk factors related to HIV - Assessment of situation regarding violence against women
National coverage
The survey covered all de jure household members (usual residents), all women age 15-49, all men age 15-59, and all children age 0-5 years who are usual members of the selected households or who spent the night before the survey in the selected households.
Sample survey data [ssd]
The sampling frame used for the 2018 ZDHS is the Census of Population and Housing (CPH) of the Republic of Zambia, conducted in 2010 by ZamStats. Zambia is divided into 10 provinces. Each province is subdivided into districts, each district into constituencies, and each constituency into wards. In addition to these administrative units, during the 2010 CPH each ward was divided into convenient areas called census supervisory areas (CSAs), and in turn each CSA was divided into enumeration areas (EAs). An enumeration area is a geographical area assigned to an enumerator for the purpose of conducting a census count; according to the Zambian census frame, each EA consists of an average of 110 households.
The current version of the EA frame for the 2010 CPH was updated to accommodate some changes in districts and constituencies that occurred between 2010 and 2017. The list of EAs incorporates census information on households and population counts. Each EA has a cartographic map delineating its boundaries, with identification information and a measure of size, which is the number of residential households enumerated in the 2010 CPH. This list of EAs was used as the sampling frame for the 2018 ZDHS.
The 2018 ZDHS followed a stratified two-stage sample design. The first stage involved selecting sample points (clusters) consisting of EAs. EAs were selected with a probability proportional to their size within each sampling stratum. A total of 545 clusters were selected.
The second stage involved systematic sampling of households. A household listing operation was undertaken in all of the selected clusters. During the listing, an average of 133 households were found in each cluster, from which a fixed number of 25 households were selected through an equal probability systematic selection process, to obtain a total sample size of 13,625 households. Results from this sample are representative at the national, urban and rural, and provincial levels.
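The two-stage design described above — PPS selection of clusters, then equal-probability systematic selection of a fixed number of households per cluster — can be sketched in Python. This is a toy illustration, not ZamStats' actual procedure; the frame sizes and seeds below are invented:

```python
import random

def pps_systematic(sizes, n_clusters, seed=2018):
    """Systematic PPS selection: one random start, fixed skip along the
    cumulative measure of size, so larger EAs are more likely to be drawn."""
    random.seed(seed)
    total = sum(sizes)
    step = total / n_clusters
    start = random.uniform(0, step)
    points = [start + k * step for k in range(n_clusters)]
    chosen, cum, i = [], 0, 0
    for idx, size in enumerate(sizes):
        cum += size
        while i < n_clusters and points[i] <= cum:
            chosen.append(idx)
            i += 1
    return chosen

def systematic_households(n_listed, n_take=25, seed=0):
    """Equal-probability systematic selection of n_take listed households."""
    random.seed(seed)
    step = n_listed / n_take
    start = random.uniform(0, step)
    return [int(start + k * step) for k in range(n_take)]

# Toy frame: 40 EAs of roughly the census average size (~110 households)
eas = [100 + (i * 7) % 30 for i in range(40)]
clusters = pps_systematic(eas, n_clusters=5)
hh = systematic_households(n_listed=133)   # the average found per cluster
```

The systematic skip is what keeps the second stage equal-probability: every listed household faces the same `n_take / n_listed` chance of selection regardless of where it sits in the list.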
For further details on sample selection, see Appendix A of the final report.
Face-to-face [f2f]
Four questionnaires were used in the 2018 ZDHS: the Household Questionnaire, the Woman’s Questionnaire, the Man’s Questionnaire, and the Biomarker Questionnaire. The questionnaires, based on The DHS Program’s Model Questionnaires, were adapted to reflect the population and health issues relevant to Zambia. Input on questionnaire content was solicited from various stakeholders representing government ministries and agencies, nongovernmental organisations, and international cooperating partners. After all questionnaires were finalised in English, they were translated into seven local languages: Bemba, Kaonde, Lozi, Lunda, Luvale, Nyanja, and Tonga. In addition, information about the fieldworkers for the survey was collected through a self-administered Fieldworker Questionnaire.
All electronic data files were transferred via a secure internet file streaming system to the ZamStats central office in Lusaka, where they were stored on a password-protected computer. The data processing operation included secondary editing, which required resolution of computer-identified inconsistencies and coding of open-ended questions. The data were processed by two IT specialists and one secondary editor who took part in the main fieldwork training; they were supervised remotely by staff from The DHS Program. Data editing was accomplished using CSPro software. During the fieldwork, field-check tables were generated to check various data quality parameters, and specific feedback was given to the teams to improve performance. Secondary editing and data processing were initiated in July 2018 and completed in March 2019.
Of the 13,595 households in the sample, 12,943 were occupied. Of these occupied households, 12,831 were successfully interviewed, yielding a response rate of 99%.
In the interviewed households, 14,189 women age 15-49 were identified as eligible for individual interviews; 13,683 women were interviewed, yielding a response rate of 96% (the same rate achieved in the 2013-14 survey). A total of 13,251 men were eligible for individual interviews; 12,132 of these men were interviewed, producing a response rate of 92% (a 1 percentage point increase from the previous survey).
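The response rates quoted above follow directly from the interview counts; a minimal sketch using the ZDHS figures:

```python
def response_rate(completed, eligible):
    """Survey response rate as a whole-number percentage."""
    return round(100 * completed / eligible)

# Figures quoted for the 2018 ZDHS
household = response_rate(12_831, 12_943)   # occupied households interviewed
women = response_rate(13_683, 14_189)       # eligible women interviewed
men = response_rate(12_132, 13_251)         # eligible men interviewed
```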
Of the households successfully interviewed, 12,505 were interviewed in 2018 and 326 in 2019. Because the large majority of households were interviewed in 2018, the reference year for the survey indicators is 2018.
The estimates from a sample survey are affected by two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2018 Zambia Demographic and Health Survey (ZDHS) to minimise this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2018 ZDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
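The rule of thumb above (estimate plus or minus two standard errors) is a one-liner; the figures in the example below are invented for illustration:

```python
def confidence_interval(estimate, se, z=2.0):
    """Approximate 95% CI: estimate plus or minus z times the standard error."""
    return (estimate - z * se, estimate + z * se)

# e.g. an estimated proportion of 0.45 with a standard error of 0.01
low, high = confidence_interval(0.45, 0.01)
```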
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2018 ZDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulas. Sampling errors are computed in SAS, using programs developed by ICF. These programs use the Taylor linearisation method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
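A simplified sketch of delete-one-cluster jackknife variance estimation, the idea behind the jackknife repeated replication method mentioned above (the actual ICF programs additionally handle stratification and sampling weights, which are omitted here):

```python
import statistics

def jackknife_se(cluster_values):
    """Delete-one-cluster jackknife standard error for a mean-type statistic:
    recompute the estimate k times, each time dropping one cluster, and
    measure the spread of the replicate estimates."""
    k = len(cluster_values)
    total = sum(cluster_values)
    full = total / k
    # One replicate estimate per dropped cluster
    reps = [(total - v) / (k - 1) for v in cluster_values]
    var = (k - 1) / k * sum((r - full) ** 2 for r in reps)
    return var ** 0.5
```

For the simple mean this reproduces the textbook standard error (sample standard deviation divided by the square root of n); its value is that the same recipe extends to complex statistics such as rates and ratios, where no closed-form variance exists.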
Note: A more detailed description of estimates of sampling errors is presented in APPENDIX B of the survey report.
Data Quality Tables - Household age distribution - Age distribution of eligible and interviewed women - Age distribution of eligible and interviewed men - Completeness of reporting - Births by calendar years - Reporting of age at death in days - Reporting of age at death in months - Completeness of information on siblings - Sibship size and sex ratio of siblings - Height and weight data completeness and quality for children - Number of enumeration areas completed by month, according to province, Zambia DHS 2018
Note: Data quality tables are presented in APPENDIX C of the report.
https://borealisdata.ca/api/datasets/:persistentId/versions/3.1/customlicense?persistentId=doi:10.7939/DVN/10573
The Population Research Laboratory (PRL) administered the 2013 Alberta Survey. This survey of households across the province of Alberta continues to enable academic researchers, government departments, and non-profit organizations to explore a wide range of topics in a structured research framework and environment. Sponsors’ research questions are asked together with demographic questions in a telephone interview of Alberta households.
The Afrobarometer is a comparative series of public attitude surveys that assess African citizens' attitudes toward democracy and governance, markets, and civil society, among other topics. The surveys have been undertaken at periodic intervals since 1999, and the Afrobarometer's coverage has increased over time: Round 1 (1999-2001) initially covered 7 countries and was later extended to 12 countries, Round 2 (2002-2004) surveyed citizens in 16 countries, Round 3 (2005-2006) 18 countries, Round 4 (2008) 20 countries, Round 5 (2011-2013) 34 countries, Round 6 (2014-2015) 36 countries, Round 7 (2016-2018) 34 countries, and Round 8 (2019-2021) 34 countries.
National coverage.
Individual
Citizens who are 18 years and older.
Sample survey data [ssd]
Afrobarometer Sampling Procedure
Afrobarometer uses national probability samples designed to generate a representative cross-section of all citizens of voting age in a given country. The goal is to give every adult citizen an equal and known chance of being selected for an interview. This is achieved by:
• using random selection methods at every stage of sampling; • sampling at all stages with probability proportionate to population size wherever possible to ensure that larger (i.e., more populated) geographic units have a proportionally greater probability of being chosen into the sample.
The sampling universe normally includes all citizens age 18 and older. As a standard practice, we exclude people living in institutionalized settings, such as students in dormitories, patients in hospitals, and persons in prisons or nursing homes. Occasionally, we must also exclude people living in areas determined to be inaccessible due to conflict or insecurity. Any such exclusion is noted in the technical information report (TIR) that accompanies each data set.
Sample size and design: Samples usually include either 1,200 or 2,400 cases. A randomly selected sample of n = 1,200 allows inferences to the national adult population with a margin of sampling error of no more than ±2.8% at a 95% confidence level. With a sample size of n = 2,400, the margin of error decreases to ±2.0% at the same confidence level.
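The quoted margins of error are reproduced by the standard formula for a proportion under simple random sampling, taking the worst case p = 0.5:

```python
import math

def margin_of_error(n, p=0.5, z=1.96, deff=1.0):
    """Margin of error at 95% confidence for an estimated proportion p;
    deff=1 assumes simple random sampling (clustered designs inflate it)."""
    return z * math.sqrt(deff * p * (1 - p) / n)

moe_1200 = margin_of_error(1200)   # about 0.028, i.e. +/-2.8 points
moe_2400 = margin_of_error(2400)   # about 0.020, i.e. +/-2.0 points
```

Doubling the sample size does not halve the margin of error: it shrinks with the square root of n, which is why going from 1,200 to 2,400 cases only improves precision from ±2.8% to ±2.0%.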
The sample design is a clustered, stratified, multi-stage, area probability sample. Specifically, we first stratify the sample according to the main sub-national unit of government (state, province, region, etc.) and by urban or rural location.
Area stratification reduces the likelihood that distinctive ethnic or language groups are left out of the sample. Afrobarometer occasionally purposely oversamples certain populations that are politically significant within a country to ensure that the size of the sub-sample is large enough to be analysed. Any oversampling is noted in the TIR.
Sample stages: Samples are drawn in either four or five stages:
Stage 1: In rural areas only, the first stage is to draw secondary sampling units (SSUs). SSUs are not used in urban areas, and in some countries they are not used in rural areas. See the TIR that accompanies each data set for specific details on the sample in any given country.
Stage 2: We randomly select primary sampling units (PSUs).
Stage 3: We then randomly select sampling start points.
Stage 4: Interviewers then randomly select households.
Stage 5: Within the household, the interviewer randomly selects an individual respondent. Each interviewer alternates between interviewing a man and interviewing a woman in successive households to ensure gender balance in the sample.
To keep the costs and logistics of fieldwork within manageable limits, eight interviews are clustered within each selected PSU.
Burkina Faso - Sample size: 1,200 - Sampling Frame: Recensement Général de la Population et de l'Habitation 2006 - Sample design: Nationally representative, random, clustered, stratified, multi-stage area probability sample - Stratification: Region and urban-rural location - PSU selection: Probability Proportionate to Population Size (PPPS) - Cluster size: 8 households per PSU - Household selection: Randomly selected start points, followed by walk pattern using 5/10 interval - Respondent selection: Gender quota filled by alternating interviews between men and women; respondents of appropriate gender listed, after which household member draws a numbered card to select individual.
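The household walk pattern and respondent draw described for Burkina Faso can be sketched as follows. This is a toy illustration under stated assumptions: real field procedures include rules (occupied-dwelling checks, callbacks, substitution) not modeled here, and the member dictionaries are an invented representation:

```python
import random

def walk_pattern(start, n_households=8, interval=5):
    """Select every `interval`-th dwelling from a random start point
    (5 in urban areas, 10 in rural areas per the design above),
    until the cluster quota of households is reached."""
    return [start + k * interval for k in range(n_households)]

def select_respondent(household, want_gender, rng=random):
    """List adult members of the required gender (set by the alternating
    gender quota), then draw one at random -- standing in for the
    numbered-card draw described above."""
    eligible = [m for m in household
                if m["gender"] == want_gender and m["age"] >= 18]
    return rng.choice(eligible) if eligible else None
```

If no household member matches the required gender, the sketch returns `None`, i.e. the interviewer would move on according to the field protocol.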
Face-to-face [f2f]
The Round 8 questionnaire has been developed by the Questionnaire Committee after reviewing the findings and feedback obtained in previous Rounds, and securing input on preferred new topics from a host of donors, analysts, and users of the data.
The questionnaire consists of three parts:
1. Part 1 captures the steps for selecting households and respondents and includes the introduction to the respondent (pp. 1-4). This section is filled in by the Fieldworker.
2. Part 2 covers the core attitudinal and demographic questions that are asked by the Fieldworker and answered by the Respondent (Q1-Q100).
3. Part 3 includes contextual questions about the setting and atmosphere of the interview and collects information on the Fieldworker. This section is completed by the Fieldworker (Q101-Q123).
Outcome rates: - Contact rate: 90% - Cooperation rate: 88% - Refusal rate: 3% - Response rate: 79%
+/- 3% at 95% confidence level
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Granite State Poll is a quarterly poll conducted by the University of New Hampshire Survey Center. The poll sample consists of about 500 New Hampshire adults with a working telephone across the state. Each poll contains a series of basic demographic questions that are repeated in future polls, as well as a set of unique questions submitted by clients. This poll includes two questions related to preferences about dams, designed by Natallia Leuchanka Diessner, Catherine M. Ashcraft, Kevin H. Gardner, and Lawrence C. Hamilton as part of the "Future of Dams" project. The accompanying Technical Report, written by the UNH Survey Center, describes the protocols and standards of Granite State Poll #68 (Client Poll). The first file is a screenshot of the Technical Report to provide a preview for Figshare; the second file is the Technical Report in Microsoft Word format.
The 2017-18 Bangladesh Demographic and Health Survey (2017-18 BDHS) is a nationwide survey with a nationally representative sample of approximately 20,250 selected households. All ever-married women age 15-49 who are usual members of the selected households or who spent the night before the survey in the selected households were eligible for individual interviews. The survey was designed to produce reliable estimates for key indicators at the national level as well as for urban and rural areas and each of the country’s eight divisions: Barishal, Chattogram, Dhaka, Khulna, Mymensingh, Rajshahi, Rangpur, and Sylhet.
The main objective of the 2017-18 BDHS is to provide up-to-date information on fertility and fertility preferences; childhood mortality levels and causes of death; awareness, approval, and use of family planning methods; maternal and child health, including breastfeeding practices and nutritional status; newborn care; women’s empowerment; selected noncommunicable diseases (NCDs); and availability and accessibility of health and family planning services at the community level.
This information is intended to assist policymakers and program managers in monitoring and evaluating the 4th Health, Population and Nutrition Sector Program (4th HPNSP) 2017-2022 of the Ministry of Health and Family Welfare (MOHFW) and to provide estimates for 14 major indicators of the HPNSP Results Framework (MOHFW 2017).
National coverage
The survey covered all de jure household members (usual residents), all women aged 15-49 and all children aged 0-5 resident in the household.
Sample survey data [ssd]
The sample for the 2017-18 BDHS is nationally representative and covers the entire population residing in non-institutional dwelling units in the country. The survey used a list of enumeration areas (EAs) from the 2011 Population and Housing Census of the People’s Republic of Bangladesh, provided by the Bangladesh Bureau of Statistics (BBS), as a sampling frame (BBS 2011). The primary sampling unit (PSU) of the survey is an EA with an average of about 120 households.
Bangladesh consists of eight administrative divisions: Barishal, Chattogram, Dhaka, Khulna, Mymensingh, Rajshahi, Rangpur, and Sylhet. Each division is divided into zilas and each zila into upazilas. Each urban area in an upazila is divided into wards, which are further subdivided into mohallas. A rural area in an upazila is divided into union parishads (UPs) and, within UPs, into mouzas. These divisions allow the country as a whole to be separated into rural and urban areas.
The survey is based on a two-stage stratified sample of households. In the first stage, 675 EAs (250 in urban areas and 425 in rural areas) were selected with probability proportional to EA size. The sample in that stage was drawn by BBS, following the specifications provided by ICF that include cluster allocation and instructions on sample selection. A complete household listing operation was then carried out in all selected EAs to provide a sampling frame for the second-stage selection of households. In the second stage of sampling, a systematic sample of an average of 30 households per EA was selected to provide statistically reliable estimates of key demographic and health variables for the country as a whole, for urban and rural areas separately, and for each of the eight divisions. Based on this design, 20,250 residential households were selected. Completed interviews were expected from about 20,100 ever-married women age 15-49. In addition, in a subsample of one-fourth of the households (about 7-8 households per EA), all ever-married women age 50 and older, never-married women age 18 and older, and men age 18 and older were weighed and had their height measured. In the same households, blood pressure and blood glucose testing were conducted for all adult men and women age 18 and older.
The survey was successfully carried out in 672 clusters after elimination of three clusters (one urban and two rural) that were completely eroded by floodwater. These clusters were in Dhaka (one urban cluster), Rajshahi (one rural cluster), and Rangpur (one rural cluster). A total of 20,160 households were selected for the survey.
For further details on sample selection, see Appendix A of the final report.
Computer Assisted Personal Interview [capi]
The 2017-18 BDHS used six types of questionnaires: (1) the Household Questionnaire, (2) the Woman’s Questionnaire (completed by ever-married women age 15-49), (3) the Biomarker Questionnaire, (4) two verbal autopsy questionnaires to collect data on causes of death among children under age 5, (5) the Community Questionnaire, and (6) the Fieldworker Questionnaire. The first three questionnaires were based on the model questionnaires developed for the DHS-7 Program, adapted to the situation and needs in Bangladesh and taking into account the content of the instruments employed in prior BDHS surveys. The verbal autopsy module was replicated from the questionnaires used in the 2011 BDHS, as the objectives of the 2011 BDHS and the 2017-18 BDHS were the same. The module was adapted from the standardized WHO 2016 verbal autopsy module. The Community Questionnaire was adapted from the version used in the 2014 BDHS. The adaptation process for the 2017-18 BDHS involved a series of meetings with a technical working group. Additionally, draft questionnaires were circulated to other interested groups and were reviewed by the TWG and SAC. The questionnaires were developed in English and then translated into and printed in Bangla. Back translations were conducted by people not involved with the Bangla translations.
Completed BDHS questionnaires were returned to Dhaka every 2 weeks for data processing at Mitra and Associates offices. Data processing began shortly after fieldwork commenced and consisted of office editing, coding of open-ended questions, data entry, and editing of inconsistencies found by the computer program. The field teams were alerted regarding any inconsistencies or errors found during data processing. Eight data entry operators and two data entry supervisors performed the work, which commenced on November 17, 2017, and ended on March 27, 2018. Data processing was accomplished using Census and Survey Processing System (CSPro) software, jointly developed by the United States Census Bureau, ICF, and Serpro S.A.
Among the 20,160 households selected, 19,584 were occupied. Interviews were successfully completed in 19,457 (99%) of the occupied households. Among the 20,376 ever-married women age 15-49 eligible for interviews, 20,127 were interviewed, yielding a response rate of 99%. The principal reason for non-response among women was their absence from home despite repeated visits. Response rates did not vary notably by urban-rural residence.
The estimates from a sample survey are affected by two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2017-18 Bangladesh Demographic and Health Survey (BDHS) to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2017-18 BDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2017-18 BDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulas. Sampling errors are computed in SAS, using programs developed by ICF. These programs use the Taylor linearization method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
Note: A more detailed description of estimates of sampling errors is presented in APPENDIX B of the survey report.
Data
The RECOVER Consortium developed a web-based interactive educational website and application to effectively disseminate oil spill science and research to students – ranging from elementary to collegiate levels – and the general public. The RECOVER Virtual Lab application allows users to conduct virtual experiments on the impacts of oil on fish physiology, similar to those conducted by RECOVER researchers. By using the Virtual Lab, students, teachers, and the general public are able to understand the real-world applications of data, experimental designs, and results generated by RECOVER researchers. Both Virtual Lab lessons utilize data produced by GoMRI scientists, which are made available to students and the public to expand the reach of oil spill science to individuals who may not otherwise have access to it. At the end of each lesson, students complete a demographic questionnaire and answer content-based questions through quizzes developed in Google Forms. These data show that the Virtual Lab has been used in 30 different states and 2 countries outside the U.S., with a total usership of over 1,000 students.
This is the 21st annual survey in this series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Two general types of tasks may be distinguished. The first is to provide a systematic and accurate description of the youth population of interest in a given year, and to quantify the direction and rate of change occurring over time. The second task, more analytic than descriptive, involves the explanation of the relationships and trends observed. Each year, a large, nationally representative sample of high school seniors in the United States is asked to respond to approximately 100 drug-use and demographic questions as well as to an average of 200 additional questions on a variety of subjects, including attitudes toward government, social institutions, race relations, changing roles for women, educational aspirations, occupational aims, and marital and family plans. The students are randomly assigned one of six questionnaires, each with a different subset of topical questions but all containing a set of "core" questions on demographics and drug use. There are about 1,400 variables across the questionnaires.
The 2022 Kenya Demographic and Health Survey (2022 KDHS) was implemented by the Kenya National Bureau of Statistics (KNBS) in collaboration with the Ministry of Health (MoH) and other stakeholders. The survey is the 7th KDHS implemented in the country.
The primary objective of the 2022 KDHS is to provide up-to-date estimates of basic sociodemographic, nutrition and health indicators. Specifically, the 2022 KDHS collected information on: • Fertility levels and contraceptive prevalence • Childhood mortality • Maternal and child health • Early Childhood Development Index (ECDI) • Anthropometric measures for children, women, and men • Children’s nutrition • Woman’s dietary diversity • Knowledge and behaviour related to the transmission of HIV and other sexually transmitted diseases • Noncommunicable diseases and other health issues • Extent and pattern of gender-based violence • Female genital mutilation.
The information collected in the 2022 KDHS will assist policymakers and programme managers in monitoring, evaluating, and designing programmes and strategies for improving the health of Kenya’s population. The 2022 KDHS also provides indicators relevant to monitoring the Sustainable Development Goals (SDGs) for Kenya, as well as indicators relevant for monitoring national and subnational development agendas such as the Kenya Vision 2030, Medium Term Plans (MTPs), and County Integrated Development Plans (CIDPs).
National coverage
The survey covered all de jure household members (usual residents), all women aged 15-49, all men aged 15-54, and all children aged 0-4 resident in the household.
Sample survey data [ssd]
The sample for the 2022 KDHS was drawn from the Kenya Household Master Sample Frame (K-HMSF). This is the frame that KNBS currently uses to conduct household-based sample surveys in Kenya. The frame is based on the 2019 Kenya Population and Housing Census (KPHC) data, in which a total of 129,067 enumeration areas (EAs) were developed. Of these EAs, 10,000 were selected with probability proportional to size to create the K-HMSF. The 10,000 EAs were randomised into four equal subsamples. A survey can utilise a subsample or a combination of subsamples based on the sample size requirements. The 2022 KDHS sample was drawn from subsample one of the K-HMSF. The EAs were developed into clusters through a process of household listing and geo-referencing. The Constitution of Kenya 2010 established a devolved system of government in which Kenya is divided into 47 counties. To design the frame, each of the 47 counties in Kenya was stratified into rural and urban strata, which resulted in 92 strata since Nairobi City and Mombasa counties are purely urban.
The 2022 KDHS was designed to provide estimates at the national level, for rural and urban areas separately, and, for some indicators, at the county level. The sample size was computed at 42,300 households, with 25 households selected per cluster, which resulted in 1,692 clusters spread across the country, 1,026 clusters in rural areas, and 666 in urban areas. The sample was allocated to the different sampling strata using power allocation to enable comparability of county estimates.
The 2022 KDHS employed a two-stage stratified sample design where in the first stage, 1,692 clusters were selected from the K-HMSF using the Equal Probability Selection Method (EPSEM). The clusters were selected independently in each sampling stratum. Household listing was carried out in all the selected clusters, and the resulting list of households served as a sampling frame for the second stage of selection, where 25 households were selected from each cluster. However, after the household listing procedure, it was found that some clusters had fewer than 25 households; therefore, all households from these clusters were selected into the sample. This resulted in 42,022 households being sampled for the 2022 KDHS. Interviews were conducted only in the pre-selected households and clusters; no replacement of the preselected units was allowed during the survey data collection stages.
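The second-stage rule described above — 25 households per cluster, or all households where the listing found fewer than 25 — can be sketched as follows (a toy illustration with invented cluster sizes, not KNBS's actual selection program):

```python
import random

def second_stage(listed, target=25, seed=2022):
    """Equal-probability systematic selection of `target` households per
    cluster; clusters with `target` or fewer listed households are taken
    in full, as in the design described above."""
    rng = random.Random(seed)
    if len(listed) <= target:
        return list(listed)          # take-all cluster
    step = len(listed) / target
    start = rng.uniform(0, step)
    return [listed[int(start + k * step)] for k in range(target)]

big = second_stage(list(range(180)))    # ordinary cluster: 25 selected
small = second_stage(list(range(18)))   # under-sized cluster: all 18 kept
```

The take-all branch is why the realized sample was 42,022 households rather than the 42,300 computed from 1,692 clusters × 25.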
For further details on sample design, see APPENDIX A of the survey report.
Computer Assisted Personal Interview [capi]
Four questionnaires were used in the 2022 KDHS: Household Questionnaire, Woman’s Questionnaire, Man’s Questionnaire, and the Biomarker Questionnaire. The questionnaires, based on The DHS Program’s model questionnaires, were adapted to reflect the population and health issues relevant to Kenya. In addition, a self-administered Fieldworker Questionnaire was used to collect information about the survey’s fieldworkers.
CAPI was used during data collection. The devices used for CAPI were Android-based computer tablets programmed with a mobile version of CSPro. The CSPro software was developed jointly by the U.S. Census Bureau, Serpro S.A., and The DHS Program. Programming of questionnaires into the Android application was done by ICF, while configuration of tablets was completed by KNBS in collaboration with ICF. All fieldwork personnel were assigned usernames, and devices were password protected to ensure the integrity of the data.
Work was assigned by supervisors and shared via Bluetooth® to interviewers’ tablets. After completion, assigned work was shared with supervisors, who conducted initial data consistency checks and edits and then submitted data to the central servers hosted at KNBS via SyncCloud. Data were downloaded from the central servers and checked against the inventory of expected returns to account for all data collected in the field. SyncCloud was also used to generate field check tables to monitor progress and identify any errors, which were communicated back to the field teams for correction.
Secondary editing was done by members of the KNBS and ICF central office team, who resolved any errors that were not corrected by field teams during data collection. A CSPro batch editing tool was used for cleaning and tabulation during data analysis.
A total of 42,022 households were selected for the survey, of which 38,731 (92%) were found to be occupied. Among the occupied households, 37,911 were successfully interviewed, yielding a response rate of 98%. The response rates for urban and rural households were 96% and 99%, respectively. In the interviewed households, 33,879 women age 15-49 were identified as eligible for individual interviews. Of these, 32,156 women were interviewed, yielding a response rate of 95%. The response rates among women selected for the full and short questionnaires were similar (95%). In the households selected for the men’s survey, 16,552 men age 15-54 were identified as eligible for individual interviews and 14,453 were successfully interviewed, yielding a response rate of 87%.
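The quoted response rates follow directly from the raw counts in this paragraph; a quick check, rounding to the nearest whole percent as in the survey report:

```python
# Counts taken from the paragraph above.
selected, occupied, interviewed = 42_022, 38_731, 37_911
eligible_women, interviewed_women = 33_879, 32_156
eligible_men, interviewed_men = 16_552, 14_453

occupancy = occupied / selected                # share of selected households occupied
household_rr = interviewed / occupied          # response rate among occupied households
women_rr = interviewed_women / eligible_women
men_rr = interviewed_men / eligible_men

print(f"occupied: {occupancy:.0%}, household RR: {household_rr:.0%}, "
      f"women RR: {women_rr:.0%}, men RR: {men_rr:.0%}")
# prints "occupied: 92%, household RR: 98%, women RR: 95%, men RR: 87%"
```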
The estimates from a sample survey are affected by two types of errors: (1) non-sampling errors, and (2) sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2022 Kenya Demographic and Health Survey (2022 KDHS) to minimise this type of error, non-sampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2022 KDHS is only one of many samples that could have been selected from the same population, using the same design and identical size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
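The two-standard-error rule above can be applied mechanically. A minimal sketch, with a hypothetical estimate and standard error (the numbers are invented, not KDHS values):

```python
# Approximate 95% confidence interval: estimate plus or minus
# twice its standard error, as described in the text above.
def confidence_interval(estimate, standard_error, multiplier=2.0):
    return (estimate - multiplier * standard_error,
            estimate + multiplier * standard_error)

p_hat = 0.42   # hypothetical estimated proportion
se = 0.012     # hypothetical standard error from the survey design
low, high = confidence_interval(p_hat, se)
print(f"95% CI: ({low:.3f}, {high:.3f})")  # prints "95% CI: (0.396, 0.444)"
```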
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2022 KDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulae. The computer software used to calculate sampling errors for the 2022 KDHS is a SAS program. This program used the Taylor linearisation method for variance estimation for survey estimates that are means, proportions or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
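The Jackknife repeated replication idea named above can be sketched briefly: each replicate drops one sampled cluster, recomputes the statistic, and the spread of the replicates estimates the variance. This toy version drops one cluster at a time from a flat list and uses one common delete-one-group variance formula; the real DHS programs additionally account for strata and sampling weights, and the data and `jackknife_variance` helper here are invented for illustration.

```python
import math

def jackknife_variance(clusters, statistic):
    """clusters: list of per-cluster observation lists;
    statistic: a function of a flat list of observations."""
    full = statistic([x for c in clusters for x in c])
    k = len(clusters)
    # One replicate per cluster: recompute the statistic with that cluster dropped.
    reps = [statistic([x for j, c in enumerate(clusters) if j != i for x in c])
            for i in range(k)]
    rbar = sum(reps) / k
    # A common delete-one-group jackknife variance formula.
    var = (k - 1) / k * sum((r - rbar) ** 2 for r in reps)
    return full, var, math.sqrt(var)

# Invented cluster-level data (e.g., a 0/1 indicator) and a simple mean statistic.
data = [[1, 0, 1, 1], [0, 0, 1], [1, 1, 1, 0, 1], [0, 1]]
mean = lambda xs: sum(xs) / len(xs)
est, var, se = jackknife_variance(data, mean)
```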
A more detailed description of the estimates of sampling errors is presented in APPENDIX B of the survey report.
Data
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Although South Africa is the global epicenter of the HIV epidemic, the uptake of HIV testing and treatment among young people remains low. Concerns about confidentiality impede the utilization of HIV prevention services, which signals the need for discreet HIV prevention measures that leverage youth-friendly platforms. This paper describes the process of developing a youth-friendly internet-enabled HIV risk calculator in collaboration with young people, including young key populations, aged between 18 and 24 years. Using qualitative research, we conducted an exploratory study with 40 young people, including young key populations: lesbian, gay, bisexual, and transgender (LGBT) individuals; men who have sex with men (MSM); and female sex workers. Eligible participants were young people aged 18–24 years and living in Soweto. Data were collected through two peer group discussions with young people aged 18–24 years, a once-off group discussion with the [Name of clinic removed for confidentiality] adolescent community advisory board members, and once-off face-to-face in-depth interviews with the young key population groups: LGBT individuals, MSM, and female sex workers. LGBT individuals are identified as key populations because they face increased vulnerability to HIV/AIDS and other health risks due to societal stigma, discrimination, and obstacles in accessing healthcare and support services. The measures used to collect data included a socio-demographic questionnaire, a questionnaire on mobile phone usage, an HIV and STI risk assessment questionnaire, and a semi-structured interview guide. Framework analysis was used to analyse the qualitative data in the NVivo qualitative data analysis software. Participant socio-demographics and mobile phone usage were summarized with descriptive statistics in SPSS. Of the 40 enrolled participants, 58% were male, the median age was 20 years (interquartile range 19–22.75), and 86% had access to the internet.
Participants’ recommendations were considered in developing the HIV risk calculator. They indicated a preference for an easy-to-use, interactive, real-time assessment offering discreet and private means to self-assess HIV risk. In addition to providing feedback on the language and wording of the risk assessment tool, participants recommended creating a colorful, interactive, and informational app. A collaborative and user-driven process is crucial for designing and developing HIV prevention tools for targeted groups. Participants emphasized that privacy, confidentiality, and ease of use contribute to the acceptability and willingness to use internet-enabled HIV prevention methods.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of College Springs by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of College Springs across both sexes and to determine which sex constitutes the majority.
Key observations
The population is majority male, with 56.68% of the total population being male. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture biological sex, not gender. Respondents are asked to select either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is performed on the data reported by the Census Bureau.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
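For context on working with such margins of error: the ACS publishes margins of error at the 90% confidence level (z = 1.645), so a published MOE can be converted to a standard error or restated at another confidence level. A small sketch with a hypothetical published value (the count and helper names are invented for illustration):

```python
# ACS margins of error are published at the 90% confidence level.
Z90, Z95 = 1.645, 1.960

def moe_to_se(moe_90):
    """Convert a published 90% MOE to a standard error."""
    return moe_90 / Z90

def moe_at_95(moe_90):
    """Restate a 90% MOE at the 95% confidence level."""
    return moe_to_se(moe_90) * Z95

moe = 120            # hypothetical published 90% MOE for a count estimate
se = moe_to_se(moe)  # approximately 72.9
```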
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for College Springs Population by Race & Ethnicity.
The 2022 Philippines National Demographic and Health Survey (NDHS) was implemented by the Philippine Statistics Authority (PSA). Data collection took place from May 2 to June 22, 2022.
The primary objective of the 2022 NDHS is to provide up-to-date estimates of basic demographic and health indicators. Specifically, the NDHS collected information on fertility, fertility preferences, family planning practices, childhood mortality, maternal and child health, nutrition, knowledge and attitudes regarding HIV/AIDS, violence against women, child discipline, early childhood development, and other health issues.
The information collected through the NDHS is intended to assist policymakers and program managers in designing and evaluating programs and strategies for improving the health of the country’s population. The 2022 NDHS also provides indicators anchored to the attainment of the Sustainable Development Goals (SDGs) and the new Philippine Development Plan for 2023 to 2028.
National coverage
The survey covered all de jure household members (usual residents), all women aged 15-49, and all children aged 0-4 resident in the household.
Sample survey data [ssd]
The sampling scheme provides data representative of the country as a whole, for urban and rural areas separately, and for each of the country’s administrative regions. The sample selection methodology for the 2022 NDHS was based on a two-stage stratified sample design using the Master Sample Frame (MSF) designed and compiled by the PSA. The MSF was constructed based on the listing of households from the 2010 Census of Population and Housing and updated based on the listing of households from the 2015 Census of Population. The first stage involved a systematic selection of 1,247 primary sampling units (PSUs) distributed by province or highly urbanized city (HUC). A PSU can be a barangay, a portion of a large barangay, or two or more adjacent small barangays.
In the second stage, an equal take of either 22 or 29 sample housing units was selected from each sampled PSU using systematic random sampling. In situations where a housing unit contained one to three households, all households were interviewed. In the rare situation where a housing unit contained more than three households, no more than three households were interviewed. The survey interviewers were instructed to interview only the preselected housing units. No replacements or changes of the preselected housing units were allowed during implementation in order to prevent bias. Survey weights were calculated, added to the data file, and applied so that weighted results are representative estimates of indicators at the regional and national levels.
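Systematic random sampling, as used above for the within-PSU take, can be sketched as follows; the PSU listing and the `systematic_sample` helper are hypothetical, not the actual NDHS selection program.

```python
import random

def systematic_sample(units, take, seed=None):
    """Systematic random sampling sketch: a random start is chosen within
    the sampling interval N/take, then every interval-th unit is taken."""
    n = len(units)
    interval = n / take
    start = random.Random(seed).uniform(0, interval)
    # Floor each position; successive positions differ by the interval,
    # so the selected indices are distinct whenever interval >= 1.
    return [units[int(start + i * interval) % n] for i in range(take)]

# Hypothetical PSU with 150 listed housing units; take 22 of them.
psu = [f"HU-{i:03d}" for i in range(150)]
sampled = systematic_sample(psu, take=22, seed=7)
```

Because the selections are spread at a fixed interval through the ordered listing, the sample is implicitly spread across the whole PSU rather than clustered at one end.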
All women age 15–49 who were either usual residents of the selected households or visitors who stayed in the households the night before the survey were eligible to be interviewed. Among women eligible for an individual interview, one woman per household was selected for a module on women’s safety.
For further details on sample design, see APPENDIX A of the final report.
Computer Assisted Personal Interview [capi]
Two questionnaires were used for the 2022 NDHS: the Household Questionnaire and the Woman’s Questionnaire. The questionnaires, based on The DHS Program’s model questionnaires, were adapted to reflect the population and health issues relevant to the Philippines. Input was solicited from various stakeholders representing government agencies, academe, and international agencies. The survey protocol was reviewed by the ICF Institutional Review Board.
After all questionnaires were finalized in English, they were translated into six major languages: Tagalog, Cebuano, Ilocano, Bikol, Hiligaynon, and Waray. The Household and Woman’s Questionnaires were programmed into tablet computers to allow for computer-assisted personal interviewing (CAPI) for data collection purposes, with the capability to choose any of the languages for each questionnaire.
Processing the 2022 NDHS data began almost as soon as fieldwork started, and data security procedures were in place in accordance with confidentiality of information as provided by Philippine laws. As data collection was completed in each PSU or cluster, all electronic data files were transferred securely via SyncCloud to a server maintained by the PSA Central Office in Quezon City. These data files were registered and checked for inconsistencies, incompleteness, and outliers. The field teams were alerted to any inconsistencies and errors while still in the area of assignment. Timely generation of field check tables allowed for effective monitoring of fieldwork, including tracking questionnaire completion rates. Only the field teams, project managers, and NDHS supervisors in the provincial, regional, and central offices were given access to the CAPI system and the SyncCloud server.
A team of secondary editors in the PSA Central Office carried out secondary editing, which involved resolving inconsistencies and recoding “other” responses; the former was conducted during data collection, and the latter was conducted following the completion of the fieldwork. Data editing was performed using the CSPro software package. The secondary editing of the data was completed in August 2022. The final cleaning of the data set was carried out by data processing specialists from The DHS Program in September 2022.
A total of 35,470 households were selected for the 2022 NDHS sample, of which 30,621 were found to be occupied. Of the occupied households, 30,372 were successfully interviewed, yielding a response rate of 99%. In the interviewed households, 28,379 women age 15–49 were identified as eligible for individual interviews. Interviews were completed with 27,821 women, yielding a response rate of 98%.
The estimates from a sample survey are affected by two types of errors: (1) nonsampling errors and (2) sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and in data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2022 Philippines National Demographic and Health Survey (2022 NDHS) to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2022 NDHS is only one of many samples that could have been selected from the same population, using the same design and identical size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2022 NDHS sample was the result of a multistage stratified design, and, consequently, it was necessary to use more complex formulas. Sampling errors are computed in SAS using programs developed by ICF. These programs use the Taylor linearization method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
A more detailed description of the estimates of sampling errors is presented in APPENDIX B of the survey report.
Data Quality Tables
See details of the data quality tables in Appendix C of the final report.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset was used to address the research question 'What is the role of perfectionism in trichotillomania?'. It includes responses to demographic and standardised questionnaires, presented in both text and numeric formats, from 31 participants with trichotillomania; and anonymised transcripts of semi-structured interviews with 20 of the same participants. A copy of the semi-structured interview schedule and demographic questionnaire created for data collection are included.
https://spdx.org/licenses/CC0-1.0.html
Professional organizations in STEM (science, technology, engineering, and mathematics) can use demographic data to quantify recruitment and retention (R&R) of underrepresented groups within their memberships. However, variation in the types of demographic data collected can influence the targeting and perceived impacts of R&R efforts, e.g., by giving false signals of R&R for some groups. We obtained demographic surveys from 73 U.S.-affiliated STEM organizations, collectively representing 712,000 members and conference attendees. We found large differences in the demographic categories surveyed (e.g., disability status, sexual orientation) and the available response options. These discrepancies indicate a lack of consensus regarding the demographic groups that should be recognized and, for groups that are omitted from surveys, an inability of organizations to prioritize and evaluate R&R initiatives. Aligning inclusive demographic surveys across organizations will provide baseline data that can be used to target and evaluate R&R initiatives to better serve underrepresented groups throughout STEM.

Methods

We surveyed 164 STEM organizations (73 responses; response rate = 44.5%) between December 2020 and July 2021 with the goal of understanding what demographic data each organization collects from its constituents (i.e., members and conference attendees) and how the data are used. Organizations were sourced from a list of professional societies affiliated with the American Association for the Advancement of Science (AAAS) (n = 156) or from social media (n = 8). The survey was sent to the elected leadership and management firms of each organization, and follow-up reminders were sent after one month.
The responding organizations represented a wide range of fields: 31 life science organizations (157,000 constituents), 5 mathematics organizations (93,000 constituents), 16 physical science organizations (207,000 constituents), 7 technology organizations (124,000 constituents), and 14 multi-disciplinary organizations spanning multiple branches of STEM (131,000 constituents). A list of the responding organizations is available in the Supplementary Materials. Based on the AAAS-affiliated recruitment of the organizations and the similar distribution of constituencies across STEM fields, we conclude that the responding organizations are a representative cross-section of the most prominent STEM organizations in the U.S. Each organization was asked about the demographic information it collects from its constituents, the response rates to its surveys, and how the data were used.

Survey description

The following questions are written as presented to the participating organizations.

Question 1: What is the name of your STEM organization?
Question 2: Does your organization collect demographic data from your membership and/or meeting attendees?
Question 3: When was your organization’s most recent demographic survey (approximate year)?
Question 4: We would like to know the categories of demographic information collected by your organization. You may answer this question either by uploading a blank copy of your organization’s survey (link provided in the online version of this survey) OR by completing a short series of questions.
Question 5: On the most recent demographic survey or questionnaire, what categories of information were collected? (Please select all that apply)
Disability status
Gender identity (e.g., male, female, non-binary)
Marital/Family status
Racial and ethnic group
Religion
Sex
Sexual orientation
Veteran status
Other (please provide)
Question 6: For each of the categories selected in Question 5, what options were provided for survey participants to select?
Question 7: Did the most recent demographic survey provide a statement about data privacy and confidentiality? If yes, please provide the statement.
Question 8: Did the most recent demographic survey provide a statement about intended data use? If yes, please provide the statement.
Question 9: Who maintains the demographic data collected by your organization? (e.g., contracted third party, organization executives)
Question 10: How has your organization used members’ demographic data in the last five years? Examples: monitoring temporal changes in demographic diversity, publishing diversity data products, planning conferences, contributing to third-party researchers.
Question 11: What is the size of your organization (number of members or number of attendees at recent meetings)?
Question 12: What was the response rate (%) for your organization’s most recent demographic survey?

*Organizations were also able to upload a copy of their demographics survey instead of responding to Questions 5-8. If so, the uploaded survey was used (by the study authors) to evaluate Questions 5-8.