Facebook
Twitterhttps://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms
Question Paper Solutions of chapter Sample Space of Numerical and statistical Methods, 5th Semester , Bachelor of Computer Application 2020-2021
Facebook
TwitterThe JPFHS is part of the worldwide Demographic and Health Surveys Program, which is designed to collect data on fertility, family planning, and maternal and child health. The primary objective of the Jordan Population and Family Health Survey (JPFHS) is to provide reliable estimates of demographic parameters, such as fertility, mortality, family planning, fertility preferences, as well as maternal and child health and nutrition that can be used by program managers and policy makers to evaluate and improve existing programs. In addition, the JPFHS data will be useful to researchers and scholars interested in analyzing demographic trends in Jordan, as well as those conducting comparative, regional or crossnational studies.
The content of the 2002 JPFHS was significantly expanded from the 1997 survey to include additional questions on women’s status, reproductive health, and family planning. In addition, all women age 15-49 and children less than five years of age were tested for anemia.
National
Sample survey data
The estimates from a sample survey are affected by two types of errors: 1) nonsampling errors and 2) sampling errors. Nonsampling errors are the result of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2002 JPFHS to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2002 JPFHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2002 JPFHS sample is the result of a multistage stratified design and, consequently, it was necessary to use more complex formulas. The computer software used to calculate sampling errors for the 2002 JPFHS is the ISSA Sampling Error Module (ISSAS). This module used the Taylor linearization method of variance estimation for survey estimates that are means or proportions. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
Note: See detailed description of sample design in APPENDIX B of the survey report.
Face-to-face
The 2002 JPFHS used two questionnaires – namely, the Household Questionnaire and the Individual Questionnaire. Both questionnaires were developed in English and translated into Arabic. The Household Questionnaire was used to list all usual members of the sampled households and to obtain information on each member’s age, sex, educational attainment, relationship to the head of household, and marital status. In addition, questions were included on the socioeconomic characteristics of the household, such as source of water, sanitation facilities, and the availability of durable goods. The Household Questionnaire was also used to identify women who are eligible for the individual interview: ever-married women age 15-49. In addition, all women age 15-49 and children under five years living in the household were measured to determine nutritional status and tested for anemia.
The household and women’s questionnaires were based on the DHS Model “A” Questionnaire, which is designed for use in countries with high contraceptive prevalence. Additions and modifications to the model questionnaire were made in order to provide detailed information specific to Jordan, using experience gained from the 1990 and 1997 Jordan Population and Family Health Surveys. For each evermarried woman age 15 to 49, information on the following topics was collected:
In addition, information on births and pregnancies, contraceptive use and discontinuation, and marriage during the five years prior to the survey was collected using a monthly calendar.
Fieldwork and data processing activities overlapped. After a week of data collection, and after field editing of questionnaires for completeness and consistency, the questionnaires for each cluster were packaged together and sent to the central office in Amman where they were registered and stored. Special teams were formed to carry out office editing and coding of the open-ended questions.
Data entry and verification started after one week of office data processing. The process of data entry, including one hundred percent re-entry, editing and cleaning, was done by using PCs and the CSPro (Census and Survey Processing) computer package, developed specially for such surveys. The CSPro program allows data to be edited while being entered. Data processing operations were completed by the end of October 2002. A data processing specialist from ORC Macro made a trip to Jordan in October and November 2002 to follow up data editing and cleaning and to work on the tabulation of results for the survey preliminary report. The tabulations for the present final report were completed in December 2002.
A total of 7,968 households were selected for the survey from the sampling frame; among those selected households, 7,907 households were found. Of those households, 7,825 (99 percent) were successfully interviewed. In those households, 6,151 eligible women were identified, and complete interviews were obtained with 6,006 of them (98 percent of all eligible women). The overall response rate was 97 percent.
Note: See summarized response rates by place of residence in Table 1.1 of the survey report.
The estimates from a sample survey are affected by two types of errors: 1) nonsampling errors and 2) sampling errors. Nonsampling errors are the result of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2002 JPFHS to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2002 JPFHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2002 JPFHS sample is the result of a multistage stratified design and, consequently, it was necessary to use more complex formulas. The computer software used to calculate sampling errors for the 2002 JPFHS is the ISSA Sampling Error Module (ISSAS). This module used the Taylor linearization method of variance estimation for survey estimates that are means or proportions. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
Note: See detailed
Facebook
Twitteranalyze the current population survey (cps) annual social and economic supplement (asec) with r the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics ( bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups b y state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be t reated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show. this new github repository contains three scripts: 2005-2012 asec - download all microdata.R down load the fixed-width file containing household, family, and person records import by separating this file into three tables, then merge 'em together at the person-level download the fixed-width file containing the person-level replicate weights merge the rectangular person-level file with the replicate weights, then store it in a sql database create a new variable - one - in the data table 2012 asec - analysis examples.R connect to the sql database created by the 'download all microdata' progr am create the complex sample survey object, using the replicate weights perform a boatload of analysis examples replicate census estimates - 2011.R connect to the sql database created by the 'download all microdata' program create the complex sample survey object, using the replicate weights match the sas output shown in the png file below 2011 asec replicate weight sas output.png statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document. click here to view these three scripts for more detail about the current population survey - annual social and economic supplement (cps-asec), visit: the census bureau's current population survey page the bureau of labor statistics' current population survey page the current population survey's wikipedia article notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current populat ion survey to talk about america, subract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research. confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
Facebook
TwitterThe harmonized data set on health, created and published by the ERF, is a subset of Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules, collected in the context of the above mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.
----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:
Iraq is considered a leader in household expenditure and income surveys where the first was conducted in 1946 followed by surveys in 1954 and 1961. After the establishment of Central Statistical Organization, household expenditure and income surveys were carried out every 3-5 years in (1971/ 1972, 1976, 1979, 1984/ 1985, 1988, 1993, 2002 / 2007). Implementing the cooperation between CSO and WB, Central Statistical Organization (CSO) and Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year covering all governorates including those in Kurdistan Region.
The survey has six main objectives. These objectives are:
The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum, to create a comparable version with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf provided in the external resources for further information on the mapping of the original variables on the harmonized ones, in addition to more indications on the variables' availability in both survey years and relevant comments.
National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.
1- Household/family. 2- Individual/person.
The survey was carried out over a full year covering all governorates including those in Kurdistan Region.
Sample survey data [ssd]
----> Design:
Sample size was (25488) household for the whole Iraq, 216 households for each district of 118 districts, 2832 clusters each of which includes 9 households distributed on districts and governorates for rural and urban.
----> Sample frame:
Listing and numbering results of 2009-2010 Population and Housing Survey were adopted in all the governorates including Kurdistan Region as a frame to select households, the sample was selected in two stages: Stage 1: Primary sampling unit (blocks) within each stratum (district) for urban and rural were systematically selected with probability proportional to size to reach 2832 units (cluster). Stage two: 9 households from each primary sampling unit were selected to create a cluster, thus the sample size of total survey clusters was 25488 households distributed on the governorates, 216 households in each district.
----> Sampling Stages:
In each district, the sample was selected in two stages: Stage 1: based on 2010 listing and numbering frame 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to the implicit breakdown urban and rural and geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: Using households as secondary sampling units, 9 households were selected from each sample point using systematic equal probability sampling. Sampling frames of each stages can be developed based on 2010 building listing and numbering without updating household lists. In some small districts, random selection processes of primary sampling may lead to select less than 24 units therefore a sampling unit is selected more than once , the selection may reach two cluster or more from the same enumeration unit when it is necessary.
Face-to-face [f2f]
----> Preparation:
The questionnaire of 2006 survey was adopted in designing the questionnaire of 2012 survey on which many revisions were made. Two rounds of pre-test were carried out. Revision were made based on the feedback of field work team, World Bank consultants and others, other revisions were made before final version was implemented in a pilot survey in September 2011. After the pilot survey implemented, other revisions were made in based on the challenges and feedbacks emerged during the implementation to implement the final version in the actual survey.
----> Questionnaire Parts:
The questionnaire consists of four parts each with several sections: Part 1: Socio – Economic Data: - Section 1: Household Roster - Section 2: Emigration - Section 3: Food Rations - Section 4: housing - Section 5: education - Section 6: health - Section 7: Physical measurements - Section 8: job seeking and previous job
Part 2: Monthly, Quarterly and Annual Expenditures: - Section 9: Expenditures on Non – Food Commodities and Services (past 30 days). - Section 10 : Expenditures on Non – Food Commodities and Services (past 90 days). - Section 11: Expenditures on Non – Food Commodities and Services (past 12 months). - Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days). - Section 12, Table 1: Meals Had Within the Residential Unit. - Section 12, table 2: Number of Persons Participate in the Meals within Household Expenditure Other Than its Members.
Part 3: Income and Other Data: - Section 13: Job - Section 14: paid jobs - Section 15: Agriculture, forestry and fishing - Section 16: Household non – agricultural projects - Section 17: Income from ownership and transfers - Section 18: Durable goods - Section 19: Loans, advances and subsidies - Section 20: Shocks and strategy of dealing in the households - Section 21: Time use - Section 22: Justice - Section 23: Satisfaction in life - Section 24: Food consumption during past 7 days
Part 4: Diary of Daily Expenditures: Diary of expenditure is an essential component of this survey. It is left at the household to record all the daily purchases such as expenditures on food and frequent non-food items such as gasoline, newspapers…etc. during 7 days. Two pages were allocated for recording the expenditures of each day, thus the roster will be consists of 14 pages.
----> Raw Data:
Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages: 1. Interviewer: Checks all answers on the household questionnaire, confirming that they are clear and correct. 2. Local Supervisor: Checks to make sure that questions has been correctly completed. 3. Statistical analysis: After exporting data files from excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or non-logical values in addition to auditing some variables. 4. World Bank consultants in coordination with the CSO data management team: the World Bank technical consultants use additional programs in SPSS and STAT to examine and correct remaining inconsistencies within the data files. The software detects errors by analyzing questionnaire items according to the expected parameter for each variable.
----> Harmonized Data:
Iraq Household Socio Economic Survey (IHSES) reached a total of 25488 households. Number of households refused to response was 305, response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%) while the lowest rates were in Sulaimaniya (92%).
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Repeated measures correlation (rmcorr) is a statistical technique for determining the common within-individual association for paired measures assessed on two or more occasions for multiple individuals. Simple regression/correlation is often applied to non-independent observations or aggregated data; this may produce biased, specious results due to violation of independence and/or differing patterns between-participants versus within-participants. Unlike simple regression/correlation, rmcorr does not violate the assumption of independence of observations. Also, rmcorr tends to have much greater statistical power because neither averaging nor aggregation is necessary for an intra-individual research question. Rmcorr estimates the common regression slope, the association shared among individuals. To make rmcorr accessible, we provide background information for its assumptions and equations, visualization, power, and tradeoffs with rmcorr compared to multilevel modeling. We introduce the R package (rmcorr) and demonstrate its use for inferential statistics and visualization with two example datasets. The examples are used to illustrate research questions at different levels of analysis, intra-individual, and inter-individual. Rmcorr is well-suited for research questions regarding the common linear association in paired repeated measures data. All results are fully reproducible.
Facebook
TwitterThe Research and Development Survey (RANDS) is a platform designed for conducting survey question evaluation and statistical research. RANDS is an ongoing series of surveys from probability-sampled commercial survey panels used for methodological research at the National Center for Health Statistics (NCHS). RANDS estimates are generated using an experimental approach that differs from the survey design approaches generally used by NCHS, including possible biases from different response patterns and sampling frames as well as increased variability from lower sample sizes. Use of the RANDS platform allows NCHS to produce more timely data than would be possible using traditional data collection methods. RANDS is not designed to replace NCHS’ higher quality, core data collections. Below are experimental estimates of reduced access to healthcare for three rounds of RANDS during COVID-19. Data collection for the three rounds of RANDS during COVID-19 occurred between June 9, 2020 and July 6, 2020, August 3, 2020 and August 20, 2020, and May 17, 2021 and June 30, 2021. Information needed to interpret these estimates can be found in the Technical Notes. RANDS during COVID-19 included questions about unmet care in the last 2 months during the coronavirus pandemic. Unmet needs for health care are often the result of cost-related barriers. The National Health Interview Survey, conducted by NCHS, is the source for high-quality data to monitor cost-related health care access problems in the United States. For example, in 2018, 7.3% of persons of all ages reported delaying medical care due to cost and 4.8% reported needing medical care but not getting it due to cost in the past year. However, cost is not the only reason someone might delay or not receive needed medical care. As a result of the coronavirus pandemic, people also may not get needed medical care due to cancelled appointments, cutbacks in transportation options, fear of going to the emergency room, or an altruistic desire to not be a burden on the health care system, among other reasons. The Household Pulse Survey (https://www.cdc.gov/nchs/covid19/pulse/reduced-access-to-care.htm), an online survey conducted in response to the COVID-19 pandemic by the Census Bureau in partnership with other federal agencies including NCHS, also reports estimates of reduced access to care during the pandemic (beginning in Phase 1, which started on April 23, 2020). The Household Pulse Survey reports the percentage of adults who delayed medical care in the last 4 weeks or who needed medical care at any time in the last 4 weeks for something other than coronavirus but did not get it because of the pandemic. The experimental estimates on this page are derived from RANDS during COVID-19 and show the percentage of U.S. adults who were unable to receive medical care (including urgent care, surgery, screening tests, ongoing treatment, regular checkups, prescriptions, dental care, vision care, and hearing care) in the last 2 months. Technical Notes: https://www.cdc.gov/nchs/covid19/rands/reduced-access-to-care.htm#limitations
Facebook
TwitterCC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
analyze the survey of consumer finances (scf) with r the survey of consumer finances (scf) tracks the wealth of american families. every three years, more than five thousand households answer a battery of questions about income, net worth, credit card debt, pensions, mortgages, even the lease on their cars. plenty of surveys collect annual income, only the survey of consumer finances captures such detailed asset data. responses are at the primary economic unit-level (peu) - the economically dominant, financially interdependent family members within a sampled household. norc at the university of chicago administers the data collection, but the board of governors of the federal reserve pay the bills and therefore call the shots. if you were so brazen as to open up the microdata and run a simple weighted median, you'd get the wrong answer. the five to six thousand respondents actually gobble up twenty-five to thirty thousand records in the final pub lic use files. why oh why? well, those tables contain not one, not two, but five records for each peu. wherever missing, these data are multiply-imputed, meaning answers to the same question for the same household might vary across implicates. each analysis must account for all that, lest your confidence intervals be too tight. to calculate the correct statistics, you'll need to break the single file into five, necessarily complicating your life. this can be accomplished with the meanit sas macro buried in the 2004 scf codebook (search for meanit - you'll need the sas iml add-on). or you might blow the dust off this website referred to in the 2010 codebook as the home of an alternative multiple imputation technique, but all i found were broken links. perhaps it's time for plan c, and by c, i mean free. read the imputation section of the latest codebook (search for imputation), then give these scripts a whirl. they've got that new r smell. the lion's share of the respondents in the survey of consumer finances get drawn from a pretty standard sample of american dwellings - no nursing homes, no active-duty military. then there's this secondary sample of richer households to even out the statistical noise at the higher end of the i ncome and assets spectrum. you can read more if you like, but at the end of the day the weights just generalize to civilian, non-institutional american households. one last thing before you start your engine: read everything you always wanted to know about the scf. my favorite part of that title is the word always. this new github repository contains t hree scripts: 1989-2010 download all microdata.R initiate a function to download and import any survey of consumer finances zipped stata file (.dta) loop through each year specified by the user (starting at the 1989 re-vamp) to download the main, extract, and replicate weight files, then import each into r break the main file into five implicates (each containing one record per peu) and merge the appropriate extract data onto each implicate save the five implicates and replicate weights to an r data file (.rda) for rapid future loading 2010 analysis examples.R prepare two survey of consumer finances-flavored multiply-imputed survey analysis functions load the r data files (.rda) necessary to create a multiply-imputed, replicate-weighted survey design demonstrate how to access the properties of a multiply-imput ed survey design object cook up some descriptive statistics and export examples, calculated with scf-centric variance quirks run a quick t-test and regression, but only because you asked nicely replicate FRB SAS output.R reproduce each and every statistic pr ovided by the friendly folks at the federal reserve create a multiply-imputed, replicate-weighted survey design object re-reproduce (and yes, i said/meant what i meant/said) each of those statistics, now using the multiply-imputed survey design object to highlight the statistically-theoretically-irrelevant differences click here to view these three scripts for more detail about the survey of consumer finances (scf), visit: the federal reserve board of governors' survey of consumer finances homepage the latest scf chartbook, to browse what's possible. (spoiler alert: everything.) the survey of consumer finances wikipedia entry the official frequently asked questions notes: nationally-representative statistics on the financial health, wealth, and assets of american hous eholds might not be monopolized by the survey of consumer finances, but there isn't much competition aside from the assets topical module of the survey of income and program participation (sipp). on one hand, the scf interview questions contain more detail than sipp. on the other hand, scf's smaller sample precludes analyses of acute subpopulations. and for any three-handed martians in the audience, ther e's also a few biases between these two data sources that you ought to consider. the survey methodologists at the federal reserve take their job...
Facebook
TwitterThe requests we receive at the Reference Desk keep surprising us. We'll take a look at some of the best examples from the year on data questions and data solutions.
Facebook
TwitterThe Participation Survey has run since October 2021 and is the key evidence source on engagement for DCMS. It is a continuous push-to-web household survey of adults aged 16 and over in England.
The Participation Survey provides reliable estimates of physical and digital engagement with the arts, heritage, museums and galleries, and libraries, as well as engagement with tourism, major events, digital and live sports.
In 2023/24, DCMS partnered with Arts Council England (ACE) to boost the Participation Survey to be able to produce meaningful estimates at Local Authority level. This has enabled us to have the most granular data we have ever had, which means there will be some new questions and changes to existing questions, response options and definitions in the 23/24 survey. The questionnaire for 2023/24 has been developed collaboratively to adapt to the needs and interests of both DCMS and ACE.
Where there has been a change, we have highlighted where a comparison with previous data can or cannot be made. Questionnaire changes can affect results, therefore should be taken into consideration when interpreting the findings.
The Participation Survey is only asked of adults in England. Currently there is no harmonised survey or set of questions within the administrations of the UK. Data on participation in cultural sectors for the devolved administrations is available in the https://www.gov.scot/collections/scottish-household-survey/">Scottish Household Survey, https://gov.wales/national-survey-wales">National Survey for Wales and https://www.communities-ni.gov.uk/topics/statistics-and-research/culture-and-heritage-statistics">Northern Ireland Continuous Household Survey.
The pre-release access document above contains a list of ministers and officials who have received privileged early access to this release of Participation Survey data. In line with best practice, the list has been kept to a minimum and those given access for briefing purposes had a maximum of 24 hours. Details on the pre-release access arrangements for this dataset are available in the accompanying material.
Our statistical practice is regulated by the OSR. OSR sets the standards of trustworthiness, quality and value in the https://code.statisticsauthority.gov.uk/the-code/">Code of Practice for Statistics that all producers of official statistics should adhere to.
You are welcome to contact us directly with any comments about how we meet these standards by emailing evidence@dcms.gov.uk. Alternatively, you can contact OSR by emailing regulation@statistics.gov.uk or via the OSR website.
The responsible statistician for this release is Donilia Asgill. For enquiries on this release, contact participationsurvey@dcms.gov.uk.
Facebook
TwitterAttribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
In statistics and chemometrics education for chemistry and STEAM (Science, Technology, Engineering, Arts, and Mathematics) students, gone are the days of laborious calculations and conclusions based solely on p values. Free and graphical user interface (GUI) software like JAMOVI streamlines calculations, generates informative plots, and fosters a deeper understanding beyond just p values. This article serves as a comprehensive guide for instructors on utilizing JAMOVI to teach essential parametric and nonparametric tests commonly encountered in these fields. Our focus centers on a paired samples t test, Wilcoxon rank sum test, repeated measures ANOVA (RMANOVA), and Friedman test. We also delve into principal component analysis (PCA) using the MEDA plugin, which generates high-quality colored plots that visually elucidate trends between groups. Through 21 guided questions, students will assess data normality, compare dependent groups using both parametric and nonparametric tests, explore comparisons between dependent groups, and observe trends using PCA. These STEAM-contextualized questions and practical examples empower educators to seamlessly integrate JAMOVI into their teaching, enhancing the learning experience for students.
Facebook
TwitterCC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This data set contains the replication data and supplements for the article "Knowing, Doing, and Feeling: A three-year, mixed-methods study of undergraduates’ information literacy development." The survey data is from two samples: - cross-sectional sample (different students at the same point in time) - longitudinal sample (the same students and different points in time)Surveys were distributed via Qualtrics during the students' first and sixth semesters. Quantitative and qualitative data were collected and used to describe students' IL development over 3 years. Statistics from the quantitative data were analyzed in SPSS. The qualitative data was coded and analyzed thematically in NVivo. The qualitative, textual data is from semi-structured interviews with sixth-semester students in psychology at UiT, both focus groups and individual interviews. All data were collected as part of the contact author's PhD research on information literacy (IL) at UiT. The following files are included in this data set: 1. A README file which explains the quantitative data files. (2 file formats: .txt, .pdf)2. The consent form for participants (in Norwegian). (2 file formats: .txt, .pdf)3. Six data files with survey results from UiT psychology undergraduate students for the cross-sectional (n=209) and longitudinal (n=56) samples, in 3 formats (.dat, .csv, .sav). The data was collected in Qualtrics from fall 2019 to fall 2022. 4. Interview guide for 3 focus group interviews. File format: .txt5. Interview guides for 7 individual interviews - first round (n=4) and second round (n=3). File format: .txt 6. The 21-item IL test (Tromsø Information Literacy Test = TILT), in English and Norwegian. TILT is used for assessing students' knowledge of three aspects of IL: evaluating sources, using sources, and seeking information. The test is multiple choice, with four alternative answers for each item. This test is a "KNOW-measure," intended to measure what students know about information literacy. (2 file formats: .txt, .pdf)7. Survey questions related to interest - specifically students' interest in being or becoming information literate - in 3 parts (all in English and Norwegian): a) information and questions about the 4 phases of interest; b) interest questionnaire with 26 items in 7 subscales (Tromsø Interest Questionnaire - TRIQ); c) Survey questions about IL and interest, need, and intent. (2 file formats: .txt, .pdf)8. Information about the assignment-based measures used to measure what students do in practice when evaluating and using sources. Students were evaluated with these measures in their first and sixth semesters. (2 file formats: .txt, .pdf)9. The Norwegain Centre for Research Data's (NSD) 2019 assessment of the notification form for personal data for the PhD research project. In Norwegian. (Format: .pdf)
Facebook
TwitterCC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Characterizing the technical precision of measurements is a necessary stage in the planning of experiments and in the formal sample size calculation for optimal design. Instruments that measure multiple analytes simultaneously, such as in high-throughput assays arising in biomedical research, pose particular challenges from a statistical perspective. The current most popular method for assessing precision of high-throughput assays is by scatterplotting data from technical replicates. Here we question the statistical rationale of this approach from both an empirical and theoretical perspective, illustrating our discussion using four example data sets from different genomic platforms. We demonstrate that such scatterplots convey little statistical information of relevance and are potentially highly misleading. We present an alternative framework for assessing the precision of high-throughput assays and planning biomedical experiments. Our methods are based on \emph{repeatability} -- a long-established statistical quantity also known as the intraclass correlation coefficient. We provide guidance and software for estimation and visualization of repeatability of high-throughput assays, and for its incorporation into study design.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Building a score from a questionnaire to predict a binary gold standard is a common research question in psychology and health sciences. When building this score, researchers may have to choose between statistical performance and simplicity. A practical question is to what extent it is worth sacrificing the former to improve the latter. We investigated this research question using real data, in which the aim was to predict an alcohol use disorder (AUD) diagnosis from 20 self-reported binary questions in young Swiss men (n = 233, mean age = 26). We compared the statistical performance using the area under the ROC curve (AUC) of (a) a “refined score” obtained by logistic regression and several simplified versions of it (“simple scores”): with (b) 3, (c) 2, and (d) 1 digit(s), and (e) a “sum score” that did not allow negative coefficients. We used four estimation methods: (a) maximum likelihood, (b) backward selection, (c) LASSO, and (d) ridge penalty. We also used bootstrap procedures to correct for optimism. Simple scores, especially sum scores, performed almost identically or even slightly better than the refined score (respective ranges of corrected AUCs for refined and sum scores: 0.828–0.848, 0.835–0.850), with the best performance been achieved by LASSO. Our example data demonstrated that simplifying a score to predict a binary outcome does not necessarily imply a major loss in statistical performance, while it may improve its implementation, interpretation, and acceptability. Our study thus provides further empirical evidence of the potential benefits of using sum scores in psychology and health sciences.
Facebook
TwitterThe Associated Press is sharing data from the COVID Impact Survey, which provides statistics about physical health, mental health, economic security and social dynamics related to the coronavirus pandemic in the United States.
Conducted by NORC at the University of Chicago for the Data Foundation, the probability-based survey provides estimates for the United States as a whole, as well as in 10 states (California, Colorado, Florida, Louisiana, Minnesota, Missouri, Montana, New York, Oregon and Texas) and eight metropolitan areas (Atlanta, Baltimore, Birmingham, Chicago, Cleveland, Columbus, Phoenix and Pittsburgh).
The survey is designed to allow for an ongoing gauge of public perception, health and economic status to see what is shifting during the pandemic. When multiple sets of data are available, it will allow for the tracking of how issues ranging from COVID-19 symptoms to economic status change over time.
The survey is focused on three core areas of research:
Instead, use our queries linked below or statistical software such as R or SPSS to weight the data.
If you'd like to create a table to see how people nationally or in your state or city feel about a topic in the survey, use the survey questionnaire and codebook to match a question (the variable label) to a variable name. For instance, "How often have you felt lonely in the past 7 days?" is variable "soc5c".
Nationally: Go to this query and enter soc5c as the variable. Hit the blue Run Query button in the upper right hand corner.
Local or State: To find figures for that response in a specific state, go to this query and type in a state name and soc5c as the variable, and then hit the blue Run Query button in the upper right hand corner.
The resulting sentence you could write out of these queries is: "People in some states are less likely to report loneliness than others. For example, 66% of Louisianans report feeling lonely on none of the last seven days, compared with 52% of Californians. Nationally, 60% of people said they hadn't felt lonely."
The margin of error for the national and regional surveys is found in the attached methods statement. You will need the margin of error to determine if the comparisons are statistically significant. If the difference is:
The survey data will be provided under embargo in both comma-delimited and statistical formats.
Each set of survey data will be numbered and have the date the embargo lifts in front of it in the format of: 01_April_30_covid_impact_survey. The survey has been organized by the Data Foundation, a non-profit non-partisan think tank, and is sponsored by the Federal Reserve Bank of Minneapolis and the Packard Foundation. It is conducted by NORC at the University of Chicago, a non-partisan research organization. (NORC is not an abbreviation, it part of the organization's formal name.)
Data for the national estimates are collected using the AmeriSpeak Panel, NORC’s probability-based panel designed to be representative of the U.S. household population. Interviews are conducted with adults age 18 and over representing the 50 states and the District of Columbia. Panel members are randomly drawn from AmeriSpeak with a target of achieving 2,000 interviews in each survey. Invited panel members may complete the survey online or by telephone with an NORC telephone interviewer.
Once all the study data have been made final, an iterative raking process is used to adjust for any survey nonresponse as well as any noncoverage or under and oversampling resulting from the study specific sample design. Raking variables include age, gender, census division, race/ethnicity, education, and county groupings based on county level counts of the number of COVID-19 deaths. Demographic weighting variables were obtained from the 2020 Current Population Survey. The count of COVID-19 deaths by county was obtained from USA Facts. The weighted data reflect the U.S. population of adults age 18 and over.
Data for the regional estimates are collected using a multi-mode address-based (ABS) approach that allows residents of each area to complete the interview via web or with an NORC telephone interviewer. All sampled households are mailed a postcard inviting them to complete the survey either online using a unique PIN or via telephone by calling a toll-free number. Interviews are conducted with adults age 18 and over with a target of achieving 400 interviews in each region in each survey.Additional details on the survey methodology and the survey questionnaire are attached below or can be found at https://www.covid-impact.org.
Results should be credited to the COVID Impact Survey, conducted by NORC at the University of Chicago for the Data Foundation.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary statistics and variable definitions-regular access sample.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Malta by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Malta. The dataset can be utilized to understand the population distribution of Malta by gender and age. For example, using this dataset, we can identify the largest age group for both Men and Women in Malta. Additionally, it can be used to see how the gender ratio changes from birth to senior most age group and male to female ratio across each age group for Malta.
Key observations
Largest age group (population): Male # 35-39 years (52) | Female # 25-29 years (62). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Age groups:
Scope of gender :
Please note that American Community Survey asks a question about the respondents current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are supposed to respond with the answer as either of Male or Female. Our research and this dataset mirrors the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Malta Population by Gender. You can refer the same here
Facebook
TwitterSince the beginning of the 1960s, Statistics Sweden, in collaboration with various research institutions, has carried out follow-up surveys in the school system. These surveys have taken place within the framework of the IS project (Individual Statistics Project) at the University of Gothenburg and the UGU project (Evaluation through follow-up of students) at the University of Teacher Education in Stockholm, which since 1990 have been merged into a research project called 'Evaluation through Follow-up'. The follow-up surveys are part of the central evaluation of the school and are based on large nationally representative samples from different cohorts of students.
Evaluation through follow-up (UGU) is one of the country's largest research databases in the field of education. UGU is part of the central evaluation of the school and is based on large nationally representative samples from different cohorts of students. The longitudinal database contains information on nationally representative samples of school pupils from ten cohorts, born between 1948 and 2004. The sampling process was based on the student's birthday for the first two and on the school class for the other cohorts.
For each cohort, data of mainly two types are collected. School administrative data is collected annually by Statistics Sweden during the time that pupils are in the general school system (primary and secondary school), for most cohorts starting in compulsory school year 3. This information is provided by the school offices and, among other things, includes characteristics of school, class, special support, study choices and grades. Information obtained has varied somewhat, e.g. due to changes in curricula. A more detailed description of this data collection can be found in reports published by Statistics Sweden and linked to datasets for each cohort.
Survey data from the pupils is collected for the first time in compulsory school year 6 (for most cohorts). Questionnaire in survey in year 6 includes questions related to self-perception and interest in learning, attitudes to school, hobbies, school motivation and future plans. For some cohorts, questionnaire data are also collected in year 3 and year 9 in compulsory school and in upper secondary school.
Furthermore, results from various intelligence tests and standartized knowledge tests are included in the data collection year 6. The intelligence tests have been identical for all cohorts (except cohort born in 1987 from which questionnaire data were first collected in year 9). The intelligence test consists of a verbal, a spatial and an inductive test, each containing 40 tasks and specially designed for the UGU project. The verbal test is a vocabulary test of the opposite type. The spatial test is a so-called ‘sheet metal folding test’ and the inductive test are made up of series of numbers. The reliability of the test, intercorrelations and connection with school grades are reported by Svensson (1971).
For the first three cohorts (1948, 1953 and 1967), the standartized knowledge tests in year 6 consist of the standard tests in Swedish, mathematics and English that up to and including the beginning of the 1980s were offered to all pupils in compulsory school year 6. For the cohort 1972, specially prepared tests in reading and mathematics were used. The test in reading consists of 27 tasks and aimed to identify students with reading difficulties. The mathematics test, which was also offered for the fifth cohort, (1977) includes 19 assignments. After a changed version of the test, caused by the previously used test being judged to be somewhat too simple, has been used for the cohort born in 1982. Results on the mathematics test are not available for the 1987 cohort. The mathematics test was not offered to the students in the cohort in 1992, as the test did not seem to fully correspond with current curriculum intentions in mathematics. For further information, see the description of the dataset for each cohort.
For several of the samples, questionnaires were also collected from the students 'parents and teachers in year 6. The teacher questionnaire contains questions about the teacher, class size and composition, the teacher's assessments of the class' knowledge level, etc., school resources, working methods and parental involvement and questions about the existence of evaluations. The questionnaire for the guardians includes questions about the child's upbringing conditions, ambitions and wishes regarding the child's education, views on the school's objectives and the parents' own educational and professional situation.
The students are followed up even after they have left primary school. Among other things, data collection is done during the time they are in high school. Then school administrative data such as e.g. choice of upper secondary school line / program and grades after completing studies. For some of the cohorts, in addition to school administrative data, questionnaire data were also collected from the students.
New sample design compared to previous cohorts. The selection was carried out in two steps. In the first, municipalities were chosen and in the second, school classes with pupils in year 6. A stratified sample was selected from 29 municipalities, after which the school classes were chosen with the help of the class registers in the municipalities in question. In the small municipalities all classes were included, while a random sample was made from the larger ones. The final sample consisted of approximately 9601 students divided into 437 classes in year 6 spring term 1980, and mainly born in 1967. This was at the end 9114 due to the refusal in various forms.
The information obtained in 1980 for rides was:
School administrative data (school form, class type, year and grades). This information was collected by Statistics Sweden for all in the sample. Tasks 2-5 were collected by the Department of Education at the Stockholm University of Education.
Information about the parents' profession and education, housing, guardians, values of school and education, etc. This information was collected mainly through a questionnaire to guardians, which was new compared to the two previous cohorts. Information is available for about 70%.
Answers to questions that shed light on students' school attitudes, self-assessments and values, leisure activities and study and vocational plans, including motives for choosing alternative courses.
Results on three aptitude tests, one verbal, one spatial and one inductive.
The aptitude tests were completely identical, while the questionnaires were partially reworked compared to the two previous cohorts. This information is available to just over 90 percent of the students.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary of the associations between the 20 questions and the gold standard.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of New Carlisle by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for New Carlisle. The dataset can be utilized to understand the population distribution of New Carlisle by gender and age. For example, using this dataset, we can identify the largest age group for both Men and Women in New Carlisle. Additionally, it can be used to see how the gender ratio changes from birth to senior most age group and male to female ratio across each age group for New Carlisle.
Key observations
Largest age group (population): Male # 35-39 years (100) | Female # 10-14 years (136). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Scope of gender :
Please note that American Community Survey asks a question about the respondents current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are supposed to respond with the answer as either of Male or Female. Our research and this dataset mirrors the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for New Carlisle Population by Gender. You can refer the same here
Facebook
TwitterThese datasets contain 1.48 million question and answer pairs about products from Amazon.
Metadata includes
question and answer text
is the question binary (yes/no), and if so does it have a yes/no answer?
timestamps
product ID (to reference the review dataset)
Basic Statistics:
Questions: 1.48 million
Answers: 4,019,744
Labeled yes/no questions: 309,419
Number of unique products with questions: 191,185
Facebook
Twitterhttps://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms
Question Paper Solutions of chapter Sample Space of Numerical and statistical Methods, 5th Semester , Bachelor of Computer Application 2020-2021