Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘People according to the form of acquisition of videos. Multi-response (%)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from http://data.europa.eu/88u/dataset/https-opendata-euskadi-eus-catalogo-estadistica-territorio-zona-geografica-y-dimension-municipal-cine-personas-segun-la-forma-de-adquisicion-de-videos-multirespuesta- on 17 January 2022.
--- Dataset description provided by original source is as follows ---
The Basque Observatory of Culture was created to place culture as a central element of social and economic development, with the mission of rigorously filling the information gap in the cultural field, in line with the Basque Culture Plan of which it forms part. The Observatory’s scope of action focuses on traditional areas of culture: cultural heritage, artistic creation and expression, industries and cross-cutting areas. The Basque Observatory of Culture publishes and updates more than 200 statistical indicators that can be consulted on euskadi.eus along with other research and reports.
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Central composite designs (CCDs) are widely accepted and used experimental designs for fitting second-order polynomial models in response surface methods. However, these designs are based only on the number of explanatory variables being investigated. In a multiresponse problem where prior information is available in the form of a screening experiment or previous process knowledge, investigators often know which factors will be used in the estimation of each response. This work presents an alternative design based on CCDs that allows main effects to be aliased for factors that are not related to the same response. This results in fewer required runs than current designs, saving investigators both time and money, by taking this prior information into account. R-package “DoE.multi.response” is included as a supplement for constructing these designs.
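For orientation, the sketch below shows how a standard CCD is assembled for k factors; the run counts it produces are the baseline that the reduced, aliased designs in this work improve on. The function and its defaults are illustrative and are not part of the cited "DoE.multi.response" R package.

```python
# A minimal sketch of standard CCD construction: a 2^k factorial core,
# 2k axial (star) points, and replicated centre points, in coded units.
from itertools import product
import numpy as np

def ccd(k, n_center=4, alpha=None):
    """Return the design matrix of a circumscribed CCD in coded units."""
    if alpha is None:
        alpha = (2 ** k) ** 0.25                                 # rotatable choice: (2^k)^(1/4)
    factorial = np.array(list(product([-1.0, 1.0], repeat=k)))   # 2^k corner runs
    axial = np.vstack([alpha * np.eye(k), -alpha * np.eye(k)])   # 2k star runs
    center = np.zeros((n_center, k))                             # centre replicates
    return np.vstack([factorial, axial, center])

design = ccd(k=3)
print(design.shape)   # 8 factorial + 6 axial + 4 centre runs -> (18, 3)
```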
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GENERAL INFORMATION
Title of Dataset: A dataset from a survey investigating disciplinary differences in data citation
Date of data collection: January to March 2022
Collection instrument: SurveyMonkey
Funding: Alfred P. Sloan Foundation
SHARING/ACCESS INFORMATION
Licenses/restrictions placed on the data: These data are available under a CC BY 4.0 license
Links to publications that cite or use the data:
Gregory, K., Ninkov, A., Ripp, C., Peters, I., & Haustein, S. (2022). Surveying practices of data citation and reuse across disciplines. Proceedings of the 26th International Conference on Science and Technology Indicators. International Conference on Science and Technology Indicators, Granada, Spain. https://doi.org/10.5281/ZENODO.6951437
Gregory, K., Ninkov, A., Ripp, C., Roblin, E., Peters, I., & Haustein, S. (2023). Tracing data: A survey investigating disciplinary differences in data citation. Zenodo. https://doi.org/10.5281/zenodo.7555266
DATA & FILE OVERVIEW
File List
Filename: MDCDatacitationReuse2021Codebookv2.pdf Codebook
Filename: MDCDataCitationReuse2021surveydatav2.csv Dataset format in csv
Filename: MDCDataCitationReuse2021surveydatav2.sav Dataset format in SPSS
Filename: MDCDataCitationReuseSurvey2021QNR.pdf Questionnaire
Additional related data collected that was not included in the current data package: open-ended questions asked to respondents
METHODOLOGICAL INFORMATION
Description of methods used for collection/generation of data:
The development of the questionnaire (Gregory et al., 2022) centered on creating two main branches of questions for the primary groups of interest in our study: researchers who reuse data (33 questions in total) and researchers who do not reuse data (16 questions in total). The population of interest for this survey consists of researchers from all disciplines and countries, sampled from the corresponding authors of papers indexed in the Web of Science (WoS) between 2016 and 2020.
We received 3,632 responses, of which 2,509 were completed, representing a completion rate of 68.6%. Incomplete responses were excluded from the dataset. The final dataset contains 2,492 complete responses, for an uncorrected response rate of 1.57%. Controlling for invalid emails, bounced emails, and opt-outs (n=5,201) produced a response rate of 1.62%, similar to surveys using comparable recruitment methods (Gregory et al., 2020).
Methods for processing the data:
Results were downloaded from SurveyMonkey in CSV format and were prepared for analysis using Excel and SPSS by recoding ordinal and multiple choice questions and by removing missing values.
Instrument- or software-specific information needed to interpret the data:
The dataset is provided in SPSS format, which requires IBM SPSS Statistics. The dataset is also available in a coded format in CSV. The codebook is required to interpret the values.
DATA-SPECIFIC INFORMATION FOR: MDCDataCitationReuse2021surveydata
Number of variables: 95
Number of cases/rows: 2,492
Missing data codes: 999 = Not asked
Refer to MDCDatacitationReuse2021Codebook.pdf for detailed variable information.
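A minimal loading sketch follows, assuming only what this README states: the CSV file name, the 999 ("Not asked") missing-data code, and the expected dimensions. Variable meanings are defined in the codebook.

```python
# Hedged sketch: read the CSV release and treat the documented missing-data
# code 999 ("Not asked") as NA. Column semantics come from the codebook PDF.
import pandas as pd

df = pd.read_csv("MDCDataCitationReuse2021surveydatav2.csv", na_values=[999])
print(df.shape)                                              # expected (2492, 95)
print(df.isna().sum().sort_values(ascending=False).head())   # items most often "not asked"
```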
https://choosealicense.com/licenses/cc0-1.0/
The adambuttrick/500K-ner-indexes-multiple-organizations-locations-alpaca-format-json-response-all-cases dataset is hosted on Hugging Face and contributed by the HF Datasets community.
Data for the article "Visual Design and Cognition in List-Style Open-Ended Questions in Web Probing" (Meitinger & Kunz, forthcoming), published in Sociological Methods & Research. Abstract of the article: Previous research reveals that the visual design of open-ended questions should match the response task so that respondents can infer the expected response format. Based on a web survey including specific probes in a list-style open-ended question format, we experimentally tested the effects of varying numbers of answer boxes on several indicators of response quality. Our results showed that using multiple small answer boxes instead of one large box had a positive impact on the number and variety of themes mentioned, as well as on the conciseness of responses to specific probes. We found no effect on the relevance of themes and the risk of item non-response. Based on our findings, we recommend using multiple small answer boxes instead of one large box to convey the expected response format and improve response quality in specific probes. This study makes a valuable contribution to the field of web probing, extends the concept of response quality in list-style open-ended questions, and provides a deeper understanding of how visual design features affect cognitive response processes in web surveys.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Portuguese Chain of Thought prompt-response dataset, a meticulously curated collection containing 3000 comprehensive prompt and response pairs. This dataset is an invaluable resource for training Language Models (LMs) to generate well-reasoned answers and minimize inaccuracies. Its primary utility lies in enhancing LLMs' reasoning skills for solving arithmetic, common sense, symbolic reasoning, and complex problems.
Dataset Content:
This COT dataset comprises a diverse set of instructions and questions paired with corresponding answers and rationales in the Portuguese language. These prompts and completions cover a broad range of topics and questions, including mathematical concepts, common sense reasoning, complex problem-solving, scientific inquiries, puzzles, and more.
Each prompt is meticulously accompanied by a response and rationale, providing essential information and insights to enhance the language model training process. These prompts, completions, and rationales were manually curated by native Portuguese speakers, drawing references from various sources, including open-source datasets, news articles, websites, and other reliable references.
Our chain-of-thought prompt-completion dataset includes various prompt types, such as instructional prompts, continuations, and in-context learning (zero-shot, few-shot) prompts. Additionally, the dataset contains prompts and completions enriched with various forms of rich text, such as lists, tables, code snippets, JSON, and more, with proper markdown format.
Prompt Diversity:
To ensure a wide-ranging dataset, we have included prompts from a plethora of topics related to mathematics, common sense reasoning, and symbolic reasoning. These topics encompass arithmetic, percentages, ratios, geometry, analogies, spatial reasoning, temporal reasoning, logic puzzles, patterns, and sequences, among others.
These prompts vary in complexity, spanning easy, medium, and hard levels. Various question types are included, such as multiple-choice, direct queries, and true/false assessments.
Response Formats:
To accommodate diverse learning experiences, our dataset incorporates different types of answers depending on the prompt and provides step-by-step rationales. The detailed rationale aids the language model in building a reasoning process for complex questions.
These responses encompass text strings, numerical values, and date and time formats, enhancing the language model's ability to generate reliable, coherent, and contextually appropriate answers.
Data Format and Annotation Details:
This fully labeled Portuguese Chain of Thought Prompt Completion Dataset is available in JSON and CSV formats. It includes annotation details such as a unique ID, prompt, prompt type, prompt complexity, prompt category, domain, response, rationale, response type, and rich text presence.
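An illustrative record shaped after the annotation fields listed above is shown below; the key names and values are hypothetical, so consult the delivered JSON/CSV for the exact schema.

```python
# Hypothetical record mirroring the listed annotation fields (unique ID, prompt,
# prompt type, complexity, category, domain, response, rationale, response type,
# rich-text presence). Exact key names in the delivered files may differ.
import json

example_record = {
    "id": "pt-cot-000001",
    "prompt": "Se um trem percorre 180 km em 2 horas, qual é a sua velocidade média?",
    "prompt_type": "instructional",
    "prompt_complexity": "easy",
    "prompt_category": "arithmetic",
    "domain": "mathematics",
    "response": "A velocidade média é de 90 km/h.",
    "rationale": "Velocidade média = distância / tempo = 180 km / 2 h = 90 km/h.",
    "response_type": "numerical",
    "rich_text": False,
}
print(json.dumps(example_record, ensure_ascii=False, indent=2))
```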
Quality and Accuracy:
Our dataset upholds the highest standards of quality and accuracy. Each prompt undergoes meticulous validation, and the corresponding responses and rationales are thoroughly verified. We prioritize inclusivity, ensuring that the dataset incorporates prompts and completions representing diverse perspectives and writing styles, maintaining an unbiased and discrimination-free stance.
The Portuguese version is grammatically accurate without any spelling or grammatical errors. No copyrighted, toxic, or harmful content is used during the construction of this dataset.
Continuous Updates and Customization:
The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Ongoing efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to gather custom chain of thought prompt completion data tailored to specific needs, providing flexibility and customization options.
License:
The dataset, created by FutureBeeAI, is now available for commercial use. Researchers, data scientists, and developers can leverage this fully labeled and ready-to-deploy Portuguese Chain of Thought Prompt Completion Dataset to enhance the rationale and accurate response generation capabilities of their generative AI models and explore new approaches to NLP tasks.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Head-related impulse response measurements with the KEMAR dummy head, performed in an anechoic chamber with a resolution of 1°. The impulse responses are provided for different distances and are accompanied by headphone compensation filters.
This entry stores the measurements in the MAT format for use in Matlab/Octave. The measurements are identical to the ones stored in the SOFA format available at https://doi.org/10.5281/zenodo.55418
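A minimal sketch of opening one of the MAT files from Python instead of Matlab/Octave follows; the file name and internal variable names are not documented in this entry, so both are placeholders and the code simply lists the contents.

```python
# Hedged sketch: list the variables stored in one of the MAT-format HRIR files.
# "kemar_anechoic_3m.mat" is a placeholder name, not necessarily a real file.
from scipy.io import loadmat

data = loadmat("kemar_anechoic_3m.mat")
for key, value in data.items():
    if not key.startswith("__"):                 # skip MAT-file header entries
        print(key, getattr(value, "shape", type(value)))
```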
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Arabic Chain of Thought prompt-response dataset, a meticulously curated collection containing 3000 comprehensive prompt and response pairs. This dataset is an invaluable resource for training Language Models (LMs) to generate well-reasoned answers and minimize inaccuracies. Its primary utility lies in enhancing LLMs' reasoning skills for solving arithmetic, common sense, symbolic reasoning, and complex problems.
Dataset Content:
This COT dataset comprises a diverse set of instructions and questions paired with corresponding answers and rationales in the Arabic language. These prompts and completions cover a broad range of topics and questions, including mathematical concepts, common sense reasoning, complex problem-solving, scientific inquiries, puzzles, and more.
Each prompt is meticulously accompanied by a response and rationale, providing essential information and insights to enhance the language model training process. These prompts, completions, and rationales were manually curated by native Arabic speakers, drawing references from various sources, including open-source datasets, news articles, websites, and other reliable references.
Our chain-of-thought prompt-completion dataset includes various prompt types, such as instructional prompts, continuations, and in-context learning (zero-shot, few-shot) prompts. Additionally, the dataset contains prompts and completions enriched with various forms of rich text, such as lists, tables, code snippets, JSON, and more, with proper markdown format.
Prompt Diversity:
To ensure a wide-ranging dataset, we have included prompts from a plethora of topics related to mathematics, common sense reasoning, and symbolic reasoning. These topics encompass arithmetic, percentages, ratios, geometry, analogies, spatial reasoning, temporal reasoning, logic puzzles, patterns, and sequences, among others.
These prompts vary in complexity, spanning easy, medium, and hard levels. Various question types are included, such as multiple-choice, direct queries, and true/false assessments.
Response Formats:
To accommodate diverse learning experiences, our dataset incorporates different types of answers depending on the prompt and provides step-by-step rationales. The detailed rationale aids the language model in building a reasoning process for complex questions.
These responses encompass text strings, numerical values, and date and time formats, enhancing the language model's ability to generate reliable, coherent, and contextually appropriate answers.
Data Format and Annotation Details:
This fully labeled Arabic Chain of Thought Prompt Completion Dataset is available in JSON and CSV formats. It includes annotation details such as a unique ID, prompt, prompt type, prompt complexity, prompt category, domain, response, rationale, response type, and rich text presence.
Quality and Accuracy:
Our dataset upholds the highest standards of quality and accuracy. Each prompt undergoes meticulous validation, and the corresponding responses and rationales are thoroughly verified. We prioritize inclusivity, ensuring that the dataset incorporates prompts and completions representing diverse perspectives and writing styles, maintaining an unbiased and discrimination-free stance.
The Arabic version is grammatically accurate without any spelling or grammatical errors. No copyrighted, toxic, or harmful content is used during the construction of this dataset.
Continuous Updates and Customization:
The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Ongoing efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to gather custom chain of thought prompt completion data tailored to specific needs, providing flexibility and customization options.
License:
The dataset, created by FutureBeeAI, is now available for commercial use. Researchers, data scientists, and developers can leverage this fully labeled and ready-to-deploy Arabic Chain of Thought Prompt Completion Dataset to enhance the rationale and accurate response generation capabilities of their generative AI models and explore new approaches to NLP tasks.
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
This data set contains example data for exploration of the theory of regression-based regionalization. The 90th percentile of annual maximum streamflow is provided as an example response variable for 293 streamgages in the conterminous United States. Several explanatory variables are drawn from the GAGES-II database to demonstrate how multiple linear regression is applied. Example scripts demonstrate how to collect the original streamflow data provided and how to recreate the figures from the associated Techniques and Methods chapter.
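A minimal sketch of the regression step this example data set supports is shown below: ordinary least squares of the log-transformed 90th-percentile annual-maximum streamflow on a few basin characteristics. The file and column names are hypothetical stand-ins for the provided data and GAGES-II variables; the dataset's own example scripts show the actual workflow.

```python
# Hedged sketch: multiple linear regression for regionalization. File and column
# names ("example_regionalization_data.csv", "Q90", "drain_area", ...) are
# hypothetical placeholders for the provided data and GAGES-II variables.
import numpy as np
import pandas as pd

df = pd.read_csv("example_regionalization_data.csv")
y = np.log10(df["Q90"].to_numpy())                          # response: 90th-percentile flow
X = np.log10(df[["drain_area", "mean_precip", "basin_slope"]].to_numpy())
X = np.column_stack([np.ones(len(X)), X])                   # add intercept column
beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["intercept", "drain_area", "mean_precip", "basin_slope"], beta)))
```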
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de450973
Abstract (en): The National Center for Early Development and Learning (NCEDL) combined the data of two major studies in order to understand variations among state-funded pre-kindergarten (pre-k) programs and in turn, how these variations relate to child outcomes at the end of pre-k and in kindergarten. The Multi-State Study of Pre-Kindergarten and the State-Wide Early Education Programs (SWEEP) Study provide detailed information on pre-kindergarten teachers, children, and classrooms in 11 states. By combining data from both studies, information is available from 721 classrooms and 2,982 pre-kindergarten children in these 11 states. Pre-kindergarten data collection for the Multi-State Study of Pre-Kindergarten took place during the 2001-2002 school year in six states: California, Georgia, Illinois, Kentucky, New York, and Ohio. These states were selected from among states that had committed significant resources to pre-k initiatives. States were selected to maximize diversity with regard to geography, program settings (public school or community setting), program intensity (full-day vs. part-day), and educational requirements for teachers. In each state, a stratified random sample of 40 centers/schools was selected from the list of all the school/centers or programs (both contractors and subcontractors) provided to the researchers by each state's department of education. In total, 238 sites participated in the fall and two additional sites joined the study in the spring. Participating teachers helped the data collectors recruit children into the study by sending recruitment packets home with all children enrolled in the classroom. On the first day of data collection, the data collectors determined which of the children were eligible to participate. Eligible children were those who (1) would be old enough for kindergarten in the fall of 2002, (2) did not have an Individualized Education Plan, according to the teacher, and (3) spoke English or Spanish well enough to understand simple instructions, according to the teacher. Pre-kindergarten data collection for the SWEEP Study took place during the 2003-2004 school year in five states: Massachusetts, New Jersey, Texas, Washington, and Wisconsin. These states were selected to complement the states already in the Multi-State Study of Pre-K by including programs with significantly different funding models or modes of service delivery. In each of the five states, 100 randomly selected state-funded pre-kindergarten sites were recruited for participation in the study from a list of all sites provided by the state. In total, 465 sites participated in the fall. Two sites declined to continue participation in the spring, resulting in 463 sites participating in the spring. Participating teachers helped the data collectors recruit children into the study by sending recruitment packets home with all children enrolled in the classroom. On the first day of data collection, the data collectors determined which of the children were eligible to participate. Eligible children were those who (1) would be old enough for kindergarten in the fall of 2004, (2) did not have an Individualized Education Plan, according to the teacher, and (3) spoke English or Spanish well enough to understand simple instructions, according to the teacher. Demographic information collected across both studies includes race, teacher gender, child gender, family income, mother's education level, and teacher education level. 
The researchers also created a variable for both the child-level data and the class-level data, which allows secondary users to subset cases according to either the Multi-State or SWEEP study. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: performed recodes and/or calculated derived variables. Response Rates: Multi-State: Of the 40 sites per state, 78 percent of eligible sites agreed to participate (fall of pre-k, n = 238). For fall of pre-k (n = 238), 94 percent of the one classroom per site selected agreed to participate. For fall (n = 940) and spring (n = 960) of pre-k, 61 percent of the parents of eligible children consented. SWEEP: Of the 10...
An estimated 723,000 Rohingya refugees have fled violence in Myanmar's Rakhine state since August 25, 2017. Most of the newly-arrived refugees rely on humanitarian assistance, having left with few possessions and exhausted their financial resources during the journey. The monsoon season began in May and continues into September, threatening the vast majority of refugees living in makeshift shelters and settlements highly vulnerable to floods and landslides. To understand the priority needs of the refugees, a Multi-Sector Needs Assessment (MSNA), commissioned by UNHCR and with technical support from REACH, was conducted at the household level in 31 refugee sites (3,171 households were surveyed). Translators Without Borders supported questionnaire translation and enumerator training. This survey identified a number of areas where the basic needs of Rohingya refugees are being met. At the same time, this assessment has identified continuing service gaps in the Rohingya response. For example, the majority of households do not believe there is enough light at night to safely access latrines, and WASH facilities are generally perceived as dangerous areas for girls under age 18. In terms of access to protection services, only a small number of households report members making use of child- and women-friendly spaces. Despite widespread distribution coverage of key non-food items such as kitchen sets, demand for these items remains high, and refugees are spending the greatest portion of their limited financial resources on basic items including food, clothing and fuel. Findings suggest that there are uncertainties around actions to prepare for cyclones. The mahjis remain almost the sole focal point for communication and complaints with refugees, reflecting their continued prominent position within refugee communities. Finally, the median household debt is twice the median household income for the 30 days prior to data collection, with only two-fifths of households reporting any source of income at all.
31 refugee sites in the upazilas of Ukhiya and Teknaf in Cox's Bazar district.
Households and individuals
Data was collected via a household survey, conducted in 31 of the 34 refugee sites open at the time of data collection (data collection did not take place in Camp 4 and Camp 20 Extension since these camps were empty at the time of study design; data collection was aborted in Kutupalong Refugee Camp due to security concerns). The sample frame was developed to yield household-level results that were representative at the camp level with a 95% confidence level and a 10% margin of error, and at the aggregate level for all camps with a 95% confidence level and a 5% margin of error. For several indicators, data were collected on individuals within the household, rather than at the household level. Since sampling took place at the household level, data for these indicators are indicative and not statistically representative.
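For reference, the standard sample-size formula behind such precision targets, with a finite-population correction, is sketched below; it is illustrative only and not the assessment's own calculation, and the population figures in the example calls are hypothetical.

```python
# Hedged sketch: required simple-random-sample size for a proportion at a given
# margin of error, with finite-population correction.
import math

def sample_size(margin, z=1.96, p=0.5, population=None):
    n = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        n = n / (1 + (n - 1) / population)       # finite-population correction
    return math.ceil(n)

print(sample_size(0.10, population=3000))        # per-camp target at 95% confidence / 10% MoE
print(sample_size(0.05, population=200000))      # aggregate target at 95% confidence / 5% MoE
```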
Face-to-face [f2f]
Questionnaire has the following sections: - general information - health - food assistance - site management - direct observation
Data were anonymized with recoding and local suppression.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The MTR-QA dataset contains 24,312 reasoning items, including 8,740 logical reasoning items, 9,105 semantic reasoning items, 2,647 mathematical reasoning items, and 3,818 comprehensive knowledge reasoning items, with a total size of 34.1 MB. The data are stored in JSON format, and each JSON object has six attributes: Instruction, Question, Answer, Target, Lable, and Difficulty, which correspond to the instruction given by the user, the options presented to the user, the correct answer, the chain of thought, the reasoning type of the question, and the difficulty level of the question, respectively. Lable distinguishes the reasoning types: logical reasoning, semantic reasoning, mathematical reasoning, and comprehensive knowledge reasoning. Difficulty is graded into primary, intermediate, and advanced levels according to the difficulty of the language, the number of solution steps, and the complexity of the knowledge points, so that the difficulty of each item can be clearly evaluated.
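An illustrative record with the six attributes named above is shown below; the values are invented for illustration, and the "Lable" spelling is kept as given in the dataset description.

```python
# Hypothetical MTR-QA record; field values are invented for illustration only.
import json

record = {
    "Instruction": "Choose the option that follows logically from the premises.",
    "Question": "All birds have wings. A sparrow is a bird. A) Sparrows have wings. B) Sparrows have no wings.",
    "Answer": "A",
    "Target": "All birds have wings; a sparrow is a bird; therefore a sparrow has wings.",
    "Lable": "logical reasoning",
    "Difficulty": "primary",
}
print(json.dumps(record, indent=2))
```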
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Multiple-Model High Resolution HRTF database is a collection of HRTFs measured using four different Head-and-Torso Simulators at high spatial resolution (2° in azimuth and elevation). The data here are stored in SOFA format and are identical to the HDF5 files in doi:10.5281/zenodo.1226873.
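A minimal inspection sketch follows, relying on the fact that SOFA files are netCDF-4/HDF5 containers; the file name is a placeholder, and the variable names follow the usual SimpleFreeFieldHRIR convention, which is an assumption about these particular files.

```python
# Hedged sketch: open a SOFA file with h5py and inspect typical HRIR variables.
# "hats_hrtf_2deg.sofa" is a placeholder file name; "Data.IR" and
# "SourcePosition" are conventional SOFA variable names, assumed here.
import h5py

with h5py.File("hats_hrtf_2deg.sofa", "r") as f:
    print(list(f.keys()))                  # all stored variables and groups
    ir = f["Data.IR"][:]                   # impulse responses: (measurements, receivers, samples)
    pos = f["SourcePosition"][:]           # azimuth, elevation, distance per measurement
    print(ir.shape, pos.shape)
```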
https://www.icpsr.umich.edu/web/ICPSR/studies/36951/terms
These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed. The study gathered information from police officers and residents of four different community areas that had undergone some form of police consolidation or contracting. The communities were the city of Pontiac in Michigan; the cities of Chisago and Lindstrom in Minnesota; York and Windsor Townships and the boroughs of Felton, Jacobus, Yoe, Red Lion, and Windsor in Pennsylvania; and the city of Compton in California. Surveys were administered to gauge the implementation and effectiveness of three models of police consolidation: merger of agencies, regionalization under which two or more agencies join to provide services in a broader area, and contracting by municipalities with other organizations for police services. The collection includes 5 SPSS files:
ComptonFinal_Masked-by-ICPSR.sav (176 cases / 99 variables)
MinnesotaFinal_Masked-by-ICPSR.sav (228 cases / 99 variables)
PontiacFinal_Masked-by-ICPSR.sav (230 cases / 99 variables)
YorkFinal_Masked-by-ICPSR.sav (219 cases / 99 variables)
OfficerWebFINALrecodesaug2015revised_Masked-by-ICPSR.sav (139 cases / 88 variables)
The most commonly used procedure for prediction of the behaviour of laterally loaded piles is the P–y curve formulation, which gives a simple but efficient framework to predict the response of the pile. This framework is limited to a single direction of loading, while there are several situations in which a pile is subjected to lateral loads with varying direction, as for example in the case of wind or wave loads. Here an extended framework for P–y curve modelling is presented, in which several springs are considered around the pile perimeter at each depth. The advantage of this framework is that it remains as simple and practical as the original P–y curve method and does not need any further information or parameters. A procedure is proposed for the extension of a given unidirectional model to the corresponding multi-directional one. The effects of multi-directional loading are discussed based on the simulation results. With a change in loading direction, misalignment between load direction and total displacement occurs. In addition, this quite simple model enables deduction of the profile of irreversible soil displacements around the pile at various depths.
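One plausible numerical reading of this framework is sketched below: unidirectional P–y springs are distributed at several angles around the pile perimeter at a given depth, each spring is driven by the component of the lateral displacement along its own direction, and the spring reactions are summed vectorially. The hyperbolic curve, the parameter values, and the static (path-independent) treatment are placeholders; the procedure described above additionally tracks irreversible soil displacements under changing load direction, which this sketch omits.

```python
# Hedged sketch of a multi-directional P-y assembly at one depth. The p-y curve,
# its parameters, and the static treatment are placeholders, not the article's
# specific formulation.
import numpy as np

def py_unidirectional(y, p_ult=300.0, k_ini=2.0e4):
    """Placeholder hyperbolic p-y curve (soil reaction per unit length, kN/m)."""
    return np.sign(y) * p_ult * abs(y) / (p_ult / k_ini + abs(y))

def multidirectional_reaction(u, n_springs=8):
    """Resultant soil reaction for a 2D lateral displacement u = (ux, uy)."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_springs, endpoint=False)
    reaction = np.zeros(2)
    for theta in angles:
        direction = np.array([np.cos(theta), np.sin(theta)])
        y_i = float(u @ direction)                      # displacement seen by this spring
        reaction += (py_unidirectional(y_i) / n_springs) * direction
    return reaction

print(multidirectional_reaction(np.array([0.02, 0.00])))   # loading along x
print(multidirectional_reaction(np.array([0.02, 0.01])))   # oblique loading case
```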
In successive waves over four decades, Rohingya refugees have been fleeing to Bangladesh from Rakhine State, Myanmar, where they have suffered systematic ongoing persecution. Since August 2017, an estimated 745,000 Rohingya refugees have arrived in Cox's Bazar, Bangladesh, increasing the total number of Rohingya refugees to more than 900,000.
Most of the newly-arrived refugees have settled in hilly, formerly-forested areas that are vulnerable to landslides and flash-flooding in monsoon season and rely heavily on humanitarian assistance to cover their basic needs. As the crisis moves beyond the initial emergency phase, comprehensive information on the needs and vulnerabilities of affected populations is needed in order to inform the design and implementation of effective inter-sectoral programming. To this aim, a Joint Multi-Sector Needs Assessment (J-MSNA) was conducted across Rohingya refugee populations to support humanitarian planning and enhance operational and strategic decision-making. The J-MSNA was conducted in support of the mid-term review of the 2019 Joint Response Plan (JRP), with the specific objective of enabling the tracking of JRP 2019 indicators for monitoring and review purposes. A total of 876 households were surveyed across 33 refugee sites.
This J-MSNA was funded by UNHCR and coordinated through the MSNA Technical Working Group of the Information Management and Assessment Working Group (IMAWG), led by the Inter-Sector Coordination Group (ISCG) and comprised of: UNHCR, IOM Needs and Population Monitoring (NPM), ACAPS, WFP VAM, Translators without Borders, and REACH.
National coverage
Households
Sample survey data [ssd]
A total of 876 households were surveyed across 33 refugee sites, employing a simple random sampling methodology of shelter footprints within official site boundaries. Each survey was conducted with an adult household representative responding on behalf of the household and its members. Findings are generalisable to refugee populations living within each of the two Upazilas with a 95% confidence level and 5% margin of error. This factsheet presents key findings from both Upazilas, where households were surveyed between 9 - 24 June 2019.
Face-to-face [f2f]
The questionnaire includes the following sections: household characteristics, individual characteristics, nutrition, shelter, protection and social cohesion, food security and livelihoods, communication with communities, multi-sector.
Dataset was edited and anonymised with recoding and local suppression.
The high-frequency phone survey of refugees monitors the economic and social impact of and responses to the COVID-19 pandemic on refugees and nationals, by calling a sample of households every four weeks. The main objective is to inform timely and adequate policy and program responses. Since the outbreak of the COVID-19 pandemic in Ethiopia, two rounds of data collection of refugees were completed between September and November 2020. The first round of the joint national and refugee HFPS was implemented between the 24 September and 17 October 2020 and the second round between 20 October and 20 November 2020.
Household
Sample survey data [ssd]
The sample was drawn using a simple random sample without replacement. Expecting a high non-response rate based on experience from the HFPS-HH, we drew a stratified sample of 3,300 refugee households for the first round. More details on sampling methodology are provided in the Survey Methodology Document available for download as Related Materials.
Computer Assisted Telephone Interview [cati]
The Ethiopia COVID-19 High Frequency Phone Survey of Refugee questionnaire consists of the following sections:
A more detailed description of the questionnaire is provided in Table 1 of the Survey Methodology Document that is provided as Related Materials. Round 1 and 2 questionnaires available for download.
DATA CLEANING
At the end of data collection, the raw dataset was cleaned by the research team. This included formatting and correcting results based on monitoring issues, enumerator feedback, and survey changes. The data cleaning carried out is detailed below.
Variable naming and labeling:
• Variable names were changed to reflect the lowercase question name in the paper survey copy, and a word or two related to the question.
• Variables were labeled with longer descriptions of their contents, and the full question text was stored in Notes for each variable.
• “Other, specify” variables were named similarly to their related question, with “_other” appended to the name.
• Value labels were assigned where relevant, with options shown in English for all variables, unless preloaded from the roster in Amharic.
Variable formatting:
• Variables were formatted as their object type (string, integer, decimal, time, date, or datetime).
• Multi-select variables were saved both as space-separated single variables and as multiple binary variables showing the yes/no value of each possible response (a short sketch of this expansion follows this list).
• Time and date variables were stored as POSIX timestamp values and formatted to show Gregorian dates.
• Location information was left in separate ID and Name variables, following the format of the incoming roster. IDs were formatted to include only the variable level digits, and not the higher-level prefixes (2-3 digits only.)
• Only consented surveys were kept in the dataset, and all personal information and internal survey variables were dropped from the clean dataset.
• Roster data is separated from the main data set and kept in long-form but can be merged on the key variable (key can also be used to merge with the raw data).
• The variables were arranged in the same order as the paper instrument, with observations arranged according to their submission time.
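A minimal pandas sketch of the multi-select expansion mentioned in the list above; the column name and response options are hypothetical.

```python
# Hedged sketch: expand a space-separated multi-select variable into binary
# indicator columns. "income_sources" and its options are hypothetical.
import pandas as pd

df = pd.DataFrame({"income_sources": ["wages remittances", "aid", "wages aid", ""]})
dummies = df["income_sources"].str.get_dummies(sep=" ").add_prefix("income_sources_")
df = pd.concat([df, dummies], axis=1)
print(df)
```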
Backcheck data review: Results of the backcheck survey are compared against the originally captured survey results using the bcstats command in Stata. This function delivers a comparison of variables and identifies any discrepancies. Any discrepancies identified are then examined individually to determine if they are within reason.
The following data quality checks were completed:
• Daily SurveyCTO monitoring: This included outlier checks, skipped questions, a review of “Other, specify”, other text responses, and enumerator comments. Enumerator comments were used to suggest new response options or to highlight situations where existing options should be used instead. Monitoring also included a review of variable relationship logic checks and checks of the logic of answers. Finally, outliers in phone variables such as survey duration or the percentage of time audio was at a conversational level were monitored. A survey duration of close to 15 minutes and a conversation-level audio percentage of around 40% was considered normal.
• Dashboard review: This included monitoring individual enumerator performance, such as the number of calls logged, duration of calls, percentage of calls responded to, and percentage of non-consents. Non-consent reason rates and attempts per household were monitored as well. Duration analysis using R was used to monitor each module's duration and estimate the time required for subsequent rounds. The dashboard was also used to track overall survey completion and preview the results of key questions.
• Daily Data Team reporting: The Field Supervisors and the Data Manager reported daily feedback on call progress, enumerator feedback on the survey, and any suggestions to improve the instrument, such as adding options to multiple choice questions or adjusting translations.
• Audio audits: Audio recordings were captured during the consent portion of the interview for all completed interviews, for the enumerators' side of the conversation only. The recordings were reviewed for any surveys flagged by enumerators as having data quality concerns and for an additional random sample of 2% of respondents. A range of lengths were selected to observe edge cases. Most consent readings took around one minute, with some longer recordings due to questions on the survey or holding for the respondent. All reviewed audio recordings were completed satisfactorily.
• Back-check survey: Field Supervisors made back-check calls to a random sample of 5% of the households that completed a survey in Round 1. Field Supervisors called these households and administered a short survey, including (i) identifying the same respondent; (ii) determining the respondent's position within the household; (iii) confirming that a member of the data collection team had completed the interview; and (iv) a few questions from the original survey.
https://www.law.cornell.edu/uscode/text/17/106
Multiple-attempt items are an innovative item type that remains under-studied in psychometrics and educational measurement. This dissertation advances the field by (a) extending sequential item-response theory for multiple-choice, multiple-attempt items (SIRT-MM), (b) designing computerized adaptive testing that incorporates multiple-attempt items, and (c) clarifying and detecting differential item functioning for such items.
Chapter 2 introduces two extensions of the SIRT-MM model. The first permits the slope of each item-category response function to vary, while the second freely estimates a pseudo-guessing parameter to capture different success rates due to guessing. These models allow a wider range of response-function shapes and are more likely to fit empirical data. Model-selection strategies and parameter estimation methods for the new formulations are also proposed and evaluated.
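A schematic of the attempt-level success function such extensions imply is given below, assuming a logistic form with attempt-specific slope, difficulty, and pseudo-guessing parameters; the dissertation's actual SIRT-MM parameterization may differ.

```latex
% Schematic only: item j, attempt k, examinee ability theta; a_{jk}, b_{jk},
% and c_{jk} are the attempt-specific slope, difficulty, and pseudo-guessing
% parameters suggested by the two extensions described above.
\[
  P_{jk}(\theta)
  = \Pr\bigl(\text{correct at attempt } k \mid \text{incorrect at attempts } 1,\dots,k-1,\ \theta\bigr)
  = c_{jk} + \frac{1 - c_{jk}}{1 + \exp\{-a_{jk}(\theta - b_{jk})\}}.
\]
```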
Chapter 3 explores the integration of multiple-choice, multiple-attempt test items within the Computerized Adaptive Testing (CAT) framework, referred to as MM-CAT. Using the sequential item response theory model for multiple-choice, multiple-attempt items (Lu, Fowler, & Cheng, 2025), a simulation study was conducted to investigate the effectiveness of an MM-CAT design in improving ability estimation accuracy compared to traditional CAT, which relies on single-attempt, dichotomously scored items. Results show that MM-CAT substantially reduces the standard error of measurement (SEM), bias, and root mean square error (RMSE), particularly for examinees with lower ability levels. Furthermore, we examine the impact of item exposure control procedures and find that while both the Sympson-and-Hetter method (SH; Sympson & Hetter, 1985) and the Randomesque method (Kingsbury & Zara, 1989) are useful, the SH method is particularly effective in exposure control when paired with MM-CAT, minimizing the severity of item over-exposure without sacrificing measurement precision. Taken together, these findings suggest that MM-CAT is a promising approach for enhancing the precision and fairness of adaptive testing, especially in educational contexts where multiple attempts may support both assessment and learning.
While multiple-attempt procedures and items have been widely studied, limited research has addressed Differential Item Functioning (DIF) in the context of multiple-attempt items. Chapter 4 formalizes the concept of attempt-level DIF, which captures attempt-specific mechanisms underlying DIF. We present example scenarios to illustrate how attempt-level DIF can arise and propose several detection methods capable of identifying it. Simulation results demonstrate that these methods yield higher true positive rates (i.e., greater power) compared to traditional DIF detection approaches. Their advantage is particularly evident when the sample size and variance of item responses are reduced in the specific attempt where DIF exists.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
This dataset was generated as part of a multi-wave field study and an online experiment designed to examine how individuals' green consumption behavior relates to workplace green behavior.

In Study 1, we employed a two-wave time-lagged field survey to enhance external validity. Participants were recruited using a snowball sampling method through the authors’ alumni and professional social networks across multiple regions in China. At Time 1, participants completed a questionnaire assessing their green consumption behavior, warm glow, environmental concern, and demographic variables. A total of 408 employees submitted valid responses. Two weeks later, the same participants were invited to complete a Time 2 survey measuring organizational belongingness and employee green behavior. This second wave yielded 375 valid responses, resulting in a high response rate of 91.91%. All participants provided informed consent and were assured of the confidentiality and academic purpose of the study. The required sample size was estimated using G*Power (version 3.1), which indicated that 146 participants were needed to detect an effect size of f = 0.25 with a power of 0.85 and α = 0.05.

To cross-validate the findings, Study 2 involved an online experiment with full-time employees recruited via Credamo, a professional data collection platform in China. A total of 150 valid responses were collected. Participation was limited to users with a credit score above 80 (out of 100), ensuring high response quality. Several attention-check questions were embedded to exclude inattentive respondents. Demographic information showed that 49.3% of participants were female, 40.0% were under 30 years old, and 80.0% held at least a bachelor's degree.

All data were collected using structured self-report questionnaires programmed and delivered via professional survey platforms. Data preprocessing involved screening for incomplete entries, removing invalid responses based on attention checks, and anonymizing any identifying information. The final dataset includes complete responses only; thus, no missing data are present in the shared dataset.

The dataset is structured in tabular format (.xlsx and .csv) and includes the following variables:
Participant_ID: anonymized code for each respondent
Green_Consumption: participants’ self-rated green consumption behavior (Likert scale, unitless)
Warm_Glow, Environmental_Concern, Organizational_Belongingness, Green_Behavior: psychological constructs measured with validated multi-item scales, all using 7-point Likert scales
Gender, Age_Group, Education, Industry, and other demographic fields

Each row represents a single participant’s responses, and each column represents a questionnaire item or demographic variable. Measurement units are not applicable for Likert-type responses. There are no known systematic errors in the dataset. Minor self-reporting bias may exist due to the nature of subjective questionnaires, but multiple measures were taken to reduce response bias, including anonymity and attention checks. The dataset does not contain any spatial information, and while time-lagged, the data are cross-sectional in structure with two points of measurement (Time 1 and Time 2) spaced approximately two weeks apart.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The continued expansion of the digital age, driven by technological innovations, has fundamentally transformed the landscape of modern higher education services, leading to discussions about evaluation techniques. The emergence of generative artificial intelligence (AI) applications has raised questions about the reliability and academic honesty of multiple-choice question assessments within the realm of online education. In this context, this study investigates the effects of multiple-answer questions (MAQs) versus traditional single-answer questions (SAQs) within online higher education assessments. We conducted a mixed-methods study with quantitative field experiments and qualitative interviews with students enrolled in an Online Marketing MSc programme. Students were divided randomly and assessed using different structures: SAQs or MAQs. We evaluated variables such as grade averages, study times, perceived workload, and difficulty using independent sample t-tests. The effect of independent variables on coursework grades was explored via regression analysis. We found that, although grades were lower and MAQs were perceived as more difficult, study times and perceived workload showed no significant differences between formats. These findings suggest that MAQs—despite their challenge—can promote deeper understanding and greater learning retention. Furthermore, even with the higher perceived difficulty and impact on performance, MAQs hold potential for dealing with AI-related academic integrity concerns. However, careful implementation and further research are recommended to explore best practices in integrating MAQs into online assessment design.