100+ datasets found
  1. QADO: An RDF Representation of Question Answering Datasets and their...

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    zip
    Updated May 31, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Andreas Both; Oliver Schmidtke; Aleksandr Perevalov (2023). QADO: An RDF Representation of Question Answering Datasets and their Analyses for Improving Reproducibility [Dataset]. http://doi.org/10.6084/m9.figshare.21750029.v3
    Explore at:
    zipAvailable download formats
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figsharehttp://figshare.com/
    Authors
    Andreas Both; Oliver Schmidtke; Aleksandr Perevalov
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Measuring the quality of Question Answering (QA) systems is a crucial task to validate the results of novel approaches. However, there are already indicators of a reproducibility crisis as many published systems have used outdated datasets or use subsets of QA benchmarks, making it hard to compare results. We identified the following core problems: there is no standard data format, instead, proprietary data representations are used by the different partly inconsistent datasets; additionally, the characteristics of datasets are typically not reflected by the dataset maintainers nor by the system publishers. To overcome these problems, we established an ontology---Question Answering Dataset Ontology (QADO)---for representing the QA datasets in RDF. The following datasets were mapped into the ontology: the QALD series, LC-QuAD series, RuBQ series, ComplexWebQuestions, and Mintaka. Hence, the integrated data in QADO covers widely used datasets and multilinguality. Additionally, we did intensive analyses of the datasets to identify their characteristics to make it easier for researchers to identify specific research questions and to select well-defined subsets. The provided resource will enable the research community to improve the quality of their research and support the reproducibility of experiments.

    Here, the mapping results of the QADO process, the SPARQL queries for data analytics, and the archived analytics results file are provided.

    Up-to-date statistics can be created automatically by the script provided at the corresponding QADO GitHub RDFizer repository.

  2. R Questions from Stack Overflow

    • kaggle.com
    zip
    Updated Sep 26, 2017
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Stack Overflow (2017). R Questions from Stack Overflow [Dataset]. https://www.kaggle.com/stackoverflow/rquestions
    Explore at:
    zip(183212751 bytes)Available download formats
    Dataset updated
    Sep 26, 2017
    Dataset authored and provided by
    Stack Overflowhttp://stackoverflow.com/
    Description

    Full text of questions and answers from Stack Overflow that are tagged with the r tag, useful for natural language processing and community analysis.

    This is organized as three tables:

    • Questions contains the title, body, creation date, score, and owner ID for each R question.
    • Answers contains the body, creation date, score, and owner ID for each of the answers to these questions. The ParentId column links back to the Questions table.
    • Tags contains the tags on each question besides the R tag.

    For space reasons only non-deleted and non-closed content are included in the dataset. The dataset contains questions up to 24 September 2017 (UTC).

    License

    All Stack Overflow user contributions are licensed under CC-BY-SA 3.0 with attribution required.

  3. e

    Introduction to Statistics & Probability

    • paper.erudition.co.in
    html
    Updated Feb 6, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Einetic (2022). Introduction to Statistics & Probability [Dataset]. https://paper.erudition.co.in/makaut/master-of-computer-applications-2-years/2/numerical-and-statistical-analysis
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Feb 6, 2022
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of chapter Introduction to Statistics & Probability of Numerical and Statistical Analysis, 2nd Semester , Master of Computer Applications (2 Years)

  4. e

    Data from: Numerical solution of Linear equations

    • paper.erudition.co.in
    html
    Updated Feb 6, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Einetic (2022). Numerical solution of Linear equations [Dataset]. https://paper.erudition.co.in/makaut/master-of-computer-applications-2-years/2/numerical-and-statistical-analysis
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Feb 6, 2022
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of chapter Numerical solution of Linear equations of Numerical and Statistical Analysis, 2nd Semester , Master of Computer Applications (2 Years)

  5. r

    ROUNDING, FOCAL POINT ANSWERS AND NONRESPONSE TO SUBJECTIVE PROBABILITY...

    • resodate.org
    Updated Oct 6, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Kristin J. Kleinjans (2025). ROUNDING, FOCAL POINT ANSWERS AND NONRESPONSE TO SUBJECTIVE PROBABILITY QUESTIONS (replication data) [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9qb3VybmFsZGF0YS56YncuZXUvZGF0YXNldC9yb3VuZGluZy1mb2NhbC1wb2ludC1hbnN3ZXJzLWFuZC1ub25yZXNwb25zZS10by1zdWJqZWN0aXZlLXByb2JhYmlsaXR5LXF1ZXN0aW9ucw==
    Explore at:
    Dataset updated
    Oct 6, 2025
    Dataset provided by
    Journal of Applied Econometrics
    ZBW
    ZBW Journal Data Archive
    Authors
    Kristin J. Kleinjans
    Description

    We develop a panel data model explaining answers to subjective probabilities about binary events and estimate it using data from the Health and Retirement Study on six such probabilities. The model explicitly accounts for several forms of reporting behavior: rounding, focal point 50% answers and item nonresponse. We find observed and unobserved heterogeneity in the tendencies to report rounded values or a focal answer, explaining persistency in 50% answers over time. Focal 50% answers matter for some of the probabilities. Incorporating reporting behavior does not have a large effect on the estimated distribution of the genuine subjective probabilities.

  6. Questions from Cross Validated Stack Exchange

    • kaggle.com
    zip
    Updated Oct 8, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Stack Overflow (2019). Questions from Cross Validated Stack Exchange [Dataset]. https://www.kaggle.com/stackoverflow/statsquestions
    Explore at:
    zip(165628099 bytes)Available download formats
    Dataset updated
    Oct 8, 2019
    Dataset authored and provided by
    Stack Overflowhttp://stackoverflow.com/
    Description

    Full text of questions and answers from Cross Validated, the statistics and machine learning Q&A site from the Stack Exchange network.

    This is organized as three tables:

    • Questions contains the title, body, creation date, score, and owner ID for each question.
    • Answers contains the body, creation date, score, and owner ID for each of the answers to these questions. The ParentId column links back to the Questions table.
    • Tags contains the tags on each question

    For space reasons only non-deleted and non-closed content are included in the dataset. The dataset contains questions up to 19 October 2016 (UTC).

    License

    All Stack Exchange user contributions are licensed under CC-BY-SA 3.0 with attribution required.

  7. e

    Data from: Numerical solution of ordinary differential equation

    • paper.erudition.co.in
    html
    Updated Feb 6, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Einetic (2022). Numerical solution of ordinary differential equation [Dataset]. https://paper.erudition.co.in/makaut/master-of-computer-applications-2-years/2/numerical-and-statistical-analysis
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Feb 6, 2022
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of chapter Numerical solution of ordinary differential equation of Numerical and Statistical Analysis, 2nd Semester , Master of Computer Applications (2 Years)

  8. Data from: Basic statistical considerations for physiology: The journal...

    • tandf.figshare.com
    txt
    Updated May 31, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Aaron R. Caldwell; Samuel N. Cheuvront (2023). Basic statistical considerations for physiology: The journal Temperature toolbox [Dataset]. http://doi.org/10.6084/m9.figshare.8320151.v2
    Explore at:
    txtAvailable download formats
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francishttps://taylorandfrancis.com/
    Authors
    Aaron R. Caldwell; Samuel N. Cheuvront
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The average environmental and occupational physiologist may find statistics are difficult to interpret and use since their formal training in statistics is limited. Unfortunately, poor statistical practices can generate erroneous or at least misleading results and distorts the evidence in the scientific literature. These problems are exacerbated when statistics are used as thoughtless ritual that is performed after the data are collected. The situation is worsened when statistics are then treated as strict judgements about the data (i.e., significant versus non-significant) without a thought given to how these statistics were calculated or their practical meaning. We propose that researchers should consider statistics at every step of the research process whether that be the designing of experiments, collecting data, analysing the data or disseminating the results. When statistics are considered as an integral part of the research process, from start to finish, several problematic practices can be mitigated. Further, proper practices in disseminating the results of a study can greatly improve the quality of the literature. Within this review, we have included a number of reminders and statistical questions researchers should answer throughout the scientific process. Rather than treat statistics as a strict rule following procedure we hope that readers will use this review to stimulate a discussion around their current practices and attempt to improve them. The code to reproduce all analyses and figures within the manuscript can be found at https://doi.org/10.17605/OSF.IO/BQGDH.

  9. e

    Statistics

    • paper.erudition.co.in
    html
    Updated Jun 9, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Einetic (2020). Statistics [Dataset]. https://paper.erudition.co.in/competitive-exams/gate/question-paper
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Jun 9, 2020
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms

    Description

    Get Exam Question Paper Solutions of Statistics and many more.

  10. u

    Amazon Question and Answer Data

    • cseweb.ucsd.edu
    json
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    UCSD CSE Research Project, Amazon Question and Answer Data [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
    Explore at:
    jsonAvailable download formats
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    These datasets contain 1.48 million question and answer pairs about products from Amazon.

    Metadata includes

    • question and answer text

    • is the question binary (yes/no), and if so does it have a yes/no answer?

    • timestamps

    • product ID (to reference the review dataset)

    Basic Statistics:

    • Questions: 1.48 million

    • Answers: 4,019,744

    • Labeled yes/no questions: 309,419

    • Number of unique products with questions: 191,185

  11. e

    Statistics (ST), Question Paper, Graduate Aptitude Test in Engineering,...

    • paper.erudition.co.in
    html
    Updated Jun 10, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Einetic (2020). Statistics (ST), Question Paper, Graduate Aptitude Test in Engineering, Competitive Exams | Erudition Paper [Dataset]. https://paper.erudition.co.in/competitive-exams/gate/question-paper/statistics
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Jun 10, 2020
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/termshttps://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of Statistics (ST),Question Paper,Graduate Aptitude Test in Engineering,Competitive Exams

  12. f

    Aggregate statistics of the answer rate of questions in each site.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    • +1more
    Updated Dec 31, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Aranovich, Raul; Filkov, Vladimir; Wu, Muting (2021). Aggregate statistics of the answer rate of questions in each site. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000923770
    Explore at:
    Dataset updated
    Dec 31, 2021
    Authors
    Aranovich, Raul; Filkov, Vladimir; Wu, Muting
    Description

    Aggregate statistics of the answer rate of questions in each site.

  13. R

    Question Answers Label Dataset

    • universe.roboflow.com
    zip
    Updated Nov 30, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Question Answer Labelling (2022). Question Answers Label Dataset [Dataset]. https://universe.roboflow.com/question-answer-labelling/question-answers-label/dataset/1
    Explore at:
    zipAvailable download formats
    Dataset updated
    Nov 30, 2022
    Dataset authored and provided by
    Question Answer Labelling
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Objects Bounding Boxes
    Description

    Here are a few use cases for this project:

    1. Digital Document Management: This model can be used to effectively organize and manage digital documents. By identifying areas such as headers, addresses, and vendors, it could streamline workflows in companies dealing with large amounts of papers, forms or invoices.

    2. Automated Data Extraction: The model could be used in extracting pertinent information from documents automatically. For example, pulling out questions and answers from educational materials, extracting vendor or address information from invoices, or grabbing column headers from statistical reports.

    3. Augmented Reality (AR) Applications: "Question Answers Label" can be utilized in AR glasses to give real-time information about objects a user sees, especially in the realm of paper documents.

    4. Virtual Assistance: This model may be used to build a virtual assistant capable of reading and understanding physical documents. For instance, reading out a user's mail, helping learning from textbooks, or assisting in reviewing legal documents.

    5. Accessibility Tools for Visually Impaired: The tool could be utilized to interpret written documents for visually impaired people by identifying and vocalizing text based on their classes (answers, questions, headers, etc).

  14. Ten quick tips for getting the most scientific value out of numerical data

    • plos.figshare.com
    pdf
    Updated May 30, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Lars Ole Schwen; Sabrina Rueschenbaum (2023). Ten quick tips for getting the most scientific value out of numerical data [Dataset]. http://doi.org/10.1371/journal.pcbi.1006141
    Explore at:
    pdfAvailable download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOShttp://plos.org/
    Authors
    Lars Ole Schwen; Sabrina Rueschenbaum
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Most studies in the life sciences and other disciplines involve generating and analyzing numerical data of some type as the foundation for scientific findings. Working with numerical data involves multiple challenges. These include reproducible data acquisition, appropriate data storage, computationally correct data analysis, appropriate reporting and presentation of the results, and suitable data interpretation.Finding and correcting mistakes when analyzing and interpreting data can be frustrating and time-consuming. Presenting or publishing incorrect results is embarrassing but not uncommon. Particular sources of errors are inappropriate use of statistical methods and incorrect interpretation of data by software. To detect mistakes as early as possible, one should frequently check intermediate and final results for plausibility. Clearly documenting how quantities and results were obtained facilitates correcting mistakes. Properly understanding data is indispensable for reaching well-founded conclusions from experimental results. Units are needed to make sense of numbers, and uncertainty should be estimated to know how meaningful results are. Descriptive statistics and significance testing are useful tools for interpreting numerical results if applied correctly. However, blindly trusting in computed numbers can also be misleading, so it is worth thinking about how data should be summarized quantitatively to properly answer the question at hand. Finally, a suitable form of presentation is needed so that the data can properly support the interpretation and findings. By additionally sharing the relevant data, others can access, understand, and ultimately make use of the results.These quick tips are intended to provide guidelines for correctly interpreting, efficiently analyzing, and presenting numerical data in a useful way.

  15. i

    Household Health Survey 2012-2013, Economic Research Forum (ERF)...

    • catalog.ihsn.org
    • datacatalog.ihsn.org
    Updated Jun 26, 2017
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Central Statistical Organization (CSO) (2017). Household Health Survey 2012-2013, Economic Research Forum (ERF) Harmonization Data - Iraq [Dataset]. https://catalog.ihsn.org/index.php/catalog/6937
    Explore at:
    Dataset updated
    Jun 26, 2017
    Dataset provided by
    Central Statistical Organization (CSO)
    Kurdistan Regional Statistics Office (KRSO)
    Economic Research Forum
    Time period covered
    2012 - 2013
    Area covered
    Iraq
    Description

    Abstract

    The harmonized data set on health, created and published by the ERF, is a subset of Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules, collected in the context of the above mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.

    ----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:

    Iraq is considered a leader in household expenditure and income surveys where the first was conducted in 1946 followed by surveys in 1954 and 1961. After the establishment of Central Statistical Organization, household expenditure and income surveys were carried out every 3-5 years in (1971/ 1972, 1976, 1979, 1984/ 1985, 1988, 1993, 2002 / 2007). Implementing the cooperation between CSO and WB, Central Statistical Organization (CSO) and Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year covering all governorates including those in Kurdistan Region.

    The survey has six main objectives. These objectives are:

    1. Provide data for poverty analysis and measurement and monitor, evaluate and update the implementation Poverty Reduction National Strategy issued in 2009.
    2. Provide comprehensive data system to assess household social and economic conditions and prepare the indicators related to the human development.
    3. Provide data that meet the needs and requirements of national accounts.
    4. Provide detailed indicators on consumption expenditure that serve making decision related to production, consumption, export and import.
    5. Provide detailed indicators on the sources of households and individuals income.
    6. Provide data necessary for formulation of a new consumer price index number.

    The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum, to create a comparable version with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf provided in the external resources for further information on the mapping of the original variables on the harmonized ones, in addition to more indications on the variables' availability in both survey years and relevant comments.

    Geographic coverage

    National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.

    Analysis unit

    1- Household/family. 2- Individual/person.

    Universe

    The survey was carried out over a full year covering all governorates including those in Kurdistan Region.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    ----> Design:

    Sample size was (25488) household for the whole Iraq, 216 households for each district of 118 districts, 2832 clusters each of which includes 9 households distributed on districts and governorates for rural and urban.

    ----> Sample frame:

    Listing and numbering results of 2009-2010 Population and Housing Survey were adopted in all the governorates including Kurdistan Region as a frame to select households, the sample was selected in two stages: Stage 1: Primary sampling unit (blocks) within each stratum (district) for urban and rural were systematically selected with probability proportional to size to reach 2832 units (cluster). Stage two: 9 households from each primary sampling unit were selected to create a cluster, thus the sample size of total survey clusters was 25488 households distributed on the governorates, 216 households in each district.

    ----> Sampling Stages:

    In each district, the sample was selected in two stages: Stage 1: based on 2010 listing and numbering frame 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to the implicit breakdown urban and rural and geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: Using households as secondary sampling units, 9 households were selected from each sample point using systematic equal probability sampling. Sampling frames of each stages can be developed based on 2010 building listing and numbering without updating household lists. In some small districts, random selection processes of primary sampling may lead to select less than 24 units therefore a sampling unit is selected more than once , the selection may reach two cluster or more from the same enumeration unit when it is necessary.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    ----> Preparation:

    The questionnaire of 2006 survey was adopted in designing the questionnaire of 2012 survey on which many revisions were made. Two rounds of pre-test were carried out. Revision were made based on the feedback of field work team, World Bank consultants and others, other revisions were made before final version was implemented in a pilot survey in September 2011. After the pilot survey implemented, other revisions were made in based on the challenges and feedbacks emerged during the implementation to implement the final version in the actual survey.

    ----> Questionnaire Parts:

    The questionnaire consists of four parts each with several sections: Part 1: Socio – Economic Data: - Section 1: Household Roster - Section 2: Emigration - Section 3: Food Rations - Section 4: housing - Section 5: education - Section 6: health - Section 7: Physical measurements - Section 8: job seeking and previous job

    Part 2: Monthly, Quarterly and Annual Expenditures: - Section 9: Expenditures on Non – Food Commodities and Services (past 30 days). - Section 10 : Expenditures on Non – Food Commodities and Services (past 90 days). - Section 11: Expenditures on Non – Food Commodities and Services (past 12 months). - Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days). - Section 12, Table 1: Meals Had Within the Residential Unit. - Section 12, table 2: Number of Persons Participate in the Meals within Household Expenditure Other Than its Members.

    Part 3: Income and Other Data: - Section 13: Job - Section 14: paid jobs - Section 15: Agriculture, forestry and fishing - Section 16: Household non – agricultural projects - Section 17: Income from ownership and transfers - Section 18: Durable goods - Section 19: Loans, advances and subsidies - Section 20: Shocks and strategy of dealing in the households - Section 21: Time use - Section 22: Justice - Section 23: Satisfaction in life - Section 24: Food consumption during past 7 days

    Part 4: Diary of Daily Expenditures: Diary of expenditure is an essential component of this survey. It is left at the household to record all the daily purchases such as expenditures on food and frequent non-food items such as gasoline, newspapers…etc. during 7 days. Two pages were allocated for recording the expenditures of each day, thus the roster will be consists of 14 pages.

    Cleaning operations

    ----> Raw Data:

    Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages: 1. Interviewer: Checks all answers on the household questionnaire, confirming that they are clear and correct. 2. Local Supervisor: Checks to make sure that questions has been correctly completed. 3. Statistical analysis: After exporting data files from excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or non-logical values in addition to auditing some variables. 4. World Bank consultants in coordination with the CSO data management team: the World Bank technical consultants use additional programs in SPSS and STAT to examine and correct remaining inconsistencies within the data files. The software detects errors by analyzing questionnaire items according to the expected parameter for each variable.

    ----> Harmonized Data:

    • The SPSS package is used to harmonize the Iraq Household Socio Economic Survey (IHSES) 2007 with Iraq Household Socio Economic Survey (IHSES) 2012.
    • The harmonization process starts with raw data files received from the Statistical Office.
    • A program is generated for each dataset to create harmonized variables.
    • Data is saved on the household and individual level, in SPSS and then converted to STATA, to be disseminated.

    Response rate

    Iraq Household Socio Economic Survey (IHSES) reached a total of 25488 households. Number of households refused to response was 305, response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%) while the lowest rates were in Sulaimaniya (92%).

  16. Trusted place to receive answers on faith questions among Iraqi Millennials...

    • statista.com
    Updated Oct 9, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Trusted place to receive answers on faith questions among Iraqi Millennials 2017 [Dataset]. https://www.statista.com/statistics/752829/iraq-place-to-get-faith-question-answered-to-millennials/
    Explore at:
    Dataset updated
    Oct 9, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Time period covered
    Feb 5, 2017 - Mar 1, 2017
    Area covered
    Iraq
    Description

    This statistic represents the trusted place to receive answers on questions of faith among Iraqi Millennials as of 2017. During the survey, ** percent of Iraqi Millennials stated that they would go to their local mosque Imam for answers on their questions of faith.

  17. Summary statistics for the three arms of Experiment 2.

    • figshare.com
    xls
    Updated Jun 4, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Thomas C. McAndrew; Elizaveta A. Guseva; James P. Bagrow (2023). Summary statistics for the three arms of Experiment 2. [Dataset]. http://doi.org/10.1371/journal.pone.0182662.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    PLOShttp://plos.org/
    Authors
    Thomas C. McAndrew; Elizaveta A. Guseva; James P. Bagrow
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Both Binomial and Thompson sampling are more efficient than Random sampling (lower 〈A〉) without losing the crowd’s average consensus on answers, measured by 〈S〉 and 〈d〉.

  18. q

    Answer Checking - Modified for Environmental Data Analysis/Statistics Course...

    • qubeshub.org
    Updated Jun 6, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jennifer Prairie (2019). Answer Checking - Modified for Environmental Data Analysis/Statistics Course [Dataset]. http://doi.org/10.25334/Q4NJ14
    Explore at:
    Dataset updated
    Jun 6, 2019
    Dataset provided by
    QUBES
    Authors
    Jennifer Prairie
    Description

    Modified answer checking worksheet from original BIOMAAP resource to include two additional questions relating to probability/P-values.

  19. H

    Survey of Consumer Finances (SCF)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Anthony Damico (2013). Survey of Consumer Finances (SCF) [Dataset]. http://doi.org/10.7910/DVN/FRMKMF
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    analyze the survey of consumer finances (scf) with r the survey of consumer finances (scf) tracks the wealth of american families. every three years, more than five thousand households answer a battery of questions about income, net worth, credit card debt, pensions, mortgages, even the lease on their cars. plenty of surveys collect annual income, only the survey of consumer finances captures such detailed asset data. responses are at the primary economic unit-level (peu) - the economically dominant, financially interdependent family members within a sampled household. norc at the university of chicago administers the data collection, but the board of governors of the federal reserve pay the bills and therefore call the shots. if you were so brazen as to open up the microdata and run a simple weighted median, you'd get the wrong answer. the five to six thousand respondents actually gobble up twenty-five to thirty thousand records in the final pub lic use files. why oh why? well, those tables contain not one, not two, but five records for each peu. wherever missing, these data are multiply-imputed, meaning answers to the same question for the same household might vary across implicates. each analysis must account for all that, lest your confidence intervals be too tight. to calculate the correct statistics, you'll need to break the single file into five, necessarily complicating your life. this can be accomplished with the meanit sas macro buried in the 2004 scf codebook (search for meanit - you'll need the sas iml add-on). or you might blow the dust off this website referred to in the 2010 codebook as the home of an alternative multiple imputation technique, but all i found were broken links. perhaps it's time for plan c, and by c, i mean free. read the imputation section of the latest codebook (search for imputation), then give these scripts a whirl. they've got that new r smell. the lion's share of the respondents in the survey of consumer finances get drawn from a pretty standard sample of american dwellings - no nursing homes, no active-duty military. then there's this secondary sample of richer households to even out the statistical noise at the higher end of the i ncome and assets spectrum. you can read more if you like, but at the end of the day the weights just generalize to civilian, non-institutional american households. one last thing before you start your engine: read everything you always wanted to know about the scf. my favorite part of that title is the word always. this new github repository contains t hree scripts: 1989-2010 download all microdata.R initiate a function to download and import any survey of consumer finances zipped stata file (.dta) loop through each year specified by the user (starting at the 1989 re-vamp) to download the main, extract, and replicate weight files, then import each into r break the main file into five implicates (each containing one record per peu) and merge the appropriate extract data onto each implicate save the five implicates and replicate weights to an r data file (.rda) for rapid future loading 2010 analysis examples.R prepare two survey of consumer finances-flavored multiply-imputed survey analysis functions load the r data files (.rda) necessary to create a multiply-imputed, replicate-weighted survey design demonstrate how to access the properties of a multiply-imput ed survey design object cook up some descriptive statistics and export examples, calculated with scf-centric variance quirks run a quick t-test and regression, but only because you asked nicely replicate FRB SAS output.R reproduce each and every statistic pr ovided by the friendly folks at the federal reserve create a multiply-imputed, replicate-weighted survey design object re-reproduce (and yes, i said/meant what i meant/said) each of those statistics, now using the multiply-imputed survey design object to highlight the statistically-theoretically-irrelevant differences click here to view these three scripts for more detail about the survey of consumer finances (scf), visit: the federal reserve board of governors' survey of consumer finances homepage the latest scf chartbook, to browse what's possible. (spoiler alert: everything.) the survey of consumer finances wikipedia entry the official frequently asked questions notes: nationally-representative statistics on the financial health, wealth, and assets of american hous eholds might not be monopolized by the survey of consumer finances, but there isn't much competition aside from the assets topical module of the survey of income and program participation (sipp). on one hand, the scf interview questions contain more detail than sipp. on the other hand, scf's smaller sample precludes analyses of acute subpopulations. and for any three-handed martians in the audience, ther e's also a few biases between these two data sources that you ought to consider. the survey methodologists at the federal reserve take their job...

  20. Z

    Modelling and Automated Retrieval of Provenance Relationships (Metadata and...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 14, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Schneider, Thomas (2023). Modelling and Automated Retrieval of Provenance Relationships (Metadata and Statistics) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8036823
    Explore at:
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    University of Erfurt and Humboldt-Universität zu Berlin
    Authors
    Schneider, Thomas
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the data used in the master's thesis with the above title. It consists of a BibTeX file with the bibliographic metadata of the publications and websites cited throughout the thesis, and a Markdown file with statistics of the data sources discussed in Chapter 4.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Andreas Both; Oliver Schmidtke; Aleksandr Perevalov (2023). QADO: An RDF Representation of Question Answering Datasets and their Analyses for Improving Reproducibility [Dataset]. http://doi.org/10.6084/m9.figshare.21750029.v3
Organization logo

QADO: An RDF Representation of Question Answering Datasets and their Analyses for Improving Reproducibility

Explore at:
zipAvailable download formats
Dataset updated
May 31, 2023
Dataset provided by
Figsharehttp://figshare.com/
Authors
Andreas Both; Oliver Schmidtke; Aleksandr Perevalov
License

MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically

Description

Measuring the quality of Question Answering (QA) systems is a crucial task to validate the results of novel approaches. However, there are already indicators of a reproducibility crisis as many published systems have used outdated datasets or use subsets of QA benchmarks, making it hard to compare results. We identified the following core problems: there is no standard data format, instead, proprietary data representations are used by the different partly inconsistent datasets; additionally, the characteristics of datasets are typically not reflected by the dataset maintainers nor by the system publishers. To overcome these problems, we established an ontology---Question Answering Dataset Ontology (QADO)---for representing the QA datasets in RDF. The following datasets were mapped into the ontology: the QALD series, LC-QuAD series, RuBQ series, ComplexWebQuestions, and Mintaka. Hence, the integrated data in QADO covers widely used datasets and multilinguality. Additionally, we did intensive analyses of the datasets to identify their characteristics to make it easier for researchers to identify specific research questions and to select well-defined subsets. The provided resource will enable the research community to improve the quality of their research and support the reproducibility of experiments.

Here, the mapping results of the QADO process, the SPARQL queries for data analytics, and the archived analytics results file are provided.

Up-to-date statistics can be created automatically by the script provided at the corresponding QADO GitHub RDFizer repository.

Search
Clear search
Close search
Google apps
Main menu