100+ datasets found
  1. RACE Dataset

    • paperswithcode.com
    Updated Jan 27, 2022
    Cite
    Guokun Lai; Qizhe Xie; Hanxiao Liu; Yiming Yang; Eduard Hovy (2022). RACE Dataset [Dataset]. https://paperswithcode.com/dataset/race
    Dataset updated
    Jan 27, 2022
    Authors
    Guokun Lai; Qizhe Xie; Hanxiao Liu; Yiming Yang; Eduard Hovy
    Description

    RACE (the ReAding Comprehension dataset from Examinations) is a machine reading comprehension dataset consisting of 27,933 passages and 97,867 questions from English exams targeting Chinese students aged 12-18. RACE consists of two subsets, RACE-M and RACE-H, from middle school and high school exams, respectively. RACE-M has 28,293 questions and RACE-H has 69,574. Each question is associated with 4 candidate answers, one of which is correct. The data generation process of RACE differs from most machine reading comprehension datasets: instead of generating questions and answers by heuristics or crowd-sourcing, questions in RACE are specifically designed for testing human reading skills and are created by domain experts.

  2. TibetanQA: Tibetan Dataset for Machine Reading Comprehension

    • scidb.cn
    Updated Feb 11, 2022
    Cite
    Yuan Sun; Zhengcuo Dan; Sisi Liu; Xiaobing Zhao (2022). TibetanQA: Tibetan Dataset for Machine Reading Comprehension [Dataset]. http://doi.org/10.11922/sciencedb.j00001.00351
    Dataset updated
    Feb 11, 2022
    Dataset provided by
    Science Data Bank
    Authors
    Yuan Sun; Zhengcuo Dan; Sisi Liu; Xiaobing Zhao
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This paper constructs a dataset for Tibetan machine reading comprehension. The data come from the Yunzang website and cover 12 fields: nature, culture, education, geography, history, life, society, art, technology, people, science, and sports. The questions and answers of the dataset were manually entered and marked by 20 Tibetan professionals. The dataset contains 631 articles, 903 paragraphs, and 2,000 question-and-answer pairs constructed from the paragraphs. Data items mainly include article ID, title, paragraph, question, and answer. The publication of this dataset is of great value for promoting the development of Tibetan information processing.

  3. Replication data for: Improving Reading Comprehension, Science Domain...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Mar 6, 2024
    Cite
    Kim, James S.; Burkhauser, Mary; Mesite, Laura; Asher, Catherine A.; Relyea, Jackie Eunjung; Fitzgerald, Jill; Elmore, Jeff (2024). Replication data for: Improving Reading Comprehension, Science Domain Knowledge, and Reading Engagement through a First-Grade Content Literacy Intervention [Dataset]. http://doi.org/10.7910/DVN/RVJIMX
    Dataset updated
    Mar 6, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Kim, James S.; Burkhauser, Mary; Mesite, Laura; Asher, Catherine A.; Relyea, Jackie Eunjung; Fitzgerald, Jill; Elmore, Jeff
    Description

    This dataset contains replication materials for the Journal of Educational Psychology paper entitled "Improving Reading Comprehension, Science Domain Knowledge, and Reading Engagement through a First-Grade Content Literacy Intervention." Materials include the dataset and programs to replicate the analyses.

  4. Data from: Read Philippines

    • catalog.data.gov
    Updated Jun 25, 2024
    Cite
    data.usaid.gov (2024). Read Philippines [Dataset]. https://catalog.data.gov/dataset/read-philippines
    Dataset updated
    Jun 25, 2024
    Dataset provided by
    United States Agency for International Development http://usaid.gov/
    Area covered
    Philippines
    Description

    Read Philippines or Basa Pilipinas was a four-year early grade reading project that operated from January 2013 to December 2016 and supported the Philippine Department of Education’s national reading program. Basa assisted the implementation of transformative literacy practices in selected divisions of Regions 1 and 7 by providing teacher and student materials, training teachers and school heads, and providing post-training support for Grade 1, 2 and 3 teachers, as well as providing Early Language, Literacy and Numeracy training to kindergarten teachers. The Basa Pilipinas activity used a quasi-experimental cross-sectional design to evaluate the impact of the treatment in improving reading and comprehension skills. Sampling was conducted at three levels: school, classroom, and student. The school sample was drawn randomly from the activity’s five provinces. Within each school, one grade 2 classroom was selected randomly for baseline and midline, with an additional grade 3 classroom selected during the endline. Within each classroom, students were randomly selected to be administered the assessment. A total of 469 students were sampled from 40 schools in two provinces at the baseline (comparison), 1,216 students were sampled from 80 schools in five provinces at the midline (intervention 1), and 1,658 students were sampled from 5 provinces at the endline (intervention 2). The disparity in the number of provinces sampled is due to the expansion of the intervention from two provinces to five provinces starting at the midline to provide a more complete picture of the Basa outcomes. To enable the computation of estimates of literacy skills among students in all schools affected by the Basa intervention, design weights were applied to the analyses of EGRA data. Design weights were applied to compensate for differences in provincial sampling and to ensure an appropriate representation of learners in all provinces in the sample.

  5. Dataset of psychophysiological data from children with learning difficulties...

    • openneuro.org
    Updated May 29, 2025
    Cite
    César E. Corona-González; Claudia Rebeca De Stefano-Ramos; Juan Pablo Rosado-Aíza; David I. Ibarra-Zarate; Fabiola R. Gómez-Velázquez; Luz María Alonso-Valerdi (2025). Dataset of psychophysiological data from children with learning difficulties who strengthen reading and math skills through assistive technology [Dataset]. http://doi.org/10.18112/openneuro.ds006260.v1.0.1
    Dataset updated
    May 29, 2025
    Dataset provided by
    OpenNeuro https://openneuro.org/
    Authors
    César E. Corona-González; Claudia Rebeca De Stefano-Ramos; Juan Pablo Rosado-Aíza; David I. Ibarra-Zarate; Fabiola R. Gómez-Velázquez; Luz María Alonso-Valerdi
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    README

    Authors

    César E. Corona-González, Claudia Rebeca De Stefano-Ramos, Juan Pablo Rosado-Aíza, Fabiola R Gómez-Velázquez, David I. Ibarra-Zarate, Luz María Alonso-Valerdi

    Contact person

    César E. Corona-González

    https://orcid.org/0000-0002-7680-2953

    a00833959@tec.mx

    Project name

    Psychophysiological data from Mexican children with learning difficulties who strengthen reading and math skills by assistive technology

    Year that the project ran

    2023

    Brief overview of the tasks in the experiment

    The current dataset consists of psychometric and electrophysiological data from children with reading or math learning difficulties. These data were collected to evaluate improvements in reading or math skills resulting from using an online learning method called Smartick.

    The psychometric evaluations of children with reading difficulties encompassed: spelling tests, where 1) orthographic and 2) phonological errors were considered; 3) reading speed, expressed in words read per minute; and 4) reading comprehension, where multiple-choice questions were given to the children. The last two parameters were determined according to the standards of the Ministry of Public Education (Secretaría de Educación Pública in Spanish) in Mexico. The assessments for the math difficulties group comprised: 1) an assessment of general mathematical knowledge, 2) the percentage of hits, and 3) the reaction time in an arithmetic task. Additionally, selective attention and intelligence quotient (IQ) were evaluated.

    Then, individuals underwent an EEG experimental paradigm where two conditions were recorded: 1) a 3-minute eyes-open resting state and 2) performing either reading or mathematical activities. EEG recordings from the reading experiment consisted of reading a text aloud and then answering questions about the text. Alternatively, EEG recordings from the math experiment involved the solution of two blocks with 20 arithmetic operations (addition and subtraction). Subsequently, each child was randomly subcategorized as 1) the experimental group, who were asked to engage with Smartick for three months, and 2) the control group, who were not involved with the intervention. Once the 3-month period was over, every child was reassessed as described before.

    Description of the contents of the dataset

    The dataset contains a total of 76 subjects (sub-) from two study groups: 1) reading difficulties (R) and 2) math difficulties (M). Each individual was then assigned to either the experimental subgroup (e), in which children committed to engage with Smartick, or the control subgroup (c), in which they did not take part in any intervention.

    Every subject was followed for three months. During this period, each subject underwent two EEG sessions, representing the PRE-intervention (ses-1) and the POST-intervention (ses-2).

    The EEG recordings from the reading difficulties group cover a resting state condition (run-1) and active reading with reading comprehension activities (run-2). EEG data from the math difficulties group were collected in a resting state condition (run-1) and while solving two blocks of 20 arithmetic operations (run-2 and run-3). All EEG files are stored in .set format. The nomenclature used in the filenames is described below:

    Nomenclature    Description
    sub-            Subject
    M               Math group
    R               Reading group
    c               Control subgroup
    e               Experimental subgroup
    ses-1           PRE-intervention
    ses-2           POST-intervention
    run-1           EEG for baseline
    run-2           EEG for reading activity, or the first block of math
    run-3           EEG for the second block of math

    Example: the file sub-Rc11_ses-1_task-SmartickDataset_run-2_eeg.set corresponds to:
    • The 11th subject from the reading difficulties group, control subgroup (sub-Rc11).
    • The EEG recording from the PRE-intervention (ses-1) while performing the reading activity (run-2).
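
    The naming scheme above can be decoded programmatically. The following is a minimal sketch, not part of the dataset: the regular expression is inferred from the single example filename in this README, and the names FILENAME_RE, Recording, and parse_eeg_filename are hypothetical. Other files in the dataset may deviate from this pattern.

```python
import re
from typing import NamedTuple

# Pattern inferred from the one example filename in the README (an assumption);
# it is not an official specification of the dataset layout.
FILENAME_RE = re.compile(
    r"^sub-(?P<group>[MR])(?P<subgroup>[ce])(?P<number>\d+)"
    r"_ses-(?P<session>[12])"
    r"_task-(?P<task>[A-Za-z0-9]+)"
    r"_run-(?P<run>[123])"
    r"_eeg\.set$"
)

GROUPS = {"M": "math", "R": "reading"}
SUBGROUPS = {"c": "control", "e": "experimental"}
SESSIONS = {1: "PRE-intervention", 2: "POST-intervention"}


class Recording(NamedTuple):
    group: str
    subgroup: str
    subject_number: int
    session: str
    run: int


def parse_eeg_filename(name: str) -> Recording:
    """Decode one EEG filename into the fields of the nomenclature table."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected filename: {name!r}")
    return Recording(
        group=GROUPS[match["group"]],
        subgroup=SUBGROUPS[match["subgroup"]],
        subject_number=int(match["number"]),
        session=SESSIONS[int(match["session"])],
        run=int(match["run"]),
    )


print(parse_eeg_filename("sub-Rc11_ses-1_task-SmartickDataset_run-2_eeg.set"))
```

    The sketch does not check that run-3 occurs only in the math group; that constraint could be added once the actual file listing has been inspected.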

    Independent variables

    • Study groups:
      • Reading difficulties
        • Control: children did not follow any intervention
        • Experimental: Children used the reading program of Smartick for 3 months
      • Math difficulties
        • Control: children did not follow any intervention
        • Experimental: Children used the math program of Smartick for 3 months
    • Condition:
      • PRE-intervention: first psychological and electroencephalographic evaluation
      • POST-intervention: second psychological and electroencephalographic evaluation

    Dependent variables

    • Psychometric data from the reading difficulties group:

      • Orthographic_ERR: number of orthographic errors.
      • Phonological_ERR: number of phonological errors.
      • Selective_Attention: score from the selective attention test.
      • Reading_Speed: reading speed in words per minute.
      • Comprehension: score on a reading comprehension task.
      • GROUP: C for the control group, E for the experimental group.
      • GENDER: M for male, F for female.
      • AGE: age at the beginning of the study.
      • IQ: intelligence quotient.
    • Psychometric data from the math difficulties group:

      • WRAT4: score from the WRAT-4 test.
      • hits: hits during the EEG acquisition [%].
      • RT: reaction time during the EEG acquisition [s].
      • Selective_Attention: score from the selective attention test.
      • GROUP: C for the control group, E for the experimental group.
      • GENDER: M for male, F for female.
      • AGE: age at the beginning of the study.
      • IQ: intelligence quotient.

    Psychometric data can be found in the 01_Psychometric_Data.xlsx file.

    • Engagement percentage within Smartick (only for experimental group)
      • These values represent the engagement percentage through Smartick.
      • Students were asked to use the online learning method 5 days a week for 3 months.
      • Values greater than 100% denote participants who regularly logged in more than 5 days weekly.

    Engagement percentages can be found in the 05_SessionEngagement.xlsx file.
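
    The README does not state how the engagement percentage was computed. As an illustration only, the sketch below assumes it is the number of days a child logged in, divided by the number of days requested (5 per week). The function name, its parameters, and the 12-week figure used as a stand-in for "3 months" are all hypothetical.

```python
def engagement_percentage(days_logged: int, weeks: int, days_per_week: int = 5) -> float:
    """Return logged days as a percentage of the requested days.

    Assumed formula (not confirmed by the dataset authors):
    100 * days_logged / (weeks * days_per_week).
    """
    if weeks <= 0 or days_per_week <= 0:
        raise ValueError("weeks and days_per_week must be positive")
    expected_days = weeks * days_per_week
    return 100.0 * days_logged / expected_days


# A child who logged in 6 days a week for 12 weeks exceeds the requested 5 days.
print(engagement_percentage(days_logged=72, weeks=12))
```

    Under this assumption, any child who logs in on more than the requested 5 days per week ends up above 100%, which matches the note above.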

    Methods

    Subjects

    Seventy-six Mexican children between 7 and 13 years old were enrolled in this study.

    Information about the recruitment procedure

    The sample was recruited through non-profit foundations that support learning and foster care programs.

    Apparatus

    g.USBamp RESEARCH amplifier

    Initial setup

    1. Explain the task to the participant.
    2. Sign informed consent.
    3. Set up electrodes.

    Task details

    The stimuli nested folder contains all stimuli employed in the EEG experiments.

    Level 1
    • Math: Images used in the math experiment.
    • Reading: Images used in the reading experiment.

    Level 2
    • Math
      • POST_Operations: arithmetic operations from the POST-intervention.
      • PRE_Operations: arithmetic operations from the PRE-intervention.
    • Reading
      • POST_Reading1: text 1 and text-related comprehension questions from the POST-intervention.
      • POST_Reading2: text 2 and text-related comprehension questions from the POST-intervention.
      • POST_Reading3: text 3 and text-related comprehension questions from the POST-intervention.
      • PRE_Reading1: text 1 and text-related comprehension questions from the PRE-intervention.
      • PRE_Reading2: text 2 and text-related comprehension questions from the PRE-intervention.
      • PRE_Reading3: text 3 and text-related comprehension questions from the PRE-intervention.

    Level 3
    • Math
      • Operation01.jpg to Operation20.jpg: arithmetical operations solved during the first block of the math

  6. Data from: Content Counts and Motivation Matters: Reading Comprehension in...

    • openicpsr.org
    Updated Dec 20, 2020
    Cite
    HyeJin Hwang; Nell Duke (2020). Content Counts and Motivation Matters: Reading Comprehension in Third-Grade Students Who Are English Learners [Dataset]. http://doi.org/10.3886/E129401V1
    Dataset updated
    Dec 20, 2020
    Dataset provided by
    Florida State University
    University of Michigan
    Authors
    HyeJin Hwang; Nell Duke
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This study examined the role of science domain knowledge, reading motivation, and decoding skills in reading comprehension achievement in third-grade students who are English learners (ELs) and students who are monolingual, using a nationally representative data set. Multigroup probit regression analyses showed that third-grade science domain knowledge and motivation for reading, decoding skills, and early attainment of decoding skills were significantly associated with third-grade reading comprehension in both language groups. Also, using Wald chi-square tests, the study showed that the association between third-grade science domain knowledge and reading comprehension was stronger in students who were ELs than in students who were monolingual. These findings suggest that cultivating science domain knowledge is very important to supporting reading comprehension development in third grade, particularly for students who are ELs.

  7. Are School Entry Skills Predictive of Reading Comprehension across European...

    • osf.io
    url
    Updated Mar 27, 2024
    Cite
    Thomas Brüggemann; Ulrich Ludewig; Jakob Schwerter; Nele McElvany; Philipp Doebler (2024). Are School Entry Skills Predictive of Reading Comprehension across European Countries? [Dataset]. http://doi.org/10.17605/OSF.IO/4EVX2
    Dataset updated
    Mar 27, 2024
    Dataset provided by
    Center For Open Science
    Authors
    Thomas Brüggemann; Ulrich Ludewig; Jakob Schwerter; Nele McElvany; Philipp Doebler
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Transitioning from learning to read to reading to learn is a central aim of primary school education (Chall, 1983). However, while this aim is shared across borders, countries differ with regard to conditions facilitating or hindering this aim. These conditions encompass extracurricular learning environments and resources, socio-economic backgrounds, home languages, migratory backgrounds and the parents’ reading appreciation (e.g. El-Khechen et al., 2016; Kieffer, 2012; Kigel et al., 2015). Those determinants are also relevant in explaining students’ school entry reading skills, which, in turn, predict students’ reading literacy in primary school (Cameron et al., 2023; Claessens et al., 2009; Duncan et al., 2007). However, it is unclear at which point in a student’s life these effects are most impactful, which must be known in order to recommend interventions in a timely and effective manner. Therefore, in this study we attempt to investigate to what extent factors that are primarily time-invariant for the students affect both the students’ reading skills at school entry and their reading literacy in fourth grade. At the core, we investigate to what extent differences due to time-invariant variables affect students’ rate of learning to read throughout primary school, or if pre-school differences in reading competence accumulate during primary school. Put differently, we investigate if differences in reading competence start early and stay the same (accumulate) or if they start early and then widen (affecting the learning rate). Building upon Carroll’s (1963) concept of time-on-task within the model of school learning, we investigate the effects of students’ language at home (El-Khechen et al., 2016). Within this theory, language skills improve with the time invested in learning a language, such as when students are able to practice the language at home. Since students whose home language is different from the test language have less time-on-task in learning the test language, we assume that the test language affects the reading competence at school entry. Furthermore, because the home language persists throughout primary school and thus the time-on-task effects persist, we also assume that the home language affects the reading literacy in fourth grade. We consider the parents’ reading appreciation within a social learning theory (SLT; Bandura, 1977) context. SLT posits that students learn by observing and imitating the behavior of adults. Children of parents who highly appreciate reading are more likely to observe their parents while reading and attempt to imitate that behavior. Hence, parents’ reading appreciation is a persistent factor for students, both before and during primary school, and may thus affect both the reading competence at school entry and the reading literacy in fourth grade. In addition, we investigate the effects of household possessions (see Avvisati, 2020) as a persistent environmental factor under the framework of resource deprivation. Students from family backgrounds with few household possessions may lack cultural resources for a home learning environment that is conducive to learning to read before entering primary school and the economic resources to provide support to struggling learners during primary school (see Kieffer, 2012).
    We use structural equation modelling on secondary data from the Progress in International Reading Literacy Study (PIRLS) with N = 177,386 fourth grade students from 17 European countries, gathered in 2016 and 2021, to investigate whether effects differ between countries and if they are stable even in changing educational circumstances, such as the COVID-19 pandemic (see Gee et al., 2023; Werner & Woessmann, 2023).

  8. The Quest Dataset

    • kaggle.com
    Updated Nov 26, 2024
    Cite
    Jules King (2024). The Quest Dataset [Dataset]. https://www.kaggle.com/datasets/julesking/the-quest-dataset
    Dataset updated
    Nov 26, 2024
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    Jules King
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Learning Agency Lab’s data science competition, “The Quest for Quality Questions: Improving Reading Comprehension through Automated Question Generation,” was designed to build AI algorithms that can automatically generate questions that test young learners’ reading comprehension.

    As many educators and researchers know, questions are key in teaching and evaluating narrative comprehension skills in young learners. However, generating high-quality reading comprehension queries is time-consuming, which limits the number of texts that young readers can engage with in this way. Datasets can help by informing quality question automation.

    The Quest challenge dataset can be accessed on this page and was aided by foundational data from the Lab’s FairytaleQA dataset of 10,580 questions. Those queries were created to address gaps in similar datasets, which often overlooked fine reading skills that showcased an understanding of varying narrative elements.

    The Quest was made possible by The Learning Agency Lab, Mark Warschauer at UC Irvine, and Ying Xu at The University of Michigan School of Education. More can be found about the creators here.

    Quest dataset © 2024 by The Learning Agency Lab is licensed under CC BY 4.0. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/

    Competition - https://www.thequestchallenge.org/

    Publications - Xu, Y., Wang, D., Yu, M., Ritchie, D., Yao, B., Wu, T., ... & Warschauer, M. (2022). Fantastic Questions and Where to Find Them: FairytaleQA--An Authentic Dataset for Narrative Comprehension. arXiv preprint arXiv:2203.13947.

  9. Data for: The Contributions of Language Skills and Comprehension Monitoring...

    • narcis.nl
    • data.mendeley.com
    Updated Jan 31, 2021
    Cite
    Zhao, A (via Mendeley Data) (2021). Data for: The Contributions of Language Skills and Comprehension Monitoring to Chinese Reading Comprehension: A Longitudinal Investigation [Dataset]. http://doi.org/10.17632/62mfct3gc8.1
    Dataset updated
    Jan 31, 2021
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Zhao, A (via Mendeley Data)
    Description

    Longitudinal data on the language and cognitive skills of Chinese children followed from Grade 1 to Grade 3.

  10. Learning Poverty Global Database

    • data360.worldbank.org
    Updated Apr 18, 2025
    Cite
    (2025). Learning Poverty Global Database [Dataset]. https://data360.worldbank.org/en/dataset/WB_LPGD
    Dataset updated
    Apr 18, 2025
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2001 - 2023
    Description

    Will all children be able to read by 2030? The ability to read with comprehension is a foundational skill that every education system around the world strives to impart by late in primary school—generally by age 10. Moreover, attaining the ambitious Sustainable Development Goals (SDGs) in education requires first achieving this basic building block, and so does improving countries’ Human Capital Index scores. Yet past evidence from many low- and middle-income countries has shown that many children are not learning to read with comprehension in primary school. To understand the global picture better, we have worked with the UNESCO Institute for Statistics (UIS) to assemble a new dataset with the most comprehensive measures of this foundational skill yet developed, by linking together data from credible cross-national and national assessments of reading. This dataset covers 115 countries, accounting for 81% of children worldwide and 79% of children in low- and middle-income countries. The new data allow us to estimate the reading proficiency of late-primary-age children, and we also provide what are among the first estimates (and the most comprehensive, for low- and middle-income countries) of the historical rate of progress in improving reading proficiency globally (for the 2000-17 period). The results show that 53% of all children in low- and middle-income countries cannot read age-appropriate material by age 10, and that at current rates of improvement, this “learning poverty” rate will have fallen only to 43% by 2030. Indeed, we find that the goal of all children reading by 2030 will be attainable only with historically unprecedented progress. The high rate of “learning poverty” and slow progress in low- and middle-income countries is an early warning that all the ambitious SDG targets in education (and likely of social progress) are at risk. 
Based on this evidence, we suggest a new medium-term target to guide the World Bank’s work in low- and middle-income countries: cut learning poverty by at least half by 2030. This target, together with improved measurement of learning, can serve as an evidence-based tool to accelerate progress to get all children reading by age 10.

    For further details, please refer to https://thedocs.worldbank.org/en/doc/e52f55322528903b27f1b7e61238e416-0200022022/original/Learning-poverty-report-2022-06-21-final-V7-0-conferenceEdition.pdf

  11. Data from: Delineating the Profile of Good and Poor Readers

    • scielo.figshare.com
    jpeg
    Updated Jun 1, 2023
    Cite
    Lucilene Bender de Sousa; Lilian Cristine Hübner (2023). Delineating the Profile of Good and Poor Readers [Dataset]. http://doi.org/10.6084/m9.figshare.14285532.v1
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    SciELO journals
    Authors
    Lucilene Bender de Sousa; Lilian Cristine Hübner
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: This article aims at profiling and comparing good (GR) and poor readers’ comprehension (PRC). Through a word reading task and a reading comprehension task, 49 good readers and 37 poor readers were identified among 336 students in the 8th grade of public schools in southern Brazil; the intermediate group was not included in the analysis. The investigation of the profile used a self-completion written questionnaire. The results showed a difference in the groups’ reading experience and a positive correlation between reading comprehension performance and the number of books read by the students in a year. The study verifies that research on reading habits might help to explain the differences between good and poor readers’ profiles. Future research may improve the instrument and expand it to direct not only theoretical studies but also practical studies of clinical and pedagogical intervention.

  12. Supporting data for “Monitoring and Regulation in Reading Comprehension...

    • datahub.hku.hk
    zip
    Updated Jan 1, 2024
    Cite
    Ching Yan Kwok (2024). Supporting data for “Monitoring and Regulation in Reading Comprehension among Chinese children: Exploring the Role of Comprehension Monitoring, Lexical Ambiguity Resolution and Reading Strategy" [Dataset]. http://doi.org/10.25442/hku.19539061.v1
    Dataset updated
    Jan 1, 2024
    Dataset provided by
    HKU Data Repository
    Authors
    Ching Yan Kwok
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Datasets for the thesis titled "Monitoring and Regulation in Reading Comprehension among Chinese children: Exploring the Role of Comprehension Monitoring, Lexical Ambiguity Resolution and Reading Strategy". The thesis contained three studies and aimed to provide a more comprehensive view of monitoring and regulation in reading and to examine their contributions to reading comprehension. The dataset for Study 1 contains cross-sectional and longitudinal data from junior primary school children on their general cognitive abilities, reading comprehension performance and cognitive-linguistic skills. The dataset for Studies 2 and 3 contains eye-movement data and data on the general cognitive abilities and reading-related cognitive-linguistic skills of grade 3 children.

    Please refer to the README file for details of the datasets.

  13. The Effects of Choice on the Reading Comprehension and Enjoyment of Children...

    • rdr.ucl.ac.uk
    bin
    Updated Jun 1, 2023
    Cite
    Myrofora Kakoulidou (2023). The Effects of Choice on the Reading Comprehension and Enjoyment of Children with Severe Inattention and no Attentional Difficulties: Key data analysed and discussed in the published paper [Dataset]. http://doi.org/10.5522/04/14816121.v1
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    University College London
    Authors
    Myrofora Kakoulidou
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset includes the key data analysed and discussed in the paper titled 'The Effects of Choice on the Reading Comprehension and Enjoyment of Children with Severe Inattention and no Attentional Difficulties' published in the Research on Child and Adolescent Psychopathology journal. Data were collected as part of a larger PhD study of Myrofora Kakoulidou, under the supervision of Professor Jane Hurry, Dr Frances Le Cornu Knight and Dr Roberto Filippi.

    Variables included

    Codenumber = Participant ID
    Sex = Biological sex
    SEN status = Whether children had (or did not have) an SEN statement or an Education and Health Care plan
    Reading Motivation Questions = MRQ items (38 items)
    MRQ_NoMissing = Total scores on MRQ after imputation
    SentenceCompletion_NGRT = Children's answers to the Sentence Completion items of the NGRT (20 items)
    TextQuestions_NGRT = Children's answers to the Reading Comprehension questions across the three passages (28 items)
    NewGroupReadingScores = Total scores across the 48 NGRT items
    NGRT_Standardised = Raw scores converted to standardised scores by age
    Readingscores_Choice = Children's reading scores in the Choice condition
    Readingscores_NoChoice = Children's reading scores in the No Choice condition
    Readingdifference_Final = Difference scores for Reading Comprehension (Reading comprehension scores in the No Choice condition subtracted by reading comprehension scores in the Choice condition)
    Readingdenjoyment_Final = Difference scores for Reading Enjoyment (Reading Enjoyment scores in the No Choice condition subtracted by reading enjoyment scores in the Choice condition)
    Enjoyment_NoChoice = Children's enjoyment scores in the No Choice condition
    Enjoyment_Choice = Children's enjoyment scores in the Choice condition
    TeacherConnersItems = Teachers' ratings on the Conners 3 scale (39 items in total, short version)
    TeacherRatedInattention = Teachers' ratings of children's inattention
    OmissionErrors = Raw scores on Omission errors in AULA
    RTV (Reaction Time Variability) = Raw scores on RTV in AULA
    TeacherRatedInattention_Trichotomised = Trichotomised scores on Teacher-rated Inattention
    OmissionErrors_Trichotomised = Trichotomised scores on Omission errors
    RTV_Trichotomised = Trichotomised scores on RTV

  14. f

    Descriptive Statistics, Intraclass Correlations, and Univariate Genetic...

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Brooke Soden; Micaela E. Christopher; Jacqueline Hulslander; Richard K. Olson; Laurie Cutting; Janice M. Keenan; Lee A. Thompson; Sally J. Wadsworth; Erik G. Willcutt; Stephen A. Petrill (2023). Descriptive Statistics, Intraclass Correlations, and Univariate Genetic (a²), Shared Environmental (c²), and Nonshared Environmental (e²) Components of Variance for Reading Comprehension. [Dataset]. http://doi.org/10.1371/journal.pone.0113807.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Brooke Soden; Micaela E. Christopher; Jacqueline Hulslander; Richard K. Olson; Laurie Cutting; Janice M. Keenan; Lee A. Thompson; Sally J. Wadsworth; Erik G. Willcutt; Stephen A. Petrill
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Note. Ns are for individuals.*p

  15. h

    Supporting data for The relationship between musical training and reading...

    • datahub.hku.hk
    Updated Aug 26, 2024
    Cite
    Wai Sing Chan (2024). Supporting data for The relationship between musical training and reading comprehension in children with Autism Spectrum Disorder [Dataset]. http://doi.org/10.25442/hku.26715601.v1
    Explore at:
    Dataset updated
    Aug 26, 2024
    Dataset provided by
    HKU Data Repository
    Authors
    Wai Sing Chan
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The present study examined the reading comprehension performance of children with Autism Spectrum Disorder (ASD) and compared it with that of their typically developing (TD) counterparts to identify variables that may relate to their reading comprehension performance. A series of six tests, namely a tone identification test, the Embedded Figures Test (EFT), a homograph reading test, a homophone test, a music test and a reading test, was conducted with native Cantonese ASD participants with and without musical training and their corresponding TD groups. The results are recorded in the file.

  16. Data from: The NarrativeQA Reading Comprehension Challenge Dataset

    • github.com
    • gitee.com
    Updated Dec 21, 2017
    Cite
    DeepMind (2017). The NarrativeQA Reading Comprehension Challenge Dataset [Dataset]. https://github.com/google-deepmind/narrativeqa
    Explore at:
    Dataset updated
    Dec 21, 2017
    Dataset provided by
    DeepMind http://deepmind.com/
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This repository contains the NarrativeQA dataset. It includes the list of documents with Wikipedia summaries, links to full stories, and questions and answers.

  17. m

    Data from: Differences in FL Reading Comprehension Among High-, Middle-, and...

    • data.mendeley.com
    Updated Apr 2, 2024
    Cite
    Abdel Salam El-Koumy (2024). Differences in FL Reading Comprehension Among High-, Middle-, and Low-Ambiguity Tolerance Students [Dataset]. http://doi.org/10.17632/k4tscc66zg.1
    Explore at:
    Dataset updated
    Apr 2, 2024
    Authors
    Abdel Salam El-Koumy
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Florida
    Description

    The purpose of this study was to examine the differences in foreign language reading comprehension among high-, middle-, and low-ambiguity-tolerance students.

  18. S

    Construction and Evaluation of Chinese Reading Comprehension Data Set for...

    • scidb.cn
    Updated Sep 26, 2022
    Cite
    yang zi hang; Jin Hou (2022). Construction and Evaluation of Chinese Reading Comprehension Data Set for Security Field [Dataset]. http://doi.org/10.57760/sciencedb.j00133.00081
    Explore at:
    Dataset updated
    Sep 26, 2022
    Dataset provided by
    Science Data Bank
    Authors
    yang zi hang; Jin Hou
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    We propose a Chinese machine reading comprehension dataset for the security field (SecMRC), which addresses the lack of professional data to support machine reading comprehension research in this field. The dataset contains 2100 anti-terrorism and security domain news articles, 7300 extractive question-answer pairs, 2100 generative Q&A pairs, and a total of 4796264 characters. Tests were conducted on SecMRC using advanced reading comprehension models. The results show that the F1 of the extraction task reaches 72.5% and the average ROUGE-L of the generative task is 37.8%, both of which are significantly below human level. SecMRC emphasizes domain knowledge and is challenging, so it can effectively support research on machine reading comprehension in this field. The dataset construction method is general and can be extended to other professional fields.

  19. h

    Supporting data for Improving Early Reading Comprehension in...

    • datahub.hku.hk
    Updated Jul 9, 2024
    Cite
    Kembell Lentejas (2024). Supporting data for Improving Early Reading Comprehension in Filipino-English Bilingual Children Through Oral Language Assessment and Intervention in the Philippines [Dataset]. http://doi.org/10.25442/hku.26112739.v1
    Explore at:
    Dataset updated
    Jul 9, 2024
    Dataset provided by
    HKU Data Repository
    Authors
    Kembell Lentejas
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Area covered
    Philippines
    Description

    The description of each dataset is given below.

    Study01 assessed Grades 1 to 3 bilingual children to identify the cognitive and linguistic skills that predict early reading comprehension. The data include all the participants' scores on different cognitive and linguistic measures.

    Study02 identified different subgroups of early bilingual readers using their word reading, oral language, and reading fluency skills. The data include the participants' scores on different linguistic measures in their first and second languages.

    Study 03 determined the efficacy of a multi-component oral language intervention for bilingual Grades 1 to 3 children. The data include scores on pretest and posttest measures for the control and intervention groups.

  20. P

    Data from: CJRC Dataset

    • paperswithcode.com
    Cite
    Xingyi Duan; Baoxin Wang; Ziyue Wang; Wentao Ma; Yiming Cui; Dayong Wu; Shijin Wang; Ting Liu; Tianxiang Huo; Zhen Hu; Heng Wang; Zhiyuan Liu, CJRC Dataset [Dataset]. https://paperswithcode.com/dataset/cjrc
    Explore at:
    Authors
    Xingyi Duan; Baoxin Wang; Ziyue Wang; Wentao Ma; Yiming Cui; Dayong Wu; Shijin Wang; Ting Liu; Tianxiang Huo; Zhen Hu; Heng Wang; Zhiyuan Liu
    Description

    The Chinese judicial reading comprehension (CJRC) dataset contains approximately 10K documents and almost 50K questions with answers. The documents come from judgment documents and the questions are annotated by law experts.
