100+ datasets found

RACE Dataset
paperswithcode.com
Updated Jan 27, 2022
Cite
Guokun Lai; Qizhe Xie; Hanxiao Liu; Yiming Yang; Eduard Hovy (2022). RACE Dataset [Dataset]. https://paperswithcode.com/dataset/race
Dataset updated
Jan 27, 2022
Authors
Guokun Lai; Qizhe Xie; Hanxiao Liu; Yiming Yang; Eduard Hovy
Description
The ReAding Comprehension dataset from Examinations (RACE) is a machine reading comprehension dataset consisting of 27,933 passages and 97,867 questions from English exams for Chinese students aged 12-18. RACE consists of two subsets, RACE-M and RACE-H, drawn from middle school and high school exams, respectively. RACE-M has 28,293 questions and RACE-H has 69,574. Each question has 4 candidate answers, one of which is correct. The data generation process of RACE differs from that of most machine reading comprehension datasets: instead of being generated by heuristics or crowd-sourcing, the questions in RACE were created by domain experts specifically to test human reading skills.
TibetanQA: Tibetan Dataset for Machine Reading Comprehension
scidb.cn
Updated Feb 11, 2022
Cite
Yuan Sun; Zhengcuo Dan; Sisi Liu; Xiaobing Zhao (2022). TibetanQA: Tibetan Dataset for Machine Reading Comprehension [Dataset]. http://doi.org/10.11922/sciencedb.j00001.00351
Unique identifier
https://doi.org/10.11922/sciencedb.j00001.00351
Dataset updated
Feb 11, 2022
Dataset provided by
Science Data Bank
Authors
Yuan Sun; Zhengcuo Dan; Sisi Liu; Xiaobing Zhao
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This paper constructs a dataset for Tibetan machine reading comprehension. The data comes from the Yunzang website and covers 12 fields: nature, culture, education, geography, history, life, society, art, technology, people, science and sports. The questions and answers were manually entered and annotated by 20 Tibetan professionals. The dataset contains 631 articles, 903 paragraphs, and 2,000 question-and-answer pairs constructed from the paragraphs. Data items mainly include article ID, title, paragraph, question and answer. The publication of this dataset is of great value for promoting the development of Tibetan information processing.
Replication data for: Improving Reading Comprehension, Science Domain...
search.dataone.org
dataverse.harvard.edu
Updated Mar 6, 2024
Cite
Kim, James S.; Burkhauser, Mary; Mesite, Laura; Asher, Catherine A.; Relyea, Jackie Eunjung; Fitzgerald, Jill; Elmore, Jeff (2024). Replication data for: Improving Reading Comprehension, Science Domain Knowledge, and Reading Engagement through a First-Grade Content Literacy Intervention [Dataset]. http://doi.org/10.7910/DVN/RVJIMX
Unique identifier
https://doi.org/10.7910/DVN/RVJIMX
Dataset updated
Mar 6, 2024
Dataset provided by
Harvard Dataverse
Authors
Kim, James S.; Burkhauser, Mary; Mesite, Laura; Asher, Catherine A.; Relyea, Jackie Eunjung; Fitzgerald, Jill; Elmore, Jeff
Description
This dataset contains replication materials for the Journal of Educational Psychology paper entitled "Improving Reading Comprehension, Science Domain Knowledge, and Reading Engagement through a First-Grade Content Literacy Intervention." Materials include the dataset and the programs needed to replicate the analyses.
Data from: Read Philippines
catalog.data.gov
Updated Jun 25, 2024
Cite
data.usaid.gov (2024). Read Philippines [Dataset]. https://catalog.data.gov/dataset/read-philippines
Dataset updated
Jun 25, 2024
Dataset provided by
United States Agency for International Development http://usaid.gov/
Area covered
Philippines
Description
Read Philippines, or Basa Pilipinas, was a four-year early grade reading project that operated from January 2013 to December 2016 and supported the Philippine Department of Education’s national reading program. Basa assisted the implementation of transformative literacy practices in selected divisions of Regions 1 and 7 by providing teacher and student materials, training teachers and school heads, and providing post-training support for Grade 1, 2 and 3 teachers, as well as providing Early Language, Literacy and Numeracy training to kindergarten teachers. The Basa Pilipinas activity used a quasi-experimental cross-sectional design to evaluate the impact of the treatment in improving reading and comprehension skills. Sampling was conducted at three levels: school, classroom, and student. The school sample was drawn randomly from the activity’s five provinces. Within each school, one grade 2 classroom was selected randomly for baseline and midline, with an additional grade 3 classroom selected during the endline. Within each classroom, students were randomly selected to be administered the assessment. A total of 469 students were sampled from 40 schools in two provinces at the baseline (comparison), 1,216 students were sampled from 80 schools in five provinces at the midline (intervention 1), and 1,658 students were sampled from 5 provinces at the endline (intervention 2). The disparity in the number of provinces sampled is due to the expansion of the intervention from two provinces to five provinces starting at the midline to provide a more complete picture of the Basa outcomes. To enable the computation of estimates of literacy skills among students in all schools affected by the Basa intervention, design weights were applied to the analyses of EGRA data. Design weights were applied to compensate for differences in provincial sampling and to ensure an appropriate representation of learners in all provinces in the sample.
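
The description does not say how the design weights were computed. As a rough illustration only, the sketch below assumes the common inverse-sampling-fraction form (weight = learners in a province divided by learners sampled there) and shows how such weights change an estimated mean. The function names, province sizes, and scores are all hypothetical and are not taken from the Basa data.

```python
from typing import Sequence


def design_weight(population_size: int, sample_size: int) -> float:
    """Inverse sampling fraction: how many learners one sampled learner represents.

    Assumed form of the weight; the dataset description does not give a formula.
    """
    if population_size <= 0 or sample_size <= 0:
        raise ValueError("population_size and sample_size must be positive")
    return population_size / sample_size


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Mean of `values` where each value counts in proportion to its weight."""
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equally long")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical example: two provinces of different size, two sampled learners each.
weight_a = design_weight(population_size=1000, sample_size=2)  # small province
weight_b = design_weight(population_size=3000, sample_size=2)  # large province

scores = [30.0, 50.0, 60.0, 60.0]            # made-up assessment scores
weights = [weight_a, weight_a, weight_b, weight_b]

unweighted = sum(scores) / len(scores)
weighted = weighted_mean(scores, weights)
print(f"unweighted mean: {unweighted}, design-weighted mean: {weighted}")
```

In this toy case the larger province is under-sampled relative to its size, so the weighted mean moves toward its scores.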

Dataset of psychophysiological data from children with learning difficulties...

openneuro.org

Updated May 29, 2025

Cite

César E. Corona-González; Claudia Rebeca De Stefano-Ramos; Juan Pablo Rosado-Aíza; David I. Ibarra-Zarate; Fabiola R. Gómez-Velázquez; Luz María Alonso-Valerdi (2025). Dataset of psychophysiological data from children with learning difficulties who strengthen reading and math skills through assistive technology [Dataset]. http://doi.org/10.18112/openneuro.ds006260.v1.0.1

Unique identifier

https://doi.org/10.18112/openneuro.ds006260.v1.0.1

Dataset updated

May 29, 2025

Dataset provided by

OpenNeuro https://openneuro.org/

Authors

César E. Corona-González; Claudia Rebeca De Stefano-Ramos; Juan Pablo Rosado-Aíza; David I. Ibarra-Zarate; Fabiola R. Gómez-Velázquez; Luz María Alonso-Valerdi

License

CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically

Description

README

Authors

César E. Corona-González, Claudia Rebeca De Stefano-Ramos, Juan Pablo Rosado-Aíza, Fabiola R Gómez-Velázquez, David I. Ibarra-Zarate, Luz María Alonso-Valerdi

Contact person

César E. Corona-González

https://orcid.org/0000-0002-7680-2953

a00833959@tec.mx

Project name

Psychophysiological data from Mexican children with learning difficulties who strengthen reading and math skills by assistive technology

Year that the project ran

2023

Brief overview of the tasks in the experiment

The current dataset consists of psychometric and electrophysiological data from children with reading or math learning difficulties. These data were collected to evaluate improvements in reading or math skills resulting from using an online learning method called Smartick.

The psychometric evaluations of children with reading difficulties comprised spelling tests, in which 1) orthographic and 2) phonological errors were counted; 3) reading speed, expressed in words read per minute; and 4) reading comprehension, assessed with multiple-choice questions. The last two parameters were determined according to the standards of the Ministry of Public Education (Secretaría de Educación Pública in Spanish) in Mexico. The assessments for the math difficulties group comprised 1) a test of general mathematical knowledge, 2) the percentage of hits, and 3) the reaction time in an arithmetic task. Selective attention and intelligence quotient (IQ) were also evaluated.

Then, individuals underwent an EEG experimental paradigm where two conditions were recorded: 1) a 3-minute eyes-open resting state and 2) performing either reading or mathematical activities. EEG recordings from the reading experiment consisted of reading a text aloud and then answering questions about the text. Alternatively, EEG recordings from the math experiment involved the solution of two blocks with 20 arithmetic operations (addition and subtraction). Subsequently, each child was randomly subcategorized as 1) the experimental group, who were asked to engage with Smartick for three months, and 2) the control group, who were not involved with the intervention. Once the 3-month period was over, every child was reassessed as described before.

Description of the contents of the dataset

The dataset contains a total of 76 subjects (sub-) in two study groups: 1) reading difficulties (R) and 2) math difficulties (M). Each individual was further assigned to either the experimental subgroup (e), in which children committed to engaging with Smartick, or the control subgroup (c), in which they did not take part in any intervention.

Every subject was followed up on for three months. During this period, each subject underwent two EEG sessions, representing the PRE-intervention (ses-1) and the POST-intervention (ses-2).

The EEG recordings from the reading difficulties group consisted of a resting-state condition (run-1) and a condition in which the children performed active reading and reading comprehension activities (run-2). EEG data from the math difficulties group were collected during a resting-state condition (run-1) and while solving two blocks of 20 arithmetic operations (run-2 and run-3). All EEG files were stored in .set format. The nomenclature used in the filenames is described below:

Nomenclature	Description
sub-	Subject
M	Math group
R	Reading group
c	Control subgroup
e	Experimental subgroup
ses-1	PRE-intervention
ses-2	POST-intervention
run-1	EEG for baseline
run-2	EEG for reading activity, or the first block of math
run-3	EEG for the second block of math

Example: the file sub-Rc11_ses-1_task-SmartickDataset_run-2_eeg.set corresponds to:
- The 11th subject from the reading difficulties group, control subgroup (sub-Rc11).
- EEG recording from the PRE-intervention (ses-1) while performing the reading activity (run-2).
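
The naming scheme described in the nomenclature table can be decoded programmatically. The following is a minimal sketch, assuming every EEG file follows the pattern of the single example filename given in the README; the `Recording` type, the regular expression, and the `parse_eeg_filename` helper are illustrative names, not part of the dataset.

```python
import re
from typing import NamedTuple


class Recording(NamedTuple):
    group: str      # "reading" or "math"
    subgroup: str   # "control" or "experimental"
    subject: int    # subject number within the group/subgroup
    session: str    # "PRE" or "POST" intervention
    run: int        # 1 = baseline, 2/3 = task runs


# Pattern inferred from the one example filename in the README (an assumption).
# It accepts run-3 for any group, although the README only describes run-3
# for the math group.
FILENAME_RE = re.compile(
    r"sub-(?P<group>[MR])(?P<subgroup>[ce])(?P<subject>\d+)"
    r"_ses-(?P<session>[12])"
    r"_task-SmartickDataset"
    r"_run-(?P<run>[123])"
    r"_eeg\.set"
)

GROUPS = {"M": "math", "R": "reading"}
SUBGROUPS = {"c": "control", "e": "experimental"}
SESSIONS = {"1": "PRE", "2": "POST"}


def parse_eeg_filename(name: str) -> Recording:
    """Split an EEG filename into the fields listed in the nomenclature table."""
    match = FILENAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized EEG filename: {name!r}")
    return Recording(
        group=GROUPS[match["group"]],
        subgroup=SUBGROUPS[match["subgroup"]],
        subject=int(match["subject"]),
        session=SESSIONS[match["session"]],
        run=int(match["run"]),
    )


example = parse_eeg_filename("sub-Rc11_ses-1_task-SmartickDataset_run-2_eeg.set")
print(example)
```

Returning a named tuple keeps the fields easy to filter on, for example when selecting only PRE-intervention baseline runs.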

Independent variables

Study groups:
- Reading difficulties
  - Control: children did not follow any intervention
  - Experimental: Children used the reading program of Smartick for 3 months
- Math difficulties
  - Control: children did not follow any intervention
  - Experimental: Children used the math program of Smartick for 3 months
Condition:
- PRE-intervention: first psychological and electroencephalographic evaluation
- POST-intervention: second psychological and electroencephalographic evaluation

Dependent variables

Psychometric data from the reading difficulties group:
- Orthographic_ERR: number of orthographic errors.
- Phonological_ERR: number of phonological errors.
- Selective_Attention: score from the selective attention test.
- Reading_Speed: reading speed in words per minute.
- Comprehension: score on a reading comprehension task.
- GROUP: C for the control group, E for the experimental group.
- GENDER: M for male, F for Female.
- AGE: age at the beginning of the study.
- IQ: intelligence quotient.
Psychometric data from the math difficulties group:
- WRAT4: score from the WRAT-4 test.
- hits: hits during the EEG acquisition [%].
- RT: reaction time during the EEG acquisition [s].
- Selective_Attention: score from the selective attention test.
- GROUP: C for the control group, E for the experimental group.
- GENDER: M for male, F for female.
- AGE: age at the beginning of the study.
- IQ: intelligence quotient.

Psychometric data can be found in the 01_Psychometric_Data.xlsx file.

Engagement percentage within Smartick (only for the experimental group)
- These values represent each participant's engagement percentage in Smartick.
- Students were asked to use the online learning method 5 days a week for 3 months.
- Values greater than 100% denote participants who regularly logged in more than 5 days a week.

Engagement percentages can be found in the 05_SessionEngagement.xlsx file.
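
The README does not give the formula behind the engagement percentage. One plausible reading, sketched below purely as an assumption, is the number of days a child logged in divided by the number of days requested (5 per week), times 100. The function name and the 12-week figure are hypothetical.

```python
def engagement_percentage(days_logged_in: int, weeks: int,
                          target_days_per_week: int = 5) -> float:
    """Days logged in as a percentage of the days requested.

    Assumed definition: the README only states the 5-days-a-week target and
    that values above 100% mean more frequent logins than requested.
    """
    expected_days = weeks * target_days_per_week
    if expected_days <= 0:
        raise ValueError("weeks and target_days_per_week must be positive")
    if days_logged_in < 0:
        raise ValueError("days_logged_in must not be negative")
    return 100.0 * days_logged_in / expected_days


# Hypothetical 12-week period (roughly three months).
on_target = engagement_percentage(days_logged_in=60, weeks=12)
above_target = engagement_percentage(days_logged_in=72, weeks=12)
print(on_target, above_target)
```

Under this reading, a child who logs in six days a week over the whole period ends up above 100%, which matches the note in the list above.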

Methods

Subjects

Seventy-six Mexican children between 7 and 13 years old were enrolled in this study.

Information about the recruitment procedure

The sample was recruited through non-profit foundations that support learning and foster care programs.

Apparatus

g.USBamp RESEARCH amplifier

Initial setup

Explain the task to the participant.
Sign informed consent.
Set up electrodes.

Task details

The stimuli nested folder contains all stimuli employed in the EEG experiments.

Level 1
- Math: Images used in the math experiment.
- Reading: Images used in the reading experiment.

Level 2
- Math
  - POST_Operations: arithmetic operations from the POST-intervention.
  - PRE_Operations: arithmetic operations from the PRE-intervention.
- Reading
  - POST_Reading1: text 1 and text-related comprehension questions from the POST-intervention.
  - POST_Reading2: text 2 and text-related comprehension questions from the POST-intervention.
  - POST_Reading3: text 3 and text-related comprehension questions from the POST-intervention.
  - PRE_Reading1: text 1 and text-related comprehension questions from the PRE-intervention.
  - PRE_Reading2: text 2 and text-related comprehension questions from the PRE-intervention.
  - PRE_Reading3: text 3 and text-related comprehension questions from the PRE-intervention.

Level 3
- Math
  - Operation01.jpg to Operation20.jpg: arithmetical operations solved during the first block of the math

Data from: Content Counts and Motivation Matters: Reading Comprehension in...
openicpsr.org
Updated Dec 20, 2020
Cite
HyeJin Hwang; Nell Duke (2020). Content Counts and Motivation Matters: Reading Comprehension in Third-Grade Students Who Are English Learners [Dataset]. http://doi.org/10.3886/E129401V1
Unique identifier
https://doi.org/10.3886/E129401V1
Dataset updated
Dec 20, 2020
Dataset provided by
Florida State University
University of Michigan
Authors
HyeJin Hwang; Nell Duke
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This study examined the role of science domain knowledge, reading motivation, and decoding skills in reading comprehension achievement in third-grade students who are English learners (ELs) and students who are monolingual, using a nationally representative data set. Multigroup probit regression analyses showed that third-grade science domain knowledge and motivation for reading, decoding skills, and early attainment of decoding skills were significantly associated with third-grade reading comprehension in both language groups. Also, using Wald chi-square tests, the study showed that the association between third-grade science domain knowledge and reading comprehension was stronger in students who were ELs than in students who were monolingual. These findings suggest that cultivating science domain knowledge is very important to supporting reading comprehension development in third grade, particularly for students who are ELs.
Are School Entry Skills Predictive of Reading Comprehension across European...
osf.io
url
Updated Mar 27, 2024
Cite
Thomas Brüggemann; Ulrich Ludewig; Jakob Schwerter; Nele McElvany; Philipp Doebler (2024). Are School Entry Skills Predictive of Reading Comprehension across European Countries? [Dataset]. http://doi.org/10.17605/OSF.IO/4EVX2
Available download formats: url
Unique identifier
https://doi.org/10.17605/OSF.IO/4EVX2
Dataset updated
Mar 27, 2024
Dataset provided by
Center For Open Science
Authors
Thomas Brüggemann; Ulrich Ludewig; Jakob Schwerter; Nele McElvany; Philipp Doebler
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Transitioning from learning to read to reading to learn is a central aim of primary school education (Chall, 1983). However, while this aim is shared across borders, countries differ with regard to conditions facilitating or hindering this aim. These conditions encompass extracurricular learning environments and resources, socio-economic backgrounds, home languages, migratory backgrounds and the parents’ reading appreciation (e.g. El-Khechen et al., 2016; Kieffer, 2012; Kigel et al., 2015). Those determinants are also relevant in explaining students’ school entry reading skills, which, in turn, predict students’ reading literacy in primary school (Cameron et al., 2023; Claessens et al., 2009; Duncan et al., 2007). However, it is unclear at which point in a student’s life these effects are most impactful, knowledge that is necessary to recommend interventions in a timely and effective manner. Therefore, in this study we attempt to investigate to what extent factors that are primarily time-invariant for the students affect both the students’ reading skills at school entry and their reading literacy in fourth grade. At the core, we investigate to what extent differences due to time-invariant variables affect students’ rate of learning to read throughout primary school, or if pre-school differences in reading competence accumulate during primary school. Put differently, we investigate if differences in reading competence start early and stay the same (accumulate) or if they start early and then widen (affecting the learning rate). Building upon Carroll’s (1963) concept of time-on-task within the model of school learning, we investigate the effects of students’ language at home (El-Khechen et al., 2016). Within this theory, when students spend more time learning a language, for example by practicing it at home, their language skills improve because of the higher time investment.
Since students whose home language is different from the test language have less time-on-task in learning the test language, we assume that the test language affects the reading competence at school entry. Furthermore, because the home language persists throughout primary school and thus the time-on-task effects persist, we also assume that the home language affects the reading literacy in fourth grade. We consider the parents’ reading appreciation within a social learning theory (SLT; Bandura, 1977) context. SLT posits that students learn by observing and imitating the behavior of adults. Children of parents who highly appreciate reading are more likely to observe their parents while reading and attempt to imitate that behavior. Hence, parents’ reading appreciation is a persistent factor for students, both before and during primary school, and may thus affect both the reading competence at school entry and the reading literacy in fourth grade. In addition, we investigate the effects of household possessions (see Avvisati, 2020) as a persistent environmental factor under the framework of resource deprivation. Students from family backgrounds with few household possessions may lack the cultural resources for a home learning environment that is conducive to learning to read before entering primary school, as well as the economic resources to provide support to struggling learners during primary school (see Kieffer, 2012).
We use structural equation modelling on secondary data from the Progress in International Reading Literacy Study (PIRLS) with N = 177,386 fourth grade students from 17 European countries, gathered in 2016 and 2021, to investigate whether effects differ between countries and if they are stable even in changing educational circumstances, such as the COVID-19 pandemic (see Gee et al., 2023; Werner & Woessmann, 2023).
The Quest Dataset
kaggle.com
Updated Nov 26, 2024
Cite
Jules King (2024). The Quest Dataset [Dataset]. https://www.kaggle.com/datasets/julesking/the-quest-dataset
Dataset updated
Nov 26, 2024
Dataset provided by
Kaggle http://kaggle.com/
Authors
Jules King
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Learning Agency Lab’s data science competition, “The Quest for Quality Questions: Improving Reading Comprehension through Automated Question Generation,” was designed to build AI algorithms that can automatically generate questions that test young learners’ reading comprehension.

As many educators and researchers know, questions are key in teaching and evaluating narrative comprehension skills in young learners. However, generating high-quality reading comprehension queries is time consuming, which limits the number of texts that young readers can engage with in this way. Datasets can help by informing quality question automation.

The Quest challenge dataset can be accessed on this page and was aided by foundational data from the Lab’s FairytaleQA dataset of 10,580 questions. Those queries were created to address gaps in similar datasets, which often overlooked fine reading skills that showcased an understanding of varying narrative elements.

The Quest was made possible by The Learning Agency Lab, Mark Warschauer at UC Irvine, and Ying Xu at The University of Michigan School of Education. More can be found about the creators here.

Quest dataset © 2024 by The Learning Agency Lab is licensed under CC BY 4.0. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/

Competition - https://www.thequestchallenge.org/

Publications - Xu, Y., Wang, D., Yu, M., Ritchie, D., Yao, B., Wu, T., ... & Warschauer, M. (2022). Fantastic Questions and Where to Find Them: FairytaleQA--An Authentic Dataset for Narrative Comprehension. arXiv preprint arXiv:2203.13947.
Data for: The Contributions of Language Skills and Comprehension Monitoring...
narcis.nl
data.mendeley.com
Updated Jan 31, 2021
Cite
Zhao, A (via Mendeley Data) (2021). Data for: The Contributions of Language Skills and Comprehension Monitoring to Chinese Reading Comprehension: A Longitudinal Investigation [Dataset]. http://doi.org/10.17632/62mfct3gc8.1
Unique identifier
https://doi.org/10.17632/62mfct3gc8.1
Dataset updated
Jan 31, 2021
Dataset provided by
Data Archiving and Networked Services (DANS)
Authors
Zhao, A (via Mendeley Data)
Description
Longitudinal data of language and cognitive skills of Chinese children followed from Grade 1 to Grade 3
Learning Poverty Global Database
data360.worldbank.org
Updated Apr 18, 2025
Cite
(2025). Learning Poverty Global Database [Dataset]. https://data360.worldbank.org/en/dataset/WB_LPGD
Dataset updated
Apr 18, 2025
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2001 - 2023
Description
Will all children be able to read by 2030? The ability to read with comprehension is a foundational skill that every education system around the world strives to impart by late in primary school—generally by age 10. Moreover, attaining the ambitious Sustainable Development Goals (SDGs) in education requires first achieving this basic building block, and so does improving countries’ Human Capital Index scores. Yet past evidence from many low- and middle-income countries has shown that many children are not learning to read with comprehension in primary school. To understand the global picture better, we have worked with the UNESCO Institute for Statistics (UIS) to assemble a new dataset with the most comprehensive measures of this foundational skill yet developed, by linking together data from credible cross-national and national assessments of reading. This dataset covers 115 countries, accounting for 81% of children worldwide and 79% of children in low- and middle-income countries. The new data allow us to estimate the reading proficiency of late-primary-age children, and we also provide what are among the first estimates (and the most comprehensive, for low- and middle-income countries) of the historical rate of progress in improving reading proficiency globally (for the 2000-17 period). The results show that 53% of all children in low- and middle-income countries cannot read age-appropriate material by age 10, and that at current rates of improvement, this “learning poverty” rate will have fallen only to 43% by 2030. Indeed, we find that the goal of all children reading by 2030 will be attainable only with historically unprecedented progress. The high rate of “learning poverty” and slow progress in low- and middle-income countries is an early warning that all the ambitious SDG targets in education (and likely of social progress) are at risk. 
Based on this evidence, we suggest a new medium-term target to guide the World Bank’s work in low- and middle-income countries: cut learning poverty by at least half by 2030. This target, together with improved measurement of learning, can serve as an evidence-based tool to accelerate progress toward getting all children reading by age 10.

For further details, please refer to https://thedocs.worldbank.org/en/doc/e52f55322528903b27f1b7e61238e416-0200022022/original/Learning-poverty-report-2022-06-21-final-V7-0-conferenceEdition.pdf
Data from: Delineating the Profile of Good and Poor Readers
scielo.figshare.com
jpeg
Updated Jun 1, 2023
Cite
Lucilene Bender de Sousa; Lilian Cristine Hübner (2023). Delineating the Profile of Good and Poor Readers [Dataset]. http://doi.org/10.6084/m9.figshare.14285532.v1
Available download formats: jpeg
Unique identifier
https://doi.org/10.6084/m9.figshare.14285532.v1
Dataset updated
Jun 1, 2023
Dataset provided by
SciELO journals
Authors
Lucilene Bender de Sousa; Lilian Cristine Hübner
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract: This article aims at profiling and comparing good (GR) and poor readers’ comprehension (PRC). Using a word reading task and a reading comprehension task, 49 good readers and 37 poor readers were identified among 336 8th-grade students in public schools in southern Brazil; the intermediate group was not included in the analysis. The profiles were investigated with a self-completed written questionnaire. The results showed a difference in the groups’ reading experience and a positive correlation between reading comprehension performance and the number of books the students read in a year. The study shows that research on reading habits may help to explain the differences between good and poor readers’ profiles. Future research may improve the instrument and expand it to guide not only theoretical studies but also practical studies of clinical and pedagogical intervention.
Supporting data for “Monitoring and Regulation in Reading Comprehension...
datahub.hku.hk
zip
Updated Jan 1, 2024
Cite
Ching Yan Kwok (2024). Supporting data for “Monitoring and Regulation in Reading Comprehension among Chinese children: Exploring the Role of Comprehension Monitoring, Lexical Ambiguity Resolution and Reading Strategy" [Dataset]. http://doi.org/10.25442/hku.19539061.v1
Available download formats: zip
Unique identifier
https://doi.org/10.25442/hku.19539061.v1
Dataset updated
Jan 1, 2024
Dataset provided by
HKU Data Repository
Authors
Ching Yan Kwok
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Datasets for the thesis titled "Monitoring and Regulation in Reading Comprehension among Chinese children: Exploring the Role of Comprehension Monitoring, Lexical Ambiguity Resolution and Reading Strategy". The thesis comprised three studies and aimed to provide a more comprehensive view of monitoring and regulation in reading and to examine their contributions to reading comprehension. The dataset for Study 1 contains cross-sectional and longitudinal data from junior primary school children on their general cognitive abilities, reading comprehension performance and cognitive-linguistic skills. The datasets for Studies 2 and 3 contain eye-movement data and data on the general cognitive abilities and reading-related cognitive-linguistic skills of grade 3 children.

Please refer to the README file for details of the datasets.
The Effects of Choice on the Reading Comprehension and Enjoyment of Children...
rdr.ucl.ac.uk
bin
Updated Jun 1, 2023
Cite
Myrofora Kakoulidou (2023). The Effects of Choice on the Reading Comprehension and Enjoyment of Children with Severe Inattention and no Attentional Difficulties: Key data analysed and discussed in the published paper [Dataset]. http://doi.org/10.5522/04/14816121.v1
Available download formats: bin
Unique identifier
https://doi.org/10.5522/04/14816121.v1
Dataset updated
Jun 1, 2023
Dataset provided by
University College London
Authors
Myrofora Kakoulidou
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This dataset includes the key data analysed and discussed in the paper titled 'The Effects of Choice on the Reading Comprehension and Enjoyment of Children with Severe Inattention and no Attentional Difficulties' published in the Research on Child and Adolescent Psychopathology journal. Data were collected as part of a larger PhD study of Myrofora Kakoulidou, under the supervision of Professor Jane Hurry, Dr Frances Le Cornu Knight and Dr Roberto Filippi.

Variables included
- Codenumber = Participant ID
- Sex = Biological sex
- SEN status = Whether children had (or did not have) an SEN statement or an Education and Health Care plan
- Reading Motivation Questions = MRQ items (38 items)
- MRQ_NoMissing = Total scores on MRQ after imputation
- SentenceCompletion_NGRT = Children's answers to the Sentence Completion items of the NGRT (20 items)
- TextQuestions_NGRT = Children's answers to the Reading Comprehension questions across the three passages (28 items)
- NewGroupReadingScores = Total scores across the 48 NGRT items
- NGRT_Standardised = Raw scores converted to standardised scores by age
- Readingscores_Choice = Children's reading scores in the Choice condition
- Readingscores_NoChoice = Children's reading scores in the No Choice condition
- Readingdifference_Final = Difference scores for Reading Comprehension (Reading comprehension scores in the No Choice condition subtracted by reading comprehension scores in the Choice condition)
- Readingdenjoyment_Final = Difference scores for Reading Enjoyment (Reading Enjoyment scores in the No Choice condition subtracted by reading enjoyment scores in the Choice condition)
- Enjoyment_NoChoice = Children's enjoyment scores in the No Choice condition
- Enjoyment_Choice = Children's enjoyment scores in the Choice condition
- TeacherConnersItems = Teachers' ratings on the Conners 3 scale (39 items in total, short version)
- TeacherRatedInattention = Teachers' ratings of children's inattention
- OmissionErrors = Raw scores on Omission errors in AULA
- RTV (Reaction Time Variability) = Raw scores on RTV in AULA
- TeacherRatedInattention_Trichotomised = Trichotomised scores on Teacher-rated Inattention
- OmissionErrors_Trichotomised = Trichotomised scores on Omission errors
- RTV_Trichotomised = Trichotomised scores on RTV
Descriptive Statistics, Intraclass Correlations, and Univariate Genetic...
plos.figshare.com
xls
Updated May 30, 2023
Cite
Brooke Soden; Micaela E. Christopher; Jacqueline Hulslander; Richard K. Olson; Laurie Cutting; Janice M. Keenan; Lee A. Thompson; Sally J. Wadsworth; Erik G. Willcutt; Stephen A. Petrill (2023). Descriptive Statistics, Intraclass Correlations, and Univariate Genetic (a²), Shared Environmental (c²), and Nonshared Environmental (e²) Components of Variance for Reading Comprehension. [Dataset]. http://doi.org/10.1371/journal.pone.0113807.t001
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0113807.t001
Dataset updated
May 30, 2023
Dataset provided by
PLOS ONE
Authors
Brooke Soden; Micaela E. Christopher; Jacqueline Hulslander; Richard K. Olson; Laurie Cutting; Janice M. Keenan; Lee A. Thompson; Sally J. Wadsworth; Erik G. Willcutt; Stephen A. Petrill
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Note. Ns are for individuals.*p
Supporting data for The relationship between musical training and reading...
datahub.hku.hk
Updated Aug 26, 2024
Cite
Wai Sing Chan (2024). Supporting data for The relationship between musical training and reading comprehension in children with Autism Spectrum Disorder [Dataset]. http://doi.org/10.25442/hku.26715601.v1
Unique identifier
https://doi.org/10.25442/hku.26715601.v1
Dataset updated
Aug 26, 2024
Dataset provided by
HKU Data Repository
Authors
Wai Sing Chan
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
The present study examined the reading comprehension performance of children with Autism Spectrum Disorder (ASD) and compared it with that of their typically developing (TD) counterparts to identify variables that may relate to their reading comprehension performance. A series of six tests, namely a tone identification test, the Embedded Figures Test (EFT), a homograph reading test, a homophone test, a music test and a reading test, was conducted with native Cantonese-speaking ASD participants with and without musical training and their corresponding TD groups. The results are recorded in the file.
Data from: The NarrativeQA Reading Comprehension Challenge Dataset
github.com
gitee.com
Updated Dec 21, 2017
Cite
DeepMind (2017). The NarrativeQA Reading Comprehension Challenge Dataset [Dataset]. https://github.com/google-deepmind/narrativeqa
Dataset updated
Dec 21, 2017
Dataset provided by
DeepMind http://deepmind.com/
License
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This repository contains the NarrativeQA dataset. It includes the list of documents with Wikipedia summaries, links to full stories, and questions and answers.
Data from: Differences in FL Reading Comprehension Among High-, Middle-, and...
data.mendeley.com
Updated Apr 2, 2024
Cite
Abdel Salam El-Koumy (2024). Differences in FL Reading Comprehension Among High-, Middle-, and Low-Ambiguity Tolerance Students [Dataset]. http://doi.org/10.17632/k4tscc66zg.1
Unique identifier
https://doi.org/10.17632/k4tscc66zg.1
Dataset updated
Apr 2, 2024
Authors
Abdel Salam El-Koumy
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Florida
Description
The purpose of this study was to examine the differences in foreign language reading comprehension among high-, middle-, and low-ambiguity tolerance students.
Construction and Evaluation of Chinese Reading Comprehension Data Set for...
scidb.cn
Updated Sep 26, 2022
Cite
yang zi hang; Jin Hou (2022). Construction and Evaluation of Chinese Reading Comprehension Data Set for Security Field [Dataset]. http://doi.org/10.57760/sciencedb.j00133.00081
Unique identifier
https://doi.org/10.57760/sciencedb.j00133.00081
Dataset updated
Sep 26, 2022
Dataset provided by
Science Data Bank
Authors
yang zi hang; Jin Hou
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
We proposed a Chinese machine reading comprehension dataset for the security field (SecMRC), which addresses the lack of professional data to support machine reading comprehension research in this field. The dataset contains 2100 anti-terrorism and security domain news articles, 7300 extracted question-answer pairs, 2100 generative Q&A pairs, and a total of 4796264 characters. Tests were conducted on SecMRC using advanced reading comprehension models. The results show that the F1 of the extraction task reaches 72.5% and the average ROUGE-L of the generative task is 37.8%, both of which are significantly below the human level. SecMRC emphasizes domain knowledge and is challenging. It can effectively support research on machine reading comprehension technology in this field. The dataset construction method is general and can be extended to other professional fields.
Supporting data for Improving Early Reading Comprehension in...
datahub.hku.hk
Updated Jul 9, 2024
Cite
Kembell Lentejas (2024). Supporting data for Improving Early Reading Comprehension in Filipino-English Bilingual Children Through Oral Language Assessment and Intervention in the Philippines [Dataset]. http://doi.org/10.25442/hku.26112739.v1
Unique identifier
https://doi.org/10.25442/hku.26112739.v1
Dataset updated
Jul 9, 2024
Dataset provided by
HKU Data Repository
Authors
Kembell Lentejas
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Area covered
Philippines
Description
The description of each dataset is given below.
Study01 assessed Grades 1 to 3 bilingual children to identify the cognitive and linguistic skills that predict early reading comprehension. The data includes all the participants' scores in different cognitive and linguistic measures.
Study02 identified different subgroups of early bilingual readers using their word reading, oral language, and reading fluency skills. The data includes the participants' scores in different linguistic measures in their first and second languages.
Study 03 determined the efficacy of a multi-component oral language intervention for bilingual Grades 1 to 3 children. The data include scores of pretest and posttest measures of control and intervention groups.
Data from: CJRC Dataset
paperswithcode.com
Cite
Xingyi Duan; Baoxin Wang; Ziyue Wang; Wentao Ma; Yiming Cui; Dayong Wu; Shijin Wang; Ting Liu; Tianxiang Huo; Zhen Hu; Heng Wang; Zhiyuan Liu, CJRC Dataset [Dataset]. https://paperswithcode.com/dataset/cjrc
Authors
Xingyi Duan; Baoxin Wang; Ziyue Wang; Wentao Ma; Yiming Cui; Dayong Wu; Shijin Wang; Ting Liu; Tianxiang Huo; Zhen Hu; Heng Wang; Zhiyuan Liu
Description
The Chinese judicial reading comprehension (CJRC) dataset contains approximately 10K documents and almost 50K questions with answers. The documents come from judgment documents and the questions are annotated by law experts.