http://researchdatafinder.qut.edu.au/display/n10932
QUT Research Data Repository Dataset Resource available for download
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Verbal and Quantitative Reasoning GRE scores and percentiles were collected by querying the student database for the appropriate information. Any student records that were missing data such as GRE scores or grade point average were removed from the study before the data were analyzed. The GRE Scores of entering doctoral students from 2007-2012 were collected and analyzed. A total of 528 student records were reviewed. Ninety-six records were removed from the data because of a lack of GRE scores. Thirty-nine of these records belonged to MD/PhD applicants who were not required to take the GRE to be reviewed for admission. Fifty-seven more records were removed because they did not have an admissions committee score in the database. After 2011, the GRE’s scoring system was changed from a scale of 200-800 points per section to 130-170 points per section. As a result, 12 more records were removed because their scores were representative of the new scoring system and therefore were not able to be compared to the older scores based on raw score. After removal of these 96 records from our analyses, a total of 420 student records remained which included students that were currently enrolled, left the doctoral program without a degree, or left the doctoral program with an MS degree. To maintain consistency in the participants, we removed 100 additional records so that our analyses only considered students that had graduated with a doctoral degree. In addition, thirty-nine admissions scores were identified as outliers by statistical analysis software and removed for a final data set of 286 (see Outliers below).
Outliers
We used the automated ROUT method included in the PRISM software to test the data for the presence of outliers which could skew our data. The false discovery rate for outlier detection (Q) was set to 1%. After removing the 96 students without a GRE score, 432 students were reviewed for the presence of outliers. ROUT detected 39 outliers that were removed before statistical analysis was performed.
Sample
See detailed description in the Participants section.
Linear regression analysis was used to examine potential trends between GRE scores, GRE percentiles, normalized admissions scores or GPA and outcomes between selected student groups. The D’Agostino & Pearson omnibus and Shapiro-Wilk normality tests were used to test for normality regarding outcomes in the sample. The Pearson correlation coefficient was calculated to determine the relationship between GRE scores, GRE percentiles, admissions scores or GPA (undergraduate and graduate) and time to degree. Candidacy exam results were divided into students who either passed or failed the exam. A Mann-Whitney test was then used to test for statistically significant differences between mean GRE scores, percentiles, and undergraduate GPA and candidacy exam results. Other variables were also observed such as gender, race, ethnicity, and citizenship status within the samples.
Predictive Metrics. The input variables used in this study were GPA and scores and percentiles of applicants on both the Quantitative and Verbal Reasoning GRE sections. GRE scores and percentiles were examined to normalize variances that could occur between tests.
Performance Metrics. The output variables used in the statistical analyses of each data set were either the amount of time it took for each student to earn their doctoral degree, or the student’s candidacy examination result.
Note: To download this raster dataset, go to ArcGIS Open Data Set and click the download button, and under additional resources select any of the download options. Data can also be downloaded from the FSGeodata Clearinghouse. More information about rangeland productivity and the effects of drought is available in this StoryMap; additional drought and rangeland products from the Office of Sustainability and Climate are available in our Climate Gallery.
Time-enabled image service showing estimates of annual production of rangeland vegetation. Production data were generated using the Normalized Difference Vegetation Index (NDVI) from the Thematic Mapper Suite from 1984 to 2023 at 250 m resolution. The NDVI is converted to production estimates using two regression formulas depending on the level of the NDVI; there is one equation for lower values (and thus lower production values) and one for higher values. This raster dataset yields estimates of annual production of rangeland vegetation and should be useful for understanding trends and variability in forage resources. These results were then converted to Z-scores for easier comparison of annual relative productivity in coterminous U.S. rangelands, and for rapid display in online time-enabled applications. This Z-scores dataset as well as the raw lbs/acre data that the Z-scores were derived from can be downloaded from: https://data.fs.usda.gov/geodata/rastergateway/rangelands/index.php
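The conversion from raw lbs/acre values to Z-scores described above can be sketched as follows. This is only an illustration under stated assumptions: the description does not say whether the Z-scores are computed per pixel across the 1984-2023 record or which standard deviation is used, so the sketch assumes a per-pixel series and the population standard deviation. The function name and the example values are hypothetical and are not taken from the dataset.

```python
from statistics import mean, pstdev


def production_z_scores(annual_lbs_per_acre):
    """Return one Z-score per year for a single pixel's production series.

    Assumption: Z-scores are taken relative to the pixel's own multi-year
    mean and population standard deviation; the dataset description does
    not confirm this choice.
    """
    mu = mean(annual_lbs_per_acre)
    sigma = pstdev(annual_lbs_per_acre)
    if sigma == 0:
        # A constant series has no variability, so every year sits at the mean.
        return [0.0 for _ in annual_lbs_per_acre]
    return [(value - mu) / sigma for value in annual_lbs_per_acre]


# Made-up values for illustration only (lbs/acre for three years).
example_series = [800.0, 1000.0, 1200.0]
example_z = production_z_scores(example_series)
```

A positive Z-score marks a year with above-average production for that pixel and a negative one a below-average year, which is what makes years comparable across sites with very different absolute productivity.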
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Control and treatment test scores for college-level Introductory Macroeconomics. Treatment: students were rewarded for taking notes on assigned textbook chapters. Collected from students at a regional college in the southeastern United States. The study applies parallel randomized experiments with one control variable.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw data outputs 1-18
Raw data output 1. Differentially expressed genes in AML CSCs compared with GTCs as well as in TCGA AML cancer samples compared with normal ones. This data was generated based on the results of AML microarray and TCGA data analysis.
Raw data output 2. Commonly and uniquely differentially expressed genes in AML CSC/GTC microarray and TCGA bulk RNA-seq datasets. This data was generated based on the results of AML microarray and TCGA data analysis.
Raw data output 3. Common differentially expressed genes between training and test set samples of the microarray dataset. This data was generated based on the results of AML microarray data analysis.
Raw data output 4. Detailed information on the samples of the breast cancer microarray dataset (GSE52327) used in this study.
Raw data output 5. Differentially expressed genes in breast CSCs compared with GTCs as well as in TCGA BRCA cancer samples compared with normal ones.
Raw data output 6. Commonly and uniquely differentially expressed genes in breast cancer CSC/GTC microarray and TCGA BRCA bulk RNA-seq datasets. This data was generated based on the results of breast cancer microarray and TCGA BRCA data analysis. CSC and GTC are abbreviations of cancer stem cell and general tumor cell, respectively.
Raw data output 7. Differential and common co-expression and protein-protein interaction of genes between CSC and GTC samples. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis. CSC and GTC are abbreviations of cancer stem cell and general tumor cell, respectively.
Raw data output 8. Differentially expressed genes between AML dormant and active CSCs. This data was generated based on the results of AML scRNA-seq data analysis.
Raw data output 9. Uniquely expressed genes in dormant or active AML CSCs. This data was generated based on the results of AML scRNA-seq data analysis.
Raw data output 10. Intersections between the targeting transcription factors of AML key CSC genes and differentially expressed genes between AML CSCs vs GTCs and between dormant and active AML CSCs or the uniquely expressed genes in either class of CSCs.
Raw data output 11. Targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
Raw data output 12. CSC-specific targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
Raw data output 13. The protein-protein interactions between AML key CSC genes with themselves and their targeting transcription factors. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis.
Raw data output 14. The previously confirmed associations of genes having the highest targeting desirableness and CSC-specific targeting desirableness scores with AML or other cancers’ (stem) cells as well as hematopoietic stem cells. These data were generated based on a PubMed database-based literature mining.
Raw data output 15. Drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
Raw data output 16. CSC-specific drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
Raw data output 17. Candidate drugs for experimental validation. These drugs were selected based on their respective (CSC-specific) drug scores. CSC is the abbreviation of cancer stem cell.
Raw data output 18. Detailed information on the samples of the AML microarray dataset GSE30375 used in this study.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the raw questionnaire data collected from the virtual reality experiment.
Each row contains one participant's answers to all questionnaire items for one environment. Given that each participant experienced nine environments, the file contains 9 rows per participant. Environments are represented by a number (1-14), see explanation below. For questions included in the questionnaire, please see the file "Environment questionnaire."
Environments
Virtual environments in the experiment varied along 4 dimensions (location, building height, facade quality and number of people present). The following list explains the features of each environment.
B and S refer to the locations Bedok and Simei respectively.
H refers to building height and has three levels:
0 - uniformly tall
1 - tall at the back, low at the front
2 - uniformly low
F refers to facade quality and has two levels:
0 - low quality
1 - high quality
P refers to number of people present and has two levels:
0 - few
1 - many
Environments experienced by each group
Group A (participants 1-25, 51-52): 1, 2, 6, 7, 8, 10, 11, 12, 14
Group B (participants 26-50): 1, 3, 4, 5, 7, 8, 9, 13, 14
Environment quality
The column "Environment quality" refers to hypothesised environment quality. According to our hypothesis, environments with low buildings, high-quality facades and many people present would be the best environments, with each of these features being independently positive. Thus environment quality ranges from 0-3, with a score of 0 being achieved when all these features are at their "worst" (tall buildings, low-quality facades, and few people present), and a score of 3 being achieved when all features have their best possible value.
Note that building height has 3 possible values. In this case, tall buildings gain a score of 0, while low buildings gain a score of 1. Buildings that are tall at the back and low at the front gain a score of 0.5. Facade quality and presence of people both have possible values 0 and 1. Thus possible environment scores are 0, 0.5, 1, 1.5, 2, 2.5 and 3.
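The scoring rule described above can be sketched in a few lines. This is a minimal illustration, assuming the H, F and P codes are available as integers; the function and constant names are hypothetical and do not come from the dataset files.

```python
# Height contribution per H level, as described in the text:
# 0 = uniformly tall, 1 = tall at the back / low at the front, 2 = uniformly low.
HEIGHT_SCORE = {0: 0.0, 1: 0.5, 2: 1.0}


def environment_quality(h, f, p):
    """Return the hypothesised environment quality (0 to 3).

    h: building height level (0, 1 or 2)
    f: facade quality (0 = low, 1 = high)
    p: number of people present (0 = few, 1 = many)
    """
    if h not in HEIGHT_SCORE:
        raise ValueError("h must be 0, 1 or 2")
    if f not in (0, 1) or p not in (0, 1):
        raise ValueError("f and p must be 0 or 1")
    return HEIGHT_SCORE[h] + f + p


# Every combination of levels, to show the full set of possible scores.
all_scores = sorted({environment_quality(h, f, p)
                     for h in (0, 1, 2) for f in (0, 1) for p in (0, 1)})
```

Enumerating all twelve combinations reproduces the seven possible scores listed in the text, from 0 to 3 in steps of 0.5.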
Missing data
Some participants did not complete the questionnaire fully, and some technical errors also happened. For these reasons, there is some missing data.
https://doi.org/10.23668/psycharchives.4988
The current project examines, through two independent experiments, the role of neuroticism in evaluative conditioning. Evaluative conditioning (EC) is an effect in which repeated presentations of a conditioned stimulus (CS) with a positive or negative unconditioned stimulus (US) result in a valence transfer from the US to the CS. To further investigate interindividual differences in neuroticism on this effect, we introduced an uncertainty/ambivalence element that could help us capture the natural tendency of highly neurotic people to transfer negative valence. Experiment 1 presented an experimental manipulation at the US level by using ambivalent USs (i.e., a positive picture and a negative picture merged into one image), whereas Experiment 2 provided a reinforcement manipulation by presenting two CSs with positive USs in half of the presentations and with negative USs in the other half.
Datasets for: Bunghez, C., Rusu, A., De Houwer, J., Perugini, M., Boddez, Y., & Sava, F. A. (2023). The Moderating Role of Neuroticism on Evaluative Conditioning: Evidence From Ambiguous Learning Situations. Social Psychological and Personality Science, 0(0). https://doi.org/10.1177/19485506231191861
This file contains raw scores and computed scores used in the statistical analyses of Experiment 2.
https://www.icpsr.umich.edu/web/ICPSR/studies/7790/terms
This dataset includes test scores for over 40,000 students in 175 Irish primary schools that were selected and randomly assigned to a variety of testing treatments as part of a four-year study. The goal of this research effort was to assess the effects of standardized tests and test results on teachers, students, and parents, as well as on school policy. Northern Ireland was chosen because of its developed educational system (in which the English language is used) and its prior lack of standardized testing. During the course of this study, three main testing treatments were implemented in all classrooms in each primary school: (1) no testing was done, (2) norm referenced ability and attainment testing was done in basic curricular areas (English, Irish, and mathematics), but pupil performance data were not returned to the teachers, and (3) norm referenced ability and attainment testing was done, and pupils' raw scores, percentiles, and standard scores were returned to teachers. This dataset contains the norm referenced test scores gathered over the course of the four-year study for each of eight primary age-group cohorts. Parts 1-6 contain scores from students who were in grades 1-6, respectively, during the first year of the study. Part 7 contains scores from students who were in grade 2 in the fourth (last) year of the study, and Part 8 contains the scores from students who were in grade 3 during the last year of the study. Background variables for each student (e.g., treatment group, school type, sex served by school, location of school, size of school, type of administration of school, school identification number, and student's sex) are also included.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
The Current Population Survey Food Security Supplement (CPS-FSS) is the source of national and State-level statistics on food insecurity used in USDA's annual reports on household food security. The CPS is a monthly labor force survey of about 50,000 households conducted by the Census Bureau for the Bureau of Labor Statistics. Once each year, after answering the labor force questions, the same households are asked a series of questions (the Food Security Supplement) about food security, food expenditures, and use of food and nutrition assistance programs. Food security data have been collected by the CPS-FSS each year since 1995. Four data sets that complement those available from the Census Bureau are available for download on the ERS website. These are available as ASCII uncompressed or zipped files. The purpose and appropriate use of these additional data files are described below: 1) CPS 1995 Revised Food Security Status data--This file provides household food security scores and food security status categories that are consistent with procedures and variable naming conventions introduced in 1996. This includes the "common screen" variables to facilitate comparisons of prevalence rates across years. This file must be matched to the 1995 CPS Food Security Supplement public-use data file. 2) CPS 1998 Children's and 30-day Food Security data--Subsequent to the release of the April 1999 CPS-FSS public-use data file, USDA developed two additional food security scales to describe aspects of food security conditions in interviewed households not captured by the 12-month household food security scale. This file provides three food security variables (categorical, raw score, and scale score) for each of these scales along with household identification variables to allow the user to match this supplementary data file to the CPS-FSS April 1998 data file. 
3) CPS 1999 Children's and 30-day Food Security data--Subsequent to the release of the April 1999 CPS-FSS public-use data file, USDA developed two additional food security scales to describe aspects of food security conditions in interviewed households not captured by the 12-month household food security scale. This file provides three food security variables (categorical, raw score, and scale score) for each of these scales along with household identification variables to allow the user to match this supplementary data file to the CPS-FSS April 1999 data file. 4) CPS 2000 30-day Food Security data--Subsequent to the release of the September 2000 CPS-FSS public-use data file, USDA developed a revised 30-day CPS Food Security Scale. This file provides three food security variables (categorical, raw score, and scale score) for the 30-day scale along with household identification variables to allow the user to match this supplementary data file to the CPS-FSS September 2000 data file. Food security is measured at the household level in three categories: food secure, low food security and very low food security. Each category is measured by a total count and as a percent of the total population. Categories and measurements are broken down further based on the following demographic characteristics: household composition, race/ethnicity, metro/nonmetro area of residence, and geographic region. The food security scale includes questions about households and their ability to purchase enough food and balanced meals, questions about adult meals and their size, frequency skipped, weight lost, days gone without eating, questions about children meals, including diversity, balanced meals, size of meals, skipped meals and hunger. Questions are also asked about the use of public assistance and supplemental food assistance. The food security scale is 18 items that measure insecurity. 
A score of 0-2 means a household is food secure, a score of 3-7 indicates low food security, and a score of 8-18 means very low food security. The scale and the data also report the frequency with which each item is experienced. Data are available as .dat files which may be processed in statistical software or through the United States Census Bureau's DataFerrett http://dataferrett.census.gov/. Data from 2010 onwards is available below and online. Data from 1995-2009 must be accessed through DataFerrett. DataFerrett is a data analysis and extraction tool to customize federal, state, and local data to suit your requirements. Through DataFerrett, the user can develop an unlimited array of customized spreadsheets that are as versatile and complex as the user's needs demand, then turn those spreadsheets into graphs and maps without any additional software.
Resources in this dataset:
Resource Title: December 2014 Food Security CPS Supplement. File Name: dec14pub.zip
Resource Title: December 2013 Food Security CPS Supplement. File Name: dec13pub.zip
Resource Title: December 2012 Food Security CPS Supplement. File Name: dec12pub.zip
Resource Title: December 2011 Food Security CPS Supplement. File Name: dec11pub.zip
Resource Title: December 2010 Food Security CPS Supplement. File Name: dec10pub.zip
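The raw-score thresholds stated in this description (0-2, 3-7 and 8-18 on the 18-item scale) can be sketched as a small classifier. This is a hedged illustration only: the function name is hypothetical, and the published USDA procedures for assigning status (for example for households without children) may differ from this simple mapping.

```python
def food_security_category(raw_score):
    """Map an 18-item raw score to the three categories named in the text.

    Thresholds are the ones given in the description above:
    0-2 food secure, 3-7 low food security, 8-18 very low food security.
    """
    if not isinstance(raw_score, int) or not 0 <= raw_score <= 18:
        raise ValueError("raw_score must be an integer from 0 to 18")
    if raw_score <= 2:
        return "food secure"
    if raw_score <= 7:
        return "low food security"
    return "very low food security"
```

For instance, a household with a raw score of 5 would fall in the low food security category under these thresholds.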
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This classroom-based research followed a post-test quasi-experimental design that sought to explore the feasibility of meme-based assessment among students and the outcome of testing critical thinking via said meme-based assessment. A meme-based question paper designed to test basic critical thinking skills like inferencing, identification of bias and classification via association was administered. Based on the feedback provided at the end of the test, the learners' perception of meme-based assessment and the outcome of the meme-based assessment on critical thinking were identified. The study presents key findings using clustered bar graphs and a pie chart, visually representing raw scores without statistical manipulation. Figure 1 displays student performance across test sections (inferencing, bias identification, association, and research). Figure 2 illustrates student perceptions of assessment efficiency based on meme elements, while Figure 3 depicts their comprehension speed perceptions. Figure 4 shows students' perceptions of the test's critical thinking assessment level. Finally, Figure 5 outlines meme skills identified by students, including previous knowledge, logic, creativity, and critical thinking, each scored from 1 to 5. These visual representations offer insights into students' performance, perceptions, and meme-related skills, aiding understanding without relying on complex statistical analysis.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The sample included in this dataset represents children who participated in a cross-sectional study, a smaller cohort of which was followed up as part of a longitudinal study reported elsewhere (Bull et al., 2021). In the original study, 347 children were recruited. As data was found to be likely missing completely at random (χ2 = 29.445, df = 24, p = .204, Little, 1998), listwise deletion was used, and 23 observations were deleted from the original dataset. This dataset includes three hundred and twenty-four participants that composed the final sample of this study (162 boys, Mage = 6.2 years, SDage = 0.3 years). Children in this sample were in their second year of kindergarten (i.e., the year before starting primary school) in Singapore. The dataset includes children's sociodemographic information (i.e., age and sex) and performance on different mathematical skills. Children were assessed on a computer-based 0-100 number line task and on the Mathematical Reasoning and Numerical Operations subtests from the Wechsler Individual Achievement Test II (WIAT II). The initial variables recorded on the dataset were children's estimates on each of the target numbers included on the 0-100 number line task, and their accuracy for both subtests of the WIAT II. Several more variables were created based on these original ones. 
The variables included in the dataset are:
Age = Child’s age (in months)
Sex = Boy/Girl (parent reported; boy = 1, girl = 2)
Maths_reason = Mathematical reasoning (Math Reasoning subtest from the Wechsler Individual Achievement Test II)
Num_Ops = Numerical Operations (Numerical Operations subtest from the Wechsler Individual Achievement Test II)
Mathematical_achievement = Mathematical achievement (Composite score created by adding the raw scores from the Numerical Operations and Mathematical Reasoning subtests from the Wechsler Individual Achievement Test II)
P3 to P96 = Placement of the estimate on the 0-100 number line for each respective target number (i.e., P3 corresponds to the placement of the estimate provided when the target number was 3)
NLE100PAE = 0-100 number line (Percent absolute error)
NP100_Corr = Correlation of individual estimates to target numbers (Spearman’s correlation; p > .05 = 0, p < .05 = 1)
NP100LinAICc = AICc value obtained for the linear model (9999 = model cannot be fitted)
NP100LogAICc = AICc value obtained for the logarithmic model (9999 = model cannot be fitted)
NP100PowerAICc = AICc value obtained for the unbounded power model (9999 = model cannot be fitted)
NP1001cycleAICc = AICc value obtained for the one-cycle power model (9999 = model cannot be fitted)
NP1002cycleAICc = AICc value obtained for the two-cycle power model (9999 = model cannot be fitted)
Best_fit_NP100_repshift = Best fitting model based on the representational shift account (0 = model cannot be fitted, 1 = linear, 2 = logarithmic)
AICc_bestmodel_repshift = AICc value of the best fitting model based on the representational shift account
AICc_diff_repshift = AICc difference (ΔAICc) between both models (i.e., linear and logarithmic) based on the representational shift account
AICc_diff_cat_repshift = categorical value created based on AICc_diff_repshift (0 = model cannot be fitted, 1 = best fitting model does not have strong support (ΔAICc < 2), 2 = best fitting model has strong support (ΔAICc > 2))
Best_fit_NP100_propjudg = Best fitting model based on the proportional judgment account (0 = model cannot be fitted, 3 = unbounded power model, 4 = one-cycle power model, 5 = two-cycle power model)
AICc_bestmodel_propjudg = AICc value of the best fitting model based on the proportional judgment account
AICc_diff_propjudg_unb = AICc difference (ΔAICc) between the best fitting model based on the proportional judgment account and the unbounded power model
AICc_diff_propjudg_1cyc = AICc difference (ΔAICc) between the best fitting model based on the proportional judgment account and the one-cycle power model
AICc_diff_propjudg_2cyc = AICc difference (ΔAICc) between the best fitting model based on the proportional judgment account and the two-cycle power model
AICc_diff_cat_propjudg = categorical value created based on AICc differences between the best fitting model and the following one based on the proportional judgment account (0 = model cannot be fitted, 1 = best fitting model does not have strong support (ΔAICc < 2), 2 = best fitting model has strong support (ΔAICc > 2))
Best_fit_NP100_between = Best fitting model when comparing all models to each other (0 = model cannot be fitted, 1 = linear, 2 = logarithmic, 3 = unbounded power model, 4 = one-cycle power model, 5 = two-cycle power model)
AICc_bestmodel_between = AICc value of the best fitting model from comparing all models to each other
AICc_diff_linear_NP100 = AICc difference (ΔAICc) between the best fitting model based on comparing all models to each other and the linear model
AICc_diff_log_NP100 = AICc difference (ΔAICc) between the best fitting model based on comparing all models to each other and the logarithmic model
AICc_diff_power_NP100 = AICc difference (ΔAICc) between the best fitting model based on comparing all models to each other and the unbounded power model
AICc_diff_1cycle_NP100 = AICc difference (ΔAICc) between the best fitting model based on comparing all models to each other and the one-cycle power model
AICc_diff_2cycle_NP100 = AICc difference (ΔAICc) between the best fitting model based on comparing all models to each other and the two-cycle power model
AICc_diff_cat_between = categorical value created based on AICc differences between the best fitting model and the following one based on the comparison of all models to each other (0 = model cannot be fitted, 1 = best fitting model does not have strong support (ΔAICc < 2), 2 = best fitting model has strong support (ΔAICc > 2))
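The way the categorical ΔAICc variables relate to the per-model AICc values can be sketched as below. This is an illustration of the coding scheme in the variable list, not the authors' code. Two points are assumptions because the codebook does not cover them: a difference of exactly 2 is treated as "no strong support", and a case with fewer than two fitted models is coded 0. The function names and example values are hypothetical.

```python
CANNOT_FIT = 9999  # codebook value meaning "model cannot be fitted"


def best_model_and_delta(aicc_by_model):
    """Return (best model name, ΔAICc to the next-best fitted model).

    aicc_by_model maps a model name to its AICc value. Returns (None, None)
    if no model could be fitted and (name, None) if only one could.
    """
    fitted = {name: value for name, value in aicc_by_model.items()
              if value != CANNOT_FIT}
    if not fitted:
        return None, None
    ranked = sorted(fitted.items(), key=lambda item: item[1])
    best_name, best_value = ranked[0]
    if len(ranked) < 2:
        return best_name, None
    return best_name, ranked[1][1] - best_value


def support_category(delta):
    """Code ΔAICc as 0 (no comparison), 1 (ΔAICc < 2) or 2 (ΔAICc > 2)."""
    if delta is None:
        return 0  # assumption: no comparison possible is coded like "cannot be fitted"
    return 2 if delta > 2 else 1  # assumption: exactly 2 counts as not strong


# Made-up AICc values for one child under the representational shift account.
example_best, example_delta = best_model_and_delta(
    {"linear": 250.0, "logarithmic": 253.5})
example_category = support_category(example_delta)
```

Lower AICc means a better fit, so the best model is the one with the smallest value, and the gap to the runner-up is what the categorical variables summarise.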
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Production data were generated using the Normalized Difference Vegetation Index (NDVI) from the Thematic Mapper Suite from 1984 to 2023 at 250 m resolution. The NDVI is converted to production estimates using two regression formulas depending on the level of the NDVI; there is one equation for lower values (and thus lower production values) and one for higher values. This raster dataset yields estimates of annual production of rangeland vegetation and should be useful for understanding trends and variability in forage resources. These results were then converted to Z-scores for easier comparison of annual relative productivity in coterminous U.S. rangelands, and for rapid display in online time-enabled applications. This Z-scores dataset as well as the raw lbs/acre data that the Z-scores were derived from can be downloaded from: https://data.fs.usda.gov/geodata/rastergateway/rangelands/index.php
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This repository contains the complete raw data files collected for Paleczny & Wild et al. 2025: Characterizing the Cognitive and Mental Health Benefits of Exercise and Video Game Playing.
Files
1. brainbodystudy-preprocessed-data-final.csv - The final preprocessed dataset that joins all the individual questionnaires together, along with scoring, filtering, etc. If you want to start working with the data as described in the paper, you can use this file instead of the raw files below.
2. BBDemographic_Survey_June_19_2024_11_17.csv.bz - Contains the Qualtrics export of the demographics and health questionnaire (includes PHQ-2 and GAD-2).
3. brainbodystudy-cogs.csv.bz2 - Preprocessed Creyos data, where score features have been extracted. One row per assessment.
4. PAAQ_June_19_2024_08_20.csv.bz2 - Raw Qualtrics export of PAAQ (ref) questionnaire data.
5. VGQ_June_19_2024_08_22.csv.bz2 - Raw Qualtrics export of VGQ (ref) questionnaire data.
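Several of the files listed above are bzip2-compressed CSVs, which can be read without decompressing them to disk first. The following is a minimal sketch using only the Python standard library; the helper name is hypothetical, UTF-8 encoding is an assumption, and the column layout of the actual exports is not described here, so rows are returned as plain dictionaries of strings.

```python
import bz2
import csv


def read_bz2_csv(path):
    """Read a bzip2-compressed CSV file into a list of row dictionaries.

    Assumptions: the file is UTF-8 encoded and its first line holds the
    column names. All values are returned as strings.
    """
    with bz2.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


# Example call (assumes the file has been downloaded to the working directory):
# rows = read_bz2_csv("brainbodystudy-cogs.csv.bz2")
```

Loading everything into a list keeps the sketch simple; for large exports, iterating over the `csv.DictReader` inside the `with` block would avoid holding all rows in memory.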
The dataset contains all raw data and corresponding standard scores used for analyses in the research described by Walda, Weenk, Van Weerdenburg, and Bosman (in preparation). Also, corresponding edited scores of the Attention Concentration Test (ACT) are displayed. The raw scores of the ACT were edited using computer programs designed by the designers of the ACT.
The manuscript addresses attention skill, as assessed with the ACT in children with and without dyslexia. For a full description of the reading and spelling remediation program, and all measures see the manuscript of Experiment 1 (Walda, Van Weerdenburg, Van der Ven, & Bosman, 2022). Data of Experiment 1 were previously deposited in DANS-EASY (https://doi.org/10.17026/dans-258-kq8c).
An important finding of Experiment 1 is that many children with dyslexia (about 65%) were unable to finish 25 consecutive bars of the ACT at pre-test. Even at later moments of measurement within the nine months of the remediation program, when they had gotten more used to the task, many children failed to complete the ACT without making any error. Thus, regardless of whether they were exposed to the ACT training, children with dyslexia failed to complete the ACT and also hardly mastered the task of the ACT at a later moment. A prudent conclusion from the findings of Experiment 2 is that an adaptive and intensive training of the ACT led to better results on all ACT measures (i.e., number of bars completed without making errors, working speed, and distraction time). More interestingly, literacy progress was not affected by improvement of attention skill, which raises the question whether this null finding is specific to children with dyslexia or general to all children who are learning to read.
In the present (third) experiment, we focused on a comparison between children with and without dyslexia in terms of attention skill. Three questions were addressed:
1. Do children with dyslexia differ from children without in terms of performance on the ACT?
2. Does the ACT training affect ACT performance in children without dyslexia?
3. Does training of the ACT affect literacy performance of children without dyslexia?
The experimental design was a randomized controlled trial, using two moments of measurement administered in a one-to-one assessment setting. The pretest (T1) took place prior to all interventions. For the group of children with dyslexia, a follow-up measurement was administered after three months of reading and spelling remediation (T2). For the group of children without dyslexia, a follow-up measurement was administered after seven regular school weeks (T2). In the README.pdf file, the process of collecting data is described, followed by an overview and description of variables included in the data set.
Comparing the datasets of Experiment 1 and Experiment 3, two main differences appear:
1. The dataset of Experiment 3 contains data of both participants with and without a dyslexia diagnosis, whereas the dataset of Experiment 1 contains only data of participants with a dyslexia diagnosis.
2. The dataset of Experiment 3 contains data on two measurement moments (pre and post ACT training), whereas the dataset of Experiment 1 contains data on four moments of measurement.
The Area Deprivation Index (ADI) can show where areas of deprivation and affluence exist within a community. The ADI is calculated from 17 indicators from the American Community Survey (ACS); it has been well studied in the peer-reviewed literature since 2003 and used for 20 years by the Health Resources and Services Administration (HRSA). High levels of deprivation have been linked to health outcomes such as 30-day hospital readmission rates, cardiovascular disease deaths, cervical cancer incidence, cancer deaths, and all-cause mortality. The 17 indicators of the ADI encompass income, education, employment, and housing conditions at the Census Block Group level.
The ADI is available on BigQuery for release years 2018-2020 and is reported as a percentile from 0 to 100%, with 50% indicating a "middle of the nation" percentile. Data is provided at the county, ZIP, and Census Block Group levels. Neighborhood and racial disparities occur when some neighborhoods have high ADI scores and others have low scores. A low ADI score indicates affluence or prosperity. A high ADI score is indicative of high levels of deprivation. Raw ADI scores and additional statistics and dataviz can be seen in this ADI story with a BroadStreet free account.
Much of the ADI research and popularity would not be possible without the excellent work of Dr. Amy Kind and colleagues at HIPxChange and at The University of Wisconsin Madison.
This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. The purpose of this repository is to store the datasets found that were used in some of the studies that served as research material for this Master's thesis. Also, the datasets used in the experimental part of this work are included.
Below are the datasets specified, along with the details of their references, authors, and download sources.
----------- STS-Gold Dataset ----------------
The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.
Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.
File name: sts_gold_tweet.csv
----------- Amazon Sales Dataset ----------------
This dataset contains ratings and reviews for 1K+ Amazon products, as listed on the official Amazon website. The data was scraped in January 2023 from the official Amazon website.
Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)
Features:
License: CC BY-NC-SA 4.0
File name: amazon.csv
----------- Rotten Tomatoes Reviews Dataset ----------------
This rating inference dataset is a sentiment classification dataset, containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5,331 rows contain only negative samples and the last 5,331 rows contain only positive samples, so the data should be shuffled before use.
This data is collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).
Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics
File name: data_rt.csv
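Because the rows are ordered by class, a reproducible shuffle is a natural first step. The sketch below uses only the standard library; the function name load_shuffled is hypothetical, and it assumes the CSV is UTF-8 encoded and starts with a header row, which the description does not confirm.

```python
import csv
import random


def load_shuffled(path, seed=0):
    """Read a CSV with a header row and return (header, shuffled_rows)."""
    # Assumption: the first row is a header naming the columns.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        rows = list(reader)
    # A seeded Random instance keeps the shuffle reproducible across runs
    # without touching the global random state.
    random.Random(seed).shuffle(rows)
    return header, rows
```

Calling load_shuffled("data_rt.csv", seed=42) twice returns the rows in the same mixed order both times, which makes train/test splits repeatable.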
----------- Preprocessed Dataset Sentiment Analysis ----------------
Preprocessed Amazon product review data for the Gen3EcoDot (Alexa), scraped entirely from amazon.in.
Stemmed and lemmatized using nltk.
Sentiment labels are generated using TextBlob polarity scores.
The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).
DOI: 10.34740/kaggle/dsv/3877817
Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }
This dataset was used in the experimental phase of my research.
File name: EcoPreprocessed.csv
----------- Amazon Earphones Reviews ----------------
This dataset consists of 9930 Amazon reviews with star ratings for the 10 latest (as of mid-2019) Bluetooth earphone devices, intended for learning how to train a machine learning model for sentiment analysis.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)
License: U.S. Government Works
Source: www.amazon.in
File name (original): AllProductReviews.csv (contains 14337 reviews)
File name (edited - used for my research) : AllProductReviews2.csv (contains 9930 reviews)
----------- Amazon Musical Instruments Reviews ----------------
This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review - unix time), reviewTime (time of the review - raw) and division (manually added - categorical label generated using overall score).
Source: http://jmcauley.ucsd.edu/data/amazon/
File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)
File name (edited - used for my research) : Musical_instruments_reviews2.csv (contains 7137 reviews)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview: This is a large-scale dataset with impedance and signal loss data recorded on volunteer test subjects using low-voltage, alternating-current sinusoidal signals. The signal frequencies range from 50 kHz to 20 MHz.
Applications: This dataset is intended to support investigation of the human body as a signal propagation medium, capturing how the properties of the human body (age, sex, composition, etc.), the measurement locations, and the signal frequencies affect signal loss over the human body.
Overview statistics:
Number of subjects: 30
Number of transmitter locations: 6
Number of receiver locations: 6
Number of measurement frequencies: 19
Input voltage: 1 V
Load resistance: 50 ohm and 1 megaohm
Measurement group statistics:
Height: 174.10 (7.15)
Weight: 72.85 (16.26)
BMI: 23.94 (4.70)
Body fat %: 21.53 (7.55)
Age group: 29.00 (11.25)
Male/female ratio: 50%
Included files:
experiment_protocol_description.docx - protocol used in the experiments
electrode_placement_schematic.png - schematic of placement locations
electrode_placement_photo.jpg - photo of the experiment on a volunteer subject
RawData - the full measurement results and experiment info sheets
all_measurements.csv - the most important results extracted to .csv
all_measurements_filtered.csv - same, but after z-score filtering
all_measurements_by_freq.csv - the most important results extracted to .csv, single frequency per row
all_measurements_by_freq_filtered.csv - same, but after z-score filtering
summary_of_subjects.csv - key statistics on the subjects from the experiment info sheets
process_json_files.py - script that creates .csv from the raw data
filter_results.py - outlier removal based on z-score
plot_sample_curves.py - visualization of a randomly selected measurement result subset
plot_measurement_group.py - visualization of the measurement group
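For readers unfamiliar with z-score filtering, the idea behind the filtered files can be sketched as follows. This is not the code of filter_results.py: the function name, the threshold of 3.0, and the use of the population standard deviation are all assumptions made for illustration.

```python
import statistics


def zscore_filter(values, threshold=3.0):
    """Keep the values whose absolute z-score is at most `threshold`."""
    mean = statistics.fmean(values)
    # Population standard deviation; the dataset's own script may differ.
    stdev = statistics.pstdev(values)
    if stdev == 0:
        # All values are identical, so nothing can be an outlier.
        return list(values)
    return [v for v in values if abs(v - mean) / stdev <= threshold]
```

A value is dropped when it lies more than `threshold` standard deviations from the mean of its column; in practice the filter would be applied per measurement column rather than to the whole table at once.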
CSV file columns:
subject_id - participant's random unique ID
experiment_id - measurement session's number for the participant
height - participant's height, cm
weight - participant's weight, kg
BMI - body mass index, computed from the values above
body_fat_% - body fat composition, as measured by bioimpedance scales
age_group - age rounded to 10 years, e.g. 20, 30, 40 etc.
male - 1 if male, 0 if female
tx_point - transmitter point number
rx_point - receiver point number
distance - distance, in relative units, between the tx and rx points. Not scaled in terms of participant's height and limb lengths!
tx_point_fat_level - transmitter point location's average fat content metric. Not scaled for each participant individually.
rx_point_fat_level - receiver point location's average fat content metric. Not scaled for each participant individually.
total_fat_level - sum of rx and tx fat levels
bias - constant term to simplify data analytics, always equal to 1.0
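The BMI column is described as derived from the height and weight columns. Assuming the standard definition (weight in kilograms divided by the square of height in metres), the computation can be sketched as below; the function name is hypothetical.

```python
def bmi(height_cm, weight_kg):
    """Body mass index from height in centimetres and weight in kilograms."""
    # The height column is in cm, so convert to metres first.
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)
```

This can be used to sanity-check the BMI column against the height and weight columns of the same row.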
CSV file columns, frequency-specific:
tx_abs_Z_... - transmitter-side impedance, as computed by the process_json_files.py script from the voltage drop
rx_gain_50_f_... - experimentally measured gain on the receiver, in dB, using 50 ohm load impedance
rx_gain_1M_f_... - experimentally measured gain on the receiver, in dB, using 1 megaohm load impedance
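The gain columns are given in dB. If they follow the usual voltage-gain convention of 20·log10(V_rx / V_tx), which is an assumption here and not stated in the dataset description, they can be converted to and from an amplitude ratio as sketched below. The function names are hypothetical.

```python
import math


def gain_db_to_voltage_ratio(gain_db):
    """Convert a voltage gain in dB to the amplitude ratio V_rx / V_tx."""
    # Assumes the 20*log10 (voltage) convention rather than 10*log10 (power).
    return 10.0 ** (gain_db / 20.0)


def voltage_ratio_to_gain_db(ratio):
    """Convert an amplitude ratio V_rx / V_tx to a voltage gain in dB."""
    return 20.0 * math.log10(ratio)
```

With the 1 V input voltage listed in the overview statistics, the amplitude ratio under this convention is numerically equal to the received amplitude in volts.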
Acknowledgments: The dataset collection was funded by the Latvian Council of Science, project “Body-Coupled Communication for Body Area Networks”, project No. lzp-2020/1-0358.
References: For more detailed information, see this article: J. Ormanis, V. Medvedevs, A. Sevcenko, V. Aristovs, V. Abolins, and A. Elsts. Dataset on the Human Body as a Signal Propagation Medium for Body Coupled Communication. Submitted to Elsevier Data in Brief, 2023.
Contact information: info@edi.lv
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
MS Excel table containing Z-scores of all parameters to build the multiparametric phenotypic signatures as described in the main manuscript under Methods, section 'Assembly of phenotypic HC signatures'.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
physical tests, updrs
http://researchdatafinder.qut.edu.au/display/n10932
QUT Research Data Repository Dataset Resource available for download