The interview data was gathered for a project that investigated the practices of instructors who use quantitative data to teach undergraduate courses within the Social Sciences. The study was undertaken by employees of the University of California, Santa Barbara (UCSB) Library, who participated in this research project with 19 other colleges and universities across the U.S. under the direction of Ithaka S+R. Ithaka S+R is a New York-based research organization, which, among other goals, seeks to develop strategies, services, and products to meet evolving academic trends to support faculty and students.
The Social Sciences have long been known for valuing the contextual component of data and are increasingly embracing quantitative and computational approaches to research in response to the prevalence of data literacy skills needed to navigate both personal and professional contexts. Thus, this study becomes particularly timely to identify current instructors’ practi..., The project followed a qualitative and exploratory approach to understand current practices of faculty teaching with data. The study was IRB approved and was declared exempt by UCSB’s Office of Research in July 2020 (Protocol 1-20-0491).
The identification and recruitment of potential participants took into account the selection criteria pre-established by Ithaka S+R: a) instructors of courses within the Social Sciences, considering the field as broadly defined, and making the best judgment in cases where the discipline intersects with other fields; b) instructors who teach undergraduate courses or courses where most of the students are at the undergraduate level; c) instructors of any rank, including adjuncts and graduate students, as long as they were listed as instructors of record of the selected courses; d) instructors who teach courses where students engage with quantitative/computational data.
The sampling process followed a combination of strategies to more easily identify instructo..., The data folder contains 10 PDF files with de-identified transcriptions of the interviews, along with PDF files of the recruitment email and the interview guide.
https://creativecommons.org/publicdomain/zero/1.0/
It's no secret that US university students often graduate with debt repayment obligations that far outstrip their employment and income prospects. While it's understood that students from elite colleges tend to earn more than graduates from less prestigious universities, the finer relationships between future income and university attendance are quite murky. In an effort to make educational investments less speculative, the US Department of Education has matched information from the student financial aid system with federal tax returns to create the College Scorecard dataset.
Kaggle is hosting the College Scorecard dataset in order to facilitate shared learning and collaboration. Insights from this dataset can help make the returns on higher education more transparent and, in turn, more fair.
Here's a script showing an exploratory overview of some of the data.
college-scorecard-release-*.zip contains a compressed version of the same data available through Kaggle Scripts.
It consists of three components:
New to data exploration in R? Take the free, interactive DataCamp course, "Data Exploration With Kaggle Scripts," to learn the basics of visualizing data with ggplot. You'll also create your first Kaggle Scripts along the way.
R code for original paper replication; R code for model improvements; datasets for replication and model improvements.
This dataset was created by Akshit Soneji
Groundbreaking biomedical research requires access to cutting edge scientific resources; however such resources are often invisible beyond the laboratories or universities where they were developed. eagle-i is a national research resource discovery platform to help biomedical scientists search for and find previously invisible, but highly valuable, resources. Resource descriptions collected at each participating institution are freely available as linked open data.
https://www.icpsr.umich.edu/web/ICPSR/studies/9585/terms
This project examined different aspects of campus crime -- specifically, the prevalence of crimes among college students, whether the crime rate was increasing or decreasing on college campuses, and the factors related to campus crime. Researchers made the assumption that crimes committed by and against college students were likely to be related to drug and alcohol use. Specific questions designed to be answered by the data include: (1) Do students who commit crimes differ in their use of drugs and alcohol from students who do not commit crimes? (2) Do students who are victims of crimes differ in their use of drugs and alcohol from students who are not victims? (3) How do multiple offenders differ from single offenders in their use of drugs and alcohol? (4) How do victims of violent crimes differ from victims of nonviolent crimes in their use of drugs and alcohol? (5) What types of student crimes are more strongly related to drug or alcohol use than others? (6) Other than drug and alcohol use, in what ways can victims and perpetrators of crimes be differentiated from students who have had no direct experiences with crime? Variables include basic demographic information, academic information, drug use information, and experiences with crime since becoming a student.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The University of Florida Herbarium (FLAS) in the Florida Museum of Natural History contains approximately 500,000 specimens of vascular plants, bryophytes, lichens, fungi, and wood, with the earliest specimens dating to the early to mid-1800s. The FLAS acronym is the standard international abbreviation for the Herbarium, derived from its early association with the Florida Agricultural Experiment Station. The herbarium was established in 1891 by Peter H. Rolfs of Florida Agricultural College in Lake City, and later moved to the University of Florida in Gainesville in 1906. The vascular plant collection (ca. 320,000 specimens) has an excellent representation of the USA (esp. Florida and the southeastern USA), the West Indies (esp. Hispaniola), and other Neotropical areas. The bryophyte collection (ca. 70,000 specimens) and lichen collection (ca. 16,000 specimens) are worldwide in scope (esp. Florida and Neotropical areas such as Costa Rica, Venezuela, and Brazil). The wood collection (ca. 16,000 specimens) is worldwide in scope (esp. tropical woods). The algal collection includes ca. 3,500 specimens, mainly from Florida. The Fungal Herbarium contains ca. 55,000 specimens (primarily non-lichenized fungi and slime molds). This digital dataset serves the vascular plant collection, of which ca. 2/3 are digitized, including ca. 500 type specimens (holo-, lecto-, iso-, neo-, or epi-types). The herbarium's digitization effort has been supported by several institutions, including the Florida Museum of Natural History, Institute of Food and Agricultural Sciences, National Science Foundation, United States Department of Agriculture (Hatch Project FLAS-HRB-04170), UF Libraries Digital Library Center, Florida Center for Library Automation, Florida Museum Associates, and the Andrew W. Mellon Foundation.
Data and metadata on stomatal conductance were collected through field measurements; data on particulate matter were collected through lab analyses of samples collected in the field. All data were managed using R v. 4.2.2.
This dataset reflects reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system. In order to protect the privacy of crime victims, addresses are shown at the block level only and specific locations are not identified. Should you have questions about this dataset, you may contact the Research & Development Division of the Chicago Police Department at 312.745.6071 or RandD@chicagopolice.org. Disclaimer: These crimes may be based upon preliminary information supplied to the Police Department by the reporting parties that have not been verified. The preliminary crime classifications may be changed at a later date based upon additional investigation and there is always the possibility of mechanical or human error. Therefore, the Chicago Police Department does not guarantee (either expressed or implied) the accuracy, completeness, timeliness, or correct sequencing of the information and the information should not be used for comparison purposes over time. The Chicago Police Department will not be responsible for any error or omission, or for the use of, or the results obtained from the use of this information. All data visualizations on maps should be considered approximate and attempts to derive specific addresses are strictly prohibited. The Chicago Police Department is not responsible for the content of any off-site pages that are referenced by or that reference this web page other than an official City of Chicago or Chicago Police Department web page. The user specifically acknowledges that the Chicago Police Department is not responsible for any defamatory, offensive, misleading, or illegal conduct of other users, links, or third parties and that the risk of injury from the foregoing rests entirely with the user. 
The unauthorized use of the words "Chicago Police Department," "Chicago Police," or any colorable imitation of these words or the unauthorized use of the Chicago Police Department logo is unlawful. This web page does not, in any way, authorize such use. Data are updated daily. The dataset contains more than 65,000 records/rows of data and cannot be viewed in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as Wordpad, to view and search. To access a list of Chicago Police Department - Illinois Uniform Crime Reporting (IUCR) codes, go to http://data.cityofchicago.org/Public-Safety/Chicago-Police-Department-Illinois-Uniform-Crime-R/c7ck-438e
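Because the export is too large to browse comfortably in a spreadsheet, one option is to stream the CSV row by row instead of opening it whole. The following is a minimal sketch, not part of the dataset's documentation: the column names (`Primary Type`, `Block`) and the three sample rows are made up for illustration and should be checked against the header of the actual export.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column):
    """Stream rows from an open CSV file and tally the values in one column."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:  # one row at a time; the full file is never held in memory
        counts[row[column]] += 1
    return counts


# Tiny in-memory stand-in for the exported file (hypothetical rows and columns).
sample = io.StringIO(
    "ID,Primary Type,Block\n"
    "1,THEFT,001XX N EXAMPLE ST\n"
    "2,BATTERY,002XX S EXAMPLE AVE\n"
    "3,THEFT,003XX W EXAMPLE ST\n"
)
counts = count_by_column(sample, "Primary Type")
print(counts.most_common())
```

For the real file, the `io.StringIO` object would be replaced by `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded CSV.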
This data release contains the input-data files and R scripts associated with the analysis presented in [citation of manuscript]. The spatial extent of the data is the contiguous U.S. The input-data files include one comma separated value (csv) file of county-level data, and one csv file of city-level data. The county-level csv (“county_data.csv”) contains data for 3,109 counties. This data includes two measures of water use, descriptive information about each county, three grouping variables (climate region, urban class, and economic dependency), and contains 18 explanatory variables: proportion of population growth from 2000-2010, fraction of withdrawals from surface water, average daily water yield, mean annual maximum temperature from 1970-2010, 2005-2010 maximum temperature departure from the 40-year maximum, mean annual precipitation from 1970-2010, 2005-2010 mean precipitation departure from the 40-year mean, Gini income disparity index, percent of county population with at least some college education, Cook Partisan Voting Index, housing density, median household income, average number of people per household, median age of structures, percent of renters, percent of single family homes, percent apartments, and a numeric version of urban class. The city-level csv (city_data.csv) contains data for 83 cities. This data includes descriptive information for each city, water-use measures, one grouping variable (climate region), and 6 explanatory variables: type of water bill (increasing block rate, decreasing block rate, or uniform), average price of water bill, number of requirement-oriented water conservation policies, number of rebate-oriented water conservation policies, aridity index, and regional price parity. The R scripts construct fixed-effects and Bayesian Hierarchical regression models. The primary difference between these models relates to how they handle possible clustering in the observations that define unique water-use settings. 
Fixed-effects models address possible clustering in one of two ways. In a "fully pooled" fixed-effects model, any clustering by group is ignored, and a single, fixed estimate of the coefficient for each covariate is developed using all of the observations. Conversely, in an unpooled fixed-effects model, separate coefficient estimates are developed only using the observations in each group. A hierarchical model provides a compromise between these two extremes. Hierarchical models extend single-level regression to data with a nested structure, whereby the model parameters vary at different levels in the model, including a lower level that describes the actual data and an upper level that influences the values taken by parameters in the lower level. The county-level models were compared using the Watanabe-Akaike information criterion (WAIC) which is derived from the log pointwise predictive density of the models and can be shown to approximate out-of-sample predictive performance. All script files are intended to be used with R statistical software (R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org) and Stan probabilistic modeling software (Stan Development Team. 2017. RStan: the R interface to Stan. R package version 2.16.2. http://mc-stan.org).
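As a rough illustration of the WAIC comparison described above, the sketch below computes WAIC from a matrix of pointwise log-likelihoods using the standard definition: the log pointwise predictive density minus a penalty equal to the summed posterior variance of each observation's log-likelihood, multiplied by -2. It is written in plain Python for readability and is not the release's R/Stan code; the function name `waic` and the input layout (`log_lik[s][i]` for posterior draw `s` and observation `i`) are assumptions made for this example.

```python
import math


def waic(log_lik):
    """WAIC on the deviance scale from log_lik[s][i] (draw s, observation i)."""
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters (variance-based penalty)
    for i in range(n_obs):
        col = [log_lik[s][i] for s in range(n_draws)]
        # log of the mean likelihood across draws, via log-sum-exp for stability
        m = max(col)
        lppd += m + math.log(sum(math.exp(v - m) for v in col) / n_draws)
        # sample variance of the log-likelihood across draws
        mean = sum(col) / n_draws
        p_waic += sum((v - mean) ** 2 for v in col) / (n_draws - 1)
    return -2.0 * (lppd - p_waic)


# Two identical draws: the penalty is zero, so WAIC is -2 times the summed log-likelihood.
flat = waic([[-1.0, -2.0], [-1.0, -2.0]])
# Draws that disagree incur a positive penalty for the same observations.
spread = waic([[-0.5, -2.5], [-1.5, -1.5]])
```

Lower values indicate better estimated out-of-sample predictive performance, so candidate models fitted to the same observations can be ranked by this quantity.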
Higher education plays a critical role in driving an innovative economy by equipping students with knowledge and skills demanded by the workforce. While researchers and practitioners have developed data systems to track detailed occupational skills, such as those established by the U.S. Department of Labor (DOL), much less effort has been made to document which of these skills are being developed in higher education at a similar granularity. Here, we fill this gap by presenting Course-Skill Atlas – a longitudinal dataset of skills inferred from over three million course syllabi taught at nearly three thousand U.S. higher education institutions. To construct Course-Skill Atlas, we apply natural language processing to quantify the alignment between course syllabi and detailed workplace activities (DWAs) used by the DOL to describe occupations. We then aggregate these alignment scores to create skill profiles for institutions and academic majors. Our dataset offers a large-scale representation of college education’s role in preparing students for the labor market. Overall, Course-Skill Atlas can enable new research on the source of skills in the context of workforce development and provide actionable insights for shaping the future of higher education to meet evolving labor demands, especially in the face of new technologies.
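The two steps described above (scoring each syllabus against each DWA, then averaging scores into institution-level profiles) can be sketched as follows. The description does not specify the alignment measure, so this example substitutes cosine similarity over simple word counts purely as a stand-in; the function names, institution label, syllabus texts, and DWA text are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def bag(text):
    """Lower-cased bag-of-words counts for a piece of text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def skill_profiles(syllabi, dwas):
    """Average syllabus-to-DWA alignment per institution.

    syllabi: list of (institution, syllabus_text); dwas: dict of name -> description.
    """
    dwa_bags = {name: bag(text) for name, text in dwas.items()}
    scores = defaultdict(lambda: defaultdict(list))
    for institution, text in syllabi:
        syllabus_bag = bag(text)
        for name, dwa_bag in dwa_bags.items():
            scores[institution][name].append(cosine(syllabus_bag, dwa_bag))
    return {
        inst: {name: sum(vals) / len(vals) for name, vals in per_dwa.items()}
        for inst, per_dwa in scores.items()
    }


# Hypothetical inputs for illustration only.
syllabi = [
    ("College A", "analyze data using statistical models"),
    ("College A", "write essays on modern poetry"),
]
dwas = {"analyze data": "analyze data"}
profiles = skill_profiles(syllabi, dwas)
```

The same aggregation could be keyed by academic major instead of institution by changing the first element of each tuple.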
We investigate whether the degree production and R&D activities of colleges and universities are related to the amount and types of human capital in the metropolitan areas where they are located. Our results indicate only a small positive relationship exists between a metropolitan area’s production and stock of human capital, suggesting that migration plays an important role in the geographic distribution of human capital. We also find that academic R&D activities increase local human capital levels, suggesting that spillovers from such activities can raise the demand for human capital. Consistent with these results, we show that metropolitan areas with more higher education activity tend to have a larger share of workers in high human capital occupations. Thus, this research indicates that colleges and universities can raise local human capital levels by increasing both the supply of and demand for skill.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data Notes:
* Data tables from 2016 onwards report school size by number of students. The previous “school classification” is no longer applicable.
* NSW School of Languages and Aurora College are included in the ‘Other’ category under Secondary Schools. These schools do not have full-time enrolment.
Data Source:
* Schools and Students: Statistical Bulletin. Centre for Education Statistics and Evaluation.
This dataset was created by Nithin Kumar K R
This large, international dataset contains survey responses from N = 12,570 students from 100 universities in 35 countries, collected in 21 languages. We measured anxieties (statistics, mathematics, test, trait, social interaction, performance, creativity, intolerance of uncertainty, and fear of negative evaluation), self-efficacy, persistence, and the cognitive reflection test, and collected demographics, previous mathematics grades, self-reported and official statistics grades, and statistics module details. Data reuse potential is broad, including testing links between anxieties and statistics/mathematics education factors, and examining instruments’ psychometric properties across different languages and contexts. Data and metadata are stored on the Open Science Framework website [https://osf.io/mhg94/].
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The research into the efficacy of blended EFL (English as a Foreign Language) teaching at the collegiate level holds significant importance for comprehending and implementing this novel pedagogical approach on a larger scale within universities. Within this domain, scholars have primarily concentrated on feedback mechanisms and quality assurance, while comparatively neglecting the advancement of college students' foreign language proficiency and the individual variances in the acceptance and rewards of blended teaching across distinct language proficiency groups. In light of this, leveraging micro-data from a provincial normal university's blended college English teaching, this study employs R 3.6.1 and RStudio to implement multiple linear regression and conditional quantile models so as to assess the impact of blended teaching on different language proficiency groups across four dimensions: listening, reading, writing, and overall language proficiency. To mitigate endogenous system risk, students admitted to the same major are selected as samples and their data undergo additional screening, excluding learners who failed the CET4 exam or did not participate in the CET6 exam. After employing purposive sampling techniques, a valid sample of 676 learners is established, comprising 363 learners in the experimental group for blended teaching intervention and 313 learners in the control group receiving traditional teaching. The study results indicate that the samples had random characteristics. The study findings suggest the following: (1) Blended teaching has a significant positive impact on enhancing the efficiency of English acquisition. (2) The effectiveness of blended teaching in improving learners' reading, listening, and writing skills follows a sequential decrease, exhibiting a downward trend as students' language ability increases.
This indicates that blended teaching facilitates the acquisition of foundational language knowledge; however, its impact on more advanced language processing abilities is limited. (3) Blended teaching demonstrates a range effect, primarily benefiting learners at the intermediate level and below in terms of enhancing their language proficiency. Conversely, learners at the medium-high and high proficiency levels derive comparatively lesser benefits from this approach. This study introduces a new methodology by employing multiple linear regression and conditional quantile models to assess the impact of blended teaching. This methodology not only enables us to examine the overall impact of blended teaching, but also allows assessment of its effect on different proficiency groups, helping to identify its effectiveness on individual learners across four dimensions.
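To make the idea behind conditional quantile models more concrete, the sketch below shows the check (pinball) loss that quantile regression minimizes. It is a deliberately reduced illustration, not the study's R code: it fits only a constant (no covariates such as the blended-teaching indicator), and the scores are invented. Under those assumptions, the constant that minimizes the loss at level `tau` is a sample quantile, which is why fitting the same loss with predictors describes effects at different points of the proficiency distribution rather than only at the mean.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * r when r >= 0, (tau - 1) * r when r < 0."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant(values, tau):
    """Data point that minimizes the total pinball loss at quantile level tau."""
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


# Invented test scores, for illustration only.
scores = [52, 58, 61, 64, 70, 75, 83]
low = best_constant(scores, 0.25)   # 58
mid = best_constant(scores, 0.50)   # 64, the median
high = best_constant(scores, 0.75)  # 75
```

A full conditional quantile model replaces the constant with a linear predictor and minimizes the same loss over its coefficients, one fit per quantile level.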
This study is part of a broader longitudinal study with two repeated measures, in which we assessed several mental health-related variables. In this study we aim to examine the within-person changes in the mental health state of college students with and without a history of mental disorders, during successive time cuts of Argentina’s lengthy mandatory quarantine, while adjusting for quarantine duration, main demographic factors (sex and age), and additional factors such as suicidal behavior history, loneliness, and region of residence.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These are the basic settings used in the simulations. This resulted in 6400 simulations.
Table 1.7.10: Regional distribution of R & D staff of scientific institutions outside universities (full-time equivalent) 1)
Dataset and R scripts accompanying a manuscript submitted to Scientific Reports by Matthew H. E. M. Browning and co-authors. The paper evaluated the impact of 3- to 4-week doses of daily virtual nature on college students' anxiety and depressive symptoms as well as ruminative thinking.