This dataset grew out of my effort to analytically find the best-value Master's programs in data science, given the statistics and rankings of each university. I acquired the majority of this data through Forbes. Although the data does not cover every university in last year's ranking, I went through the webpages of each of the top 250 universities to find the best-value programs and to check whether they offered a Data Science MS. I hope you use this data to make the best decision for yourself and to make a respectable upgrade in your career as a Data Scientist.
NOTE: Some of the metrics are skewed toward my own situation: I am a resident of New York State, so the listed cost of public universities in NY is lower than it would be for students from outside New York.
I also set a minimum GPA of 3.0 for any program whose university did not publish a minimum GPA for admission.
1) School Name: Name of Given University
2) State: US State Abbreviation
3) City: US City University is located in
4) Ranking: 2021 Forbes ranking of University
5) Online: 0 -> in-person program, 1 -> online
6) Total_Tuition_Cost: Cost of Tuition in USD
7) Program_Years_Full_Time: Number of years to finish program
8) Min_Quant_GRE_Score: Quant GRE score needed to be accepted (blank if not found)
9) Min_Undergraduate_GPA: GPA needed to be accepted into program
10) Median_Salary_10yr: 10 year Median salary of former graduates (Not Exclusive to DS Majors)
11) Need_GRE: 0-> Do not need to take GRE, 1-> must take GRE
12) Institution Type: Either 'Private' or 'Public'
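As a rough illustration of how the columns listed above might be used, the sketch below parses a few rows and ranks programs by median salary per tuition dollar. The sample rows, the `load_programs` and `value_ratio` helpers, and the choice of that ratio as a "value" measure are all assumptions made for illustration; only the column names come from the description.

```python
import csv
import io

# Hypothetical sample rows: the schools and numbers are made up for illustration.
# Only the column names follow the dataset description.
SAMPLE_CSV = """School Name,State,City,Ranking,Online,Total_Tuition_Cost,Program_Years_Full_Time,Min_Quant_GRE_Score,Min_Undergraduate_GPA,Median_Salary_10yr,Need_GRE,Institution Type
Example University A,NY,Albany,120,0,20000,1.5,,3.0,100000,0,Public
Example University B,CA,Irvine,40,1,60000,2,160,3.2,150000,1,Private
"""


def load_programs(text):
    """Parse CSV text into dicts with typed fields (blank GRE score becomes None)."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        gre = raw["Min_Quant_GRE_Score"]
        rows.append({
            "school": raw["School Name"],
            "online": raw["Online"] == "1",
            "tuition": float(raw["Total_Tuition_Cost"]),
            "min_gre": int(gre) if gre else None,
            "median_salary": float(raw["Median_Salary_10yr"]),
            "need_gre": raw["Need_GRE"] == "1",
        })
    return rows


def value_ratio(program):
    """Assumed value measure: 10-year median salary divided by total tuition."""
    return program["median_salary"] / program["tuition"]


programs = load_programs(SAMPLE_CSV)
ranked = sorted(programs, key=value_ratio, reverse=True)
for p in ranked:
    print(f"{p['school']}: {value_ratio(p):.2f}")
```

Other definitions of value (for example, weighting by ranking or program length) would fit the same structure by swapping out `value_ratio`.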
https://spdx.org/licenses/CC0-1.0.html
Over the last 20 years, statistics preparation has become vital for a broad range of scientific fields, and statistics coursework has been readily incorporated into undergraduate and graduate programs. However, a gap remains between the computational skills taught in statistics service courses and those required for the use of statistics in scientific research. Ten years after the publication of "Computing in the Statistics Curriculum," the nature of statistics continues to change, and computing skills are more necessary than ever for modern scientific researchers. In this paper, we describe research on the design and implementation of a suite of data science workshops for environmental science graduate students, providing students with the skills necessary to retrieve, view, wrangle, visualize, and analyze their data using reproducible tools. These workshops help to bridge the gap between the computing skills necessary for scientific research and the computing skills with which students leave their statistics service courses. Moreover, though targeted to environmental science graduate students, these workshops are open to the larger academic community. As such, they promote the continued learning of the computational tools necessary for working with data, and provide resources for incorporating data science into the classroom.
Methods: Surveys were collected from Carpentries-style workshops; the results are presented in the accompanying manuscript.
Pre- and post-workshop surveys for each workshop (Introduction to R, Intermediate R, Data Wrangling in R, Data Visualization in R) were collected via Google Form.
The surveys administered for the fall 2018 and spring 2019 academic year are included as pre_workshop_survey and post_workshop_assessment PDF files.
The raw versions of these data are included in the Excel files ending in survey_raw or assessment_raw.
The data files whose name includes survey contain raw data from pre-workshop surveys and the data files whose name includes assessment contain raw data from the post-workshop assessment survey.
The annotated RMarkdown files used to clean the pre-workshop surveys and post-workshop assessments are included as workshop_survey_cleaning and workshop_assessment_cleaning, respectively.
The cleaned pre- and post-workshop survey data are included in the Excel files ending in clean.
The summaries and visualizations presented in the manuscript are included in the analysis annotated RMarkdown file.
The Graduate Students and Postdoctorates in Science and Engineering survey is an annual census of all U.S. academic institutions granting research-based master's degrees or doctorates in science, engineering, and selected health fields as of fall of the survey year. The survey, sponsored by the National Center for Science and Engineering Statistics within the National Science Foundation and by the National Institutes of Health, collects the total number of master's and doctoral students, postdoctoral appointees, and doctorate-level nonfaculty researchers by demographic and other characteristics such as source of financial support. Results are used to assess shifts in graduate enrollment and postdoc appointments and trends in financial support.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Probabilistic models such as logistic regression, Bayesian classification, neural networks, and models for natural language processing are increasingly present in both undergraduate and graduate statistics and data science curricula due to their wide range of applications. In this article, we present a one-week course module for students in advanced undergraduate and applied graduate courses on variational inference, a popular optimization-based approach for approximate inference with probabilistic models. Our proposed module is guided by active learning principles: in addition to lecture materials on variational inference, we provide an accompanying class activity, an R shiny app, and guided labs based on real data applications of logistic regression and clustering documents using Latent Dirichlet Allocation with R code. The main goal of our module is to expose students to a method that facilitates statistical modeling and inference with large datasets. Using our proposed module as a foundation, instructors can adopt and adapt it to introduce more realistic case studies and applications in data science, Bayesian statistics, multivariate analysis, and statistical machine learning courses.
The Graduate Students and Postdoctorates in Science and Engineering survey is an annual census of all U.S. academic institutions granting research-based master's degrees or doctorates in science, engineering, and selected health fields as of fall of the survey year. The survey, sponsored by the National Center for Science and Engineering Statistics within the National Science Foundation and by the National Institutes of Health, collects the total number of master's and doctoral students, postdoctoral appointees, and doctorate-level nonfaculty researchers by demographic and other characteristics such as source of financial support. Results are used to assess shifts in graduate enrollment and postdoc appointments and trends in financial support. This dataset includes GSS assets for 2022.
I am studying for a master's degree in CS at the University of Illinois Urbana-Champaign and was curious how many students enroll in and graduate from the CS undergraduate and graduate programs.
Fortunately, there is a page from UIUC that has the latest years of data for undergraduate and graduate students.
https://cs.illinois.edu/about-us/statistics
The data includes how many students are enrolled in the CS undergraduate and graduate programs, how many of them graduated, which majors students combined with CS, how many Ph.D.s were awarded, and so on.
Thank you UIUC for providing statistics on https://cs.illinois.edu/about-us/statistics. All the numbers and data are from the website as of 3/10/2020.
It would be fun to find trends in UIUC CS, e.g., which majors have grown more popular with students over the years, or whether PhD and Master's degree enrollment is increasing or decreasing.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Frequency of reported types of studies and use of descriptive and inferential statistics (n = 216).
In 2024, it was projected that people in the United States with a Master's degree in Computer Science would have the highest average starting salary, at 85,403 U.S. dollars. People who held a Master's degree in Engineering were projected to have the second-highest starting salary, at 83,628 U.S. dollars.
An abundance of Masters
As higher education in the United States has become more common, and even expected, the number of Master's degrees awarded has increased. During the 1949-50 academic year, about 58,180 Master's degrees were awarded to students, with the vast majority being earned by male students. In the 2018-19 academic year, this figure increased to about 833,710 Master's degrees awarded, with the majority being earned by female students.
The right career
While Computer Science might have the highest projected starting pay for Master's degree holders, those with a Master's degree as a Physician Assistant had the highest mid-career median pay in 2021. Engineering continues to be one of the most popular fields for those seeking their Master's degree, and STEM fields continue to dominate in the number of Master's degrees awarded.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains the results of the experiments that I ran for my master's thesis. The full code (and more) can be found at https://github.com/dimitris93/msc-thesis
Weighted average tuition fees by field of study for full-time Canadian graduate students. Data are collected from all publicly funded Canadian degree-granting institutions.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The International STEM Graduate Student Survey assesses why international students are coming to the United States for their graduate studies, the challenges they have faced while studying in the US, their future career plans, and whether they wish to stay or leave the US upon graduation. According to the Survey of Earned Doctorates by the National Science Foundation and the National Center for Science and Engineering Statistics, international students accounted for over 40% of all US doctoral graduates in STEM in 2013. The factors that influence international students' decisions to study in the US and whether they will stay or leave are important to US economic competitiveness. We contacted graduate students (both domestic and international) in STEM disciplines from the top 10 universities ranked by the total number of enrolled international students. We estimate that we contacted approximately 15,990 students. Individuals were asked to take an online survey regarding their background, reasons for studying in the US, and whether they plan to stay or leave the US upon graduation. We received a total of 2,322 completed surveys, giving us a response rate of 14.5%. Of the completed surveys, 1,535 were from domestic students and 787 were from international students. Raw survey data are presented here.
Survey participants were contacted via Qualtrics to participate in this survey. The universe of this survey data set comprises all graduate students (Master's and PhD) in STEM disciplines from the following universities: Columbia University, University of Illinois-Urbana Champaign, Michigan State University, Northeastern University, Purdue University, University of Southern California, Arizona State University, University of California at Los Angeles, New York University, University of Washington at Seattle. Data are broken into 2 subsets, one for international STEM graduate students and one for domestic STEM graduate students; please see the respective files.
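The response-rate arithmetic in the description can be checked with a few lines. This is only a worked check of the figures quoted above (the contacted count is itself an estimate), and the variable names are illustrative.

```python
# Figures quoted in the survey description above.
contacted = 15_990       # approximate number of students contacted
domestic = 1_535         # completed surveys from domestic students
international = 787      # completed surveys from international students

completed = domestic + international
response_rate = completed / contacted

print(completed)                 # total completed surveys
print(f"{response_rate:.1%}")    # response rate as a percentage
```

The two subsets sum to the reported 2,322 completed surveys, and dividing by the estimated 15,990 contacts reproduces the stated 14.5% response rate.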
The joint UNESCO-OECD-Eurostat (UOE) data collection on formal education systems provides annual data on student participation and completion of educational programmes as well as data on personnel, cost and type of resources devoted to education. The reference period for non-monetary education data is the school year and for monetary data it is the calendar year. The International Statistics of Education and Training Systems – UNESCO-UIS/OECD/Eurostat (UOE) Questionnaire aims to provide the data required by international bodies, in addition to offering results at the national level. It is a synthesis and analysis operation that appears in the National Statistical Plan 2021-2024 (Prog. 8677) and is carried out by the S.G. of Statistics and Studies of the Ministry of Education and Vocational Training in collaboration with the Ministry of Universities and the National Institute of Statistics. Its purpose is to integrate the statistical information of the activity of the educational-training system in its different levels of education in order to meet the demands of international statistics, of the same name, requested by Eurostat, OECD and UNESCO-UIS. A selection of tables with data derived from this statistic is provided below, together with a presentation summary note:
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
In the rapidly moving proteomics field, a diverse patchwork of data analysis pipelines and algorithms for data normalization and differential expression analysis is used by the community. We generated a mass spectrometry downstream analysis pipeline (MS-DAP) that integrates both popular and recently developed algorithms for normalization and statistical analyses. Additional algorithms can be easily added in the future as plugins. MS-DAP is open-source and facilitates transparent and reproducible proteome science by generating extensive data visualizations and quality reporting, provided as standardized PDF reports. Second, we performed a systematic evaluation of methods for normalization and statistical analysis on a large variety of data sets, including additional data generated in this study, which revealed key differences. Commonly used approaches for differential testing based on moderated t-statistics were consistently outperformed by more recent statistical models, all integrated in MS-DAP. Third, we introduced a novel normalization algorithm that rescues deficiencies observed in commonly used normalization methods. Finally, we used the MS-DAP platform to reanalyze a recently published large-scale proteomics data set of CSF from AD patients. This revealed increased sensitivity, resulting in additional significant target proteins, which improved overlap with results reported in related studies and includes a large set of new potential AD biomarkers in addition to those previously reported.
Employment income (in 2019 and 2020) by detailed major field of study and highest certificate, diploma or degree, including work activity (full time full year, part time full year, or part year).
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for All Employees: Professional, Scientific, and Technical Services in Jackson, MS (MSA) (SMU28271406054000001A) from 2001 to 2024 about Jackson, science, MS, professional, services, employment, and USA.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for All Employees: Professional, Scientific, and Technical Services in Mississippi (SMU28000006054000001A) from 1990 to 2024 about science, MS, professional, services, employment, and USA.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is originally from Dhaka Stock Exchange Ltd. The objective of the dataset is to assign analytical report writing tasks to Summer 2020 students enrolled in ASDS18: Data Mining course in proceedings of the partial fulfillment of the requirements for the Professional Masters in Applied Statistics and Data Science (PMASDS) degree. This data set was collected using the Dhaka Stock Exchange API.
The datasets consist of several stock company predictor (independent) variables and one target (dependent) variable, Outcome. Independent variables include the last price, net asset value (NAV) of the stock, Earnings Per Share (EPS), price-to-earnings (P/E) ratio of the stock, paid-up capital per share, and so on.
It contains information on 374 listed companies from Dhaka Stock Exchange - DSE, Bangladesh. The outcome tested was Category, 258 tested positive and 500 tested negative. Therefore, there is one target (dependent) variable and 8 attributes.
Dr. Md. Rezaul Karim, Associate Professor, Department of Statistics, Jahangirnagar University, Dhaka, Bangladesh (2021) provided us with this dataset. Using the Dhaka Stock Exchange API this data set was collected to assign analytical report writing tasks to Summer 2020 students in proceedings of the partial fulfillment of the requirements for the Professional Masters in Applied Statistics and Data Science (PMASDS) degree.
Number of graduates who have the same destination as the location of study one year after graduation, and the associated retention rate, by graduates’ origin (same as or different from the location of study), field of study (Variant of the Classification of Instructional Programs (CIP) Canada 2021 for Science, technology, engineering and mathematics (STEM) and Business, humanities, health, arts, social science and education (BHASE) groupings), gender and age group.
https://creativecommons.org/publicdomain/zero/1.0/
Dataset Overview
This dataset provides a detailed and realistic simulation of Indian international students studying in the United States, their educational paths, job outcomes after graduation, university information, and visa approval statistics.
It can be used for:
📂 Files Included

| File Name | Description |
| --- | --- |
| indian_international_students_us.csv | Profiles of 10,000 Indian international students, including university, major, degree level, and study status. |
| job_outcomes_indian_students_us.csv | Job outcomes for students who graduated, including job title, company, salary, visa status, and time to first job. |
| universities_info_us.csv | Information about major US universities, including acceptance rates, GRE/TOEFL averages, and international student percentages. |
| visa_approval_stats.csv | Yearly visa approval and denial rates for F1, OPT, and H1B visa types from 2015 to 2023. |

✨ Potential Project Ideas
1. Predict job offer chances based on major, degree, and university.
2. Analyze salary distributions by major, company, and visa status.
3. Visualize visa approval trends over time.
4. Build a career advisory tool for international students.
✨ SQL Potential Project
1. List all students studying in "Computer Science" major.
2. Count how many students are currently enrolled vs graduated.
3. Find top 5 universities with the highest number of students.
4. Get the list of all students whose degree level is "Masters".
5. Find average salary of students who received a job offer.
6. List all companies that hired at least one student.
7. Find visa approval rate for each visa type (F1, OPT, H1B) over the years.
8. Build a report showing: university name, number of students, number of students who got jobs, average salary, and job offer rate (%).
9. Identify majors with the highest average salaries after graduation.
10. Compare visa approval trends: How have F1, OPT, and H1B approval rates changed from 2015 to 2023?
11. Create a view showing: Students with highest probability of getting a job based on major, university, and degree level.
12. Predict (with SQL logic): If a new student graduates from [University X] with [Major Y] and [Degree Level Z], what is their expected salary range?
13. Cohort Analysis: Analyze students who graduated in a particular year, how many got jobs within 6 months.
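To make a couple of the SQL ideas above concrete, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The table names, column names (`student_id`, `status`, `salary`, and so on), and sample rows are assumptions for illustration; the actual CSV headers are not given in the description and may differ.

```python
import sqlite3

# In-memory database with hypothetical table and column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    university TEXT,
    major TEXT,
    degree_level TEXT,
    status TEXT
);
CREATE TABLE job_outcomes (
    student_id INTEGER,
    company TEXT,
    salary REAL
);
""")

# Made-up sample rows, only to make the queries runnable.
conn.executemany("INSERT INTO students VALUES (?, ?, ?, ?, ?)", [
    (1, "University X", "Computer Science", "Masters", "Graduated"),
    (2, "University X", "Data Science", "Masters", "Enrolled"),
    (3, "University Y", "Computer Science", "PhD", "Graduated"),
])
conn.executemany("INSERT INTO job_outcomes VALUES (?, ?, ?)", [
    (1, "Company A", 100000.0),
    (3, "Company B", 120000.0),
])

# Count how many students are currently enrolled vs graduated.
status_counts = conn.execute(
    "SELECT status, COUNT(*) FROM students GROUP BY status ORDER BY status"
).fetchall()

# Identify majors with the highest average salaries after graduation.
avg_salary_by_major = conn.execute("""
    SELECT s.major, AVG(j.salary) AS avg_salary
    FROM students AS s
    JOIN job_outcomes AS j ON j.student_id = s.student_id
    GROUP BY s.major
    ORDER BY avg_salary DESC
""").fetchall()

print(status_counts)
print(avg_salary_by_major)
```

With the real files, the same queries would run after importing each CSV into a table and adjusting the column names to match the actual headers.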
⚡ Important Note This dataset is synthetic but designed to be realistic based on trends among Indian students studying abroad. No real personal information is included. Great for educational, research, and portfolio purposes.
🔖 Acknowledgment Generated by Pushkar Joshi using simulated data sources. Inspired by real-world patterns and publicly available educational statistics.
🏷️ Suggested Tags
The proportion of male and female postsecondary graduates, by Classification of Instructional Programs, Primary groupings (CIP_PG), International Standard Classification of Education (ISCED) and age group.