We know that students at elite universities tend to be from high-income families, and that graduates are more likely to end up in high-status or high-income jobs. But very little public data has been available on university admissions practices. This dataset, collected by Opportunity Insights, gives extensive detail on college application and admission rates for 139 colleges and universities across the United States, including data on the incomes of students. How do admissions practices vary by institution, and are wealthy students overrepresented?
Education equality is one of the most contested topics in society today. It can be defined and explored in many ways, from accessible education for disabled, low-income, and rural students to the cross-generational influence of doctorate degrees and tenure track positions. One aspect of equality is the institutions students attend. Consider the “Ivy Plus” universities, which are all eight Ivy League schools plus MIT, Stanford, Duke, and Chicago. Although less than half of one percent of Americans attend Ivy-Plus colleges, they account for more than 10% of Fortune 500 CEOs, a quarter of U.S. Senators, half of all Rhodes scholars, and three-fourths of Supreme Court justices appointed in the last half-century.
A 2023 study (Chetty et al., 2023) tried to understand how these elite institutions affect educational equality:
Do highly selective private colleges amplify the persistence of privilege across generations by taking students from high-income families and helping them obtain high-status, high-paying leadership positions? Conversely, to what extent could such colleges diversify the socioeconomic backgrounds of society’s leaders by changing their admissions policies?
To answer these questions, they assembled a dataset documenting the admission and attendance rate for 13 different income bins for 139 selective universities around the country. They were able to access and link not only student SAT/ACT scores and high school grades, but also parents’ income through their tax records, students’ post-college graduate school enrollment or employment (including earnings, employers, and occupations), and also for some selected colleges, their internal admission ratings for each student. This dataset covers students in the entering classes of 2010–2015, or roughly 2.4 million domestic students.
They found that children from families in the top 1% (by income) are more than twice as likely to attend an Ivy-Plus college as those from middle-class families with comparable SAT/ACT scores. Two-thirds of this gap can be attributed to higher admission rates with similar scores, with the remaining third due to differences in rates of application and matriculation (enrollment conditional on admission). This is not a shocking conclusion, but we can further explore elite college admissions by socioeconomic status to understand the differences between the admission practices of elite private colleges and public flagships, to reflect on the privilege we have here, and to envision what a fairer higher education system could look like.
The data has been aggregated by university and by parental income level, grouped into 13 income brackets. The income brackets are grouped by percentile relative to the US national income distribution, so for instance the 75.0 bin represents parents whose incomes are between the 70th and 80th percentile. The top two bins overlap: the 99.4 bin represents parents between the 99 and 99.9th percentiles, while the 99.5 bin represents parents in the top 1%.
Each row represents students’ admission and matriculation outcomes from one income bracket at a given university. There are 139 colleges covered in this dataset.
The variables include an array of different college-level-income-binned estimates for things including attendance rate (both raw and reweighted by SAT/ACT scores), application rate, and relative attendance rate conditional on application, also with respect to specific test score bands for each college and in/out-of state. Colleges are categorized into six tiers: Ivy Plus, other elite schools (public and private), highly selective public/private, and selective public/private, with selectivity generally in descending order. It also notes whether a college is public and/or flagship, where “flagship” means public flagship universities. Furthermore, they also report the relative application rate for each income bin within specific test bands, which are 50-point bands that had the most attendees in each school tier/category.
Several values are reported in “test-score-reweighted” form. These values control for SAT score: they are calculated separately for each SAT score value, then averaged with weights based on the distribution of SAT scores at the institution.
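The reweighting described above can be sketched in a few lines of Python. This is only an illustration of the idea (a per-score rate averaged with institution-level score weights), not the study's actual procedure: the input layout, the function name `reweighted_rate`, and the renormalisation over observed scores are all assumptions made for this example.

```python
from collections import defaultdict

def reweighted_rate(records, score_weights):
    """Average per-score rates using the institution's score distribution.

    records: iterable of (sat_score, attended) pairs for one income bin,
             where attended is 0 or 1 (hypothetical input layout).
    score_weights: dict mapping sat_score -> share of the institution's
                   students with that score (assumed to sum to 1).
    """
    totals = defaultdict(lambda: [0, 0])  # score -> [attendees, count]
    for score, attended in records:
        totals[score][0] += attended
        totals[score][1] += 1

    weighted_sum = 0.0
    weight_used = 0.0
    for score, weight in score_weights.items():
        if score not in totals:
            continue  # no students observed at this score in this bin
        attendees, count = totals[score]
        weighted_sum += weight * (attendees / count)
        weight_used += weight
    # Assumption: renormalise over the scores that were actually observed.
    return weighted_sum / weight_used if weight_used else float("nan")

# Made-up toy data: two score values, four students in one income bin.
records = [(1400, 1), (1400, 0), (1500, 1), (1500, 1)]
weights = {1400: 0.25, 1500: 0.75}
rate = reweighted_rate(records, weights)
```

In this toy case the raw rate is 3 out of 4 students, while the reweighted rate gives more weight to the 1500 group because that score is more common at the institution.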
Note that since private schools typically don’t differentiate between in-...
There were approximately 18.58 million college students in the U.S. in 2022, with around 13.49 million enrolled in public colleges and a further 5.09 million students enrolled in private colleges. The figures are projected to remain relatively constant over the next few years.
What is the most expensive college in the U.S.?
The overall number of higher education institutions in the U.S. totals around 4,000, and California is the state with the most. One important factor that students – and their parents – must consider before choosing a college is cost. With annual expenses totaling almost 78,000 U.S. dollars, Harvey Mudd College in California was the most expensive college for the 2021-2022 academic year. There are three major costs of college: tuition, room, and board. The difference in on-campus and off-campus accommodation costs is often negligible, but they can change greatly depending on the college town.
The differences between public and private colleges
Public colleges, also called state colleges, are mostly funded by state governments. Private colleges, on the other hand, are not funded by the government but by private donors and endowments. Typically, private institutions are much more expensive. Public colleges tend to offer different tuition fees for students based on whether they live in-state or out-of-state, while private colleges have the same tuition cost for every student.
https://en.wikipedia.org/wiki/Public_domain
The Colleges and Universities feature class/shapefile is composed of all Post Secondary Education facilities as defined by the Integrated Postsecondary Education Data System (IPEDS, http://nces.ed.gov/ipeds/), National Center for Education Statistics (NCES, https://nces.ed.gov/), US Department of Education for the 2018-2019 school year. Included are Doctoral/Research Universities, Masters Colleges and Universities, Baccalaureate Colleges, Associates Colleges, Theological seminaries, Medical Schools and other health care professions, Schools of engineering and technology, business and management, art, music, design, Law schools, Teachers colleges, Tribal colleges, and other specialized institutions. Overall, this data layer covers all 50 states, as well as Puerto Rico and other assorted U.S. territories. This feature class contains all MEDS/MEDS+ as approved by the National Geospatial-Intelligence Agency (NGA) Homeland Security Infrastructure Program (HSIP) Team. Complete field and attribute information is available in the "Entities and Attributes" metadata section. Geographical coverage is depicted in the thumbnail above and detailed in the "Place Keyword" section of the metadata. This feature class does not have a relationship class but is related to Supplemental Colleges. Colleges and Universities that are not included in the NCES IPEDS data are added to the Supplemental Colleges feature class when found. This release includes the addition of 175 new records, the removal of 468 no longer reported by NCES, and modifications to the spatial location and/or attribution of 6682 records.
https://creativecommons.org/publicdomain/zero/1.0/
It's no secret that US university students often graduate with debt repayment obligations that far outstrip their employment and income prospects. While it's understood that students from elite colleges tend to earn more than graduates from less prestigious universities, the finer relationships between future income and university attendance are quite murky. In an effort to make educational investments less speculative, the US Department of Education has matched information from the student financial aid system with federal tax returns to create the College Scorecard dataset.
Kaggle is hosting the College Scorecard dataset in order to facilitate shared learning and collaboration. Insights from this dataset can help make the returns on higher education more transparent and, in turn, more fair.
Here's a script showing an exploratory overview of some of the data.
college-scorecard-release-*.zip contains a compressed version of the same data available through Kaggle Scripts.
It consists of three components:
New to data exploration in R? Take the free, interactive DataCamp course, "Data Exploration With Kaggle Scripts," to learn the basics of visualizing data with ggplot. You'll also create your first Kaggle Scripts along the way.
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de450955
Abstract (en): The American College Catalog Study Database (CCS) contains academic data on 286 four-year colleges and universities in the United States. CCS is one of two databases produced by the Colleges and Universities 2000 project based at the University of California-Riverside. The CCS database comprises a sampled subset of institutions from the related Institutional Data Archive (IDA) on American Higher Education (ICPSR 34874). Coding for CCS was based on college catalogs obtained from College Source, Inc. The data are organized in a panel design, with measurements taken at five-year intervals: academic years 1975-76, 1980-81, 1985-86, 1990-91, 1995-96, 2000-01, 2005-06, and 2010-11. The database is based on information reported in each institution's college catalog, and includes data regarding changes in major academic units (schools and colleges), departments, interdisciplinary programs, and general education requirements. For schools and departments, changes in structure were coded, including new units, name changes, splits in units, units moved to new schools, reconstituted units, consolidated units, departments reduced to program status, and eliminated units. The American College Catalog Study Database (CCS) is intended to allow researchers to examine changes in the structure of institutionalized knowledge in four-year colleges and universities within the United States. For information on the study design, including detailed coding conventions, please see the Original P.I. Documentation section of the ICPSR Codebook. The data are not weighted. Dataset 1, Characteristics Variables, contains three weight variables (IDAWT, CCSWT, and CASEWEIGHT) which users may wish to apply during analysis. For additional information on weights, please see the Original P.I. Documentation section of the ICPSR Codebook. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. 
ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: Checked for undocumented or out-of-range codes. Response Rates: Approximately 75 percent of IDA institutions are included in CCS. For additional information on response rates, please see the Original P.I. Documentation section of the ICPSR Codebook. Universe: Four-year not-for-profit colleges and universities in the United States. Smallest Geographic Unit: state. CCS includes 286 institutions drawn from the IDA sample of 384 United States four-year colleges and universities. CCS contains every IDA institution for which a full set of catalogs could be located at the initiation of the project in 2000. CCS contains seven datasets that can be linked through an institutional identification number variable (PROJ_ID). Since the data are organized in a panel format, it is also necessary to use a second variable (YEAR) to link datasets. For a brief description of each CCS dataset, please see Appendix B within the Original P.I. Documentation section of the ICPSR Codebook. There are date discrepancies between the data and the Original P.I. Documentation. Study Time Periods and Collection Dates reflect dates that are present in the data. No additional information was provided. Please note that the related data collection featuring the Institutional Data Archive on American Higher Education, 1970-2011, will be available as ICPSR 34874. Additional information on the American College Catalog Study Database (CCS) and the Institutional Data Archive (IDA) database can be found on the Colleges and Universities 2000 Web site.
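Because the CCS datasets are linked through PROJ_ID together with YEAR, a join needs both keys. The following is a minimal sketch using plain Python dictionaries; only `PROJ_ID` and `YEAR` come from the description above, while the other field names, the sample values, and the assumption of one row per institution-year in the second dataset are hypothetical.

```python
def link_panels(left_rows, right_rows):
    """Inner-join two lists of dict rows on the (PROJ_ID, YEAR) pair."""
    # Assumption: right_rows has at most one row per (PROJ_ID, YEAR).
    index = {(r["PROJ_ID"], r["YEAR"]): r for r in right_rows}
    linked = []
    for row in left_rows:
        match = index.get((row["PROJ_ID"], row["YEAR"]))
        if match is not None:
            linked.append({**row, **match})
    return linked

# Made-up rows; n_schools and n_departments are hypothetical variables.
characteristics = [
    {"PROJ_ID": 1, "YEAR": 1975, "n_schools": 4},
    {"PROJ_ID": 1, "YEAR": 1980, "n_schools": 5},
]
departments = [
    {"PROJ_ID": 1, "YEAR": 1980, "n_departments": 30},
    {"PROJ_ID": 2, "YEAR": 1980, "n_departments": 12},
]
linked = link_panels(characteristics, departments)
```

Joining on PROJ_ID alone would pair every year of one institution with every year of the other dataset, which is why the panel layout calls for the second key.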
Data product is provided by ASL Marketing. It contains current college students who are attending colleges and universities nationwide. Connect with this market by:
- Class Year
- Field of Study
- Home/School address
- College Attending
- Ethnicity
- School Type
- Region
- Sports Conference
- Gender
- eSports
- Email
This study was designed to collect college student victimization data to satisfy four primary objectives: (1) to determine the prevalence and nature of campus crime, (2) to help the campus community more fully assess crime, perceived risk, fear of victimization, and security problems, (3) to aid in the development and evaluation of location-specific and campus-wide security policies and crime prevention measures, and (4) to make a contribution to the theoretical study of campus crime and security. Data for Part 1, Student-Level Data, and Part 2, Incident-Level Data, were collected from a random sample of college students in the United States using a structured telephone interview modeled after the redesigned National Crime Victimization Survey administered by the Bureau of Justice Statistics. Using stratified random sampling, over 3,000 college students from 12 schools were interviewed. Researchers collected detailed information about the incident and the victimization, and demographic characteristics of victims and nonvictims, as well as data on self-protection, fear of crime, perceptions of crime on campus, and campus security measures. For Part 3, School Data, the researchers surveyed campus officials at the sampled schools and gathered official data to supplement institution-level crime prevention information obtained from the students. Mail-back surveys were sent to directors of campus security or campus police at the 12 sampled schools, addressing various aspects of campus security, crime prevention programs, and crime prevention services available on the campuses. Additionally, mail-back surveys were sent to directors of campus planning, facilities management, or related offices at the same 12 schools to obtain information on the extent and type of planning and design actions taken by the campus for crime prevention. Part 3 also contains data on the characteristics of the 12 schools obtained from PETERSON'S GUIDE TO FOUR-YEAR COLLEGES (1994). 
Part 4, Census Data, is comprised of 1990 Census data describing the census tracts in which the 12 schools were located and all tracts adjacent to the schools. Demographic variables in Part 1 include year of birth, sex, race, marital status, current enrollment status, employment status, residency status, and parents' education. Victimization variables include whether the student had ever been a victim of theft, burglary, robbery, motor vehicle theft, assault, sexual assault, vandalism, or harassment. Students who had been victimized were also asked the number of times victimization incidents occurred, how often the police were called, and if they knew the perpetrator. All students were asked about measures of self-protection, fear of crime, perceptions of crime on campus, and campus security measures. For Part 2, questions were asked about the location of each incident, whether the offender had a weapon, a description of the offense and the victim's response, injuries incurred, characteristics of the offender, and whether the incident was reported to the police. For Part 3, respondents were asked about how general campus security needs were met, the nature and extent of crime prevention programs and services available at the school (including when the program or service was first implemented), and recent crime prevention activities. Campus planners were asked if specific types of campus security features (e.g., emergency telephone, territorial markers, perimeter barriers, key-card access, surveillance cameras, crime safety audits, design review for safety features, trimming shrubs and underbrush to reduce hiding places, etc.) were present during the 1993-1994 academic year and if yes, how many or how often. 
Additionally, data were collected on total full-time enrollment, type of institution, percent of undergraduate female students enrolled, percent of African-American students enrolled, acreage, total fraternities, total sororities, crime rate of city/county where the school was located, and the school's Carnegie classification. For Part 4, Census data were compiled on percent unemployed, percent having a high school degree or higher, percent of all persons below the poverty level, and percent of the population that was Black.
This dataset contains data on credits registered and earned by students in designated Early College programs since school year 2021-22. Early College is a program that designates partnerships between high schools and colleges to support high school students to complete college courses. The list of designated partnerships is available here.
Students are counted in this dataset if they are marked as an Early College student by the district. Credits are counted in this dataset if they are submitted to DHE. The credits are counted based on where they are taken, even if that is an institution of higher ed (IHE) outside of the student's designated Early College partnership. Credits from Fall and Spring semester are counted; summer credits are not counted. Data includes any credits at public IHEs (Wentworth Institute of Technology data have not been submitted as of July 2024), and '22-23 data from private IHEs that are part of an Early College designated partnership.
The dataset is updated in the fall of each year to add in the previous year's credits. Credit counts are suppressed (hidden) for groups in which there are fewer than 6 students to protect student privacy, though those credits are still counted in totals.
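The suppression rule above (hide counts for groups with fewer than 6 students, but keep their credits in the totals) can be illustrated with a short sketch. The field names `group`, `students`, and `credits` and the function name are hypothetical and are not taken from the published file.

```python
SUPPRESSION_THRESHOLD = 6  # groups with fewer than 6 students are hidden

def summarize_credits(groups):
    """Return (per-group rows with suppression applied, unsuppressed total).

    groups: list of dicts with 'group', 'students', 'credits'
            (hypothetical field names for this example).
    """
    # The total still includes credits from suppressed groups.
    total = sum(g["credits"] for g in groups)
    rows = []
    for g in groups:
        visible = g["students"] >= SUPPRESSION_THRESHOLD
        rows.append({"group": g["group"],
                     "credits": g["credits"] if visible else None})
    return rows, total

# Made-up example values.
rows, total = summarize_credits([
    {"group": "A", "students": 10, "credits": 60},
    {"group": "B", "students": 5, "credits": 15},
])
```

One consequence worth noting is that the visible group counts may not add up to the reported total.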
The data here are the same as the credit data in the Early College Dashboard.
This dataset provides the following information for Academic Years 2017-2021, by College and VCCS System:
1) Annual Headcount and FTEs
2) Gender (categories are: Female & Male; Unknown may be inferred)
3) Ethnicity (categories are: American Indian & Alaskan Native, Asian, Black & African-American, Native Hawaiian & Pacific Islander, Hispanic, Two or More Races, Unknown/Not Specified, and White)
4) Age (categories are: 17 and Under, 18-19, 20-21, 22-24, 25-29, 30-34, 35-39, 40-49, 50-64, & 65 and Over)
5) 18-Month Outcomes for Dual-Enrolled High School Grads by Year (categories are: Total Grads, Continued in any Higher Ed program, Employed with no Higher Ed, and Unknown)
6) 18-Month Outcomes for VCCS Graduates by Year (categories are: Total Grads, Continued at VCCS, Transferred to a 4yr college, Employed with no Higher Ed, and Unknown)
For Fiscal Years 2018-2021, by Service Area and VCCS System:
1) Fast Forward Credentialers Employed by Fiscal Year (categories are: Total Distinct Students, Employed within 6 Months, Employed within 12 Months, and Employed within 18 Months)
Notes:
1) Headcounts are Unduplicated student counts.
2) One FTE represents 30 credit hours of classes taken by a student over an academic year and is calculated on an annual basis by taking the total credit hours taught divided by 30.
3) 2017 Fiscal Year Fast Forward data was not included as it was considered incomplete: the Fast Forward program began in 2017 and did not encompass all areas for the entire year.
4) In Workforce (Fast Forward data) the service region for the Richmond Metro Area is called CCWA (Community College Workforce Alliance) and combines data for Brightpoint and J Sargeant Reynolds.
4a) Therefore, there are no Reynolds data entries for Fast Forward variables. All CCWA data is listed under Brightpoint for this portion of the data set.
5) 18-Month Outcomes for Fast Forward Credentialers are cumulative (6 months to 12 months to 18 months)
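The FTE definition in note 2 is a simple division, sketched below. The function name and the sample credit-hour figure are made up for illustration.

```python
CREDIT_HOURS_PER_FTE = 30  # one FTE = 30 credit hours over an academic year

def annual_fte(total_credit_hours):
    """Annual FTE: total credit hours taught divided by 30."""
    return total_credit_hours / CREDIT_HOURS_PER_FTE

# Made-up example: 4,500 credit hours taught in one academic year.
fte = annual_fte(4500)
```

Since headcounts are unduplicated student counts while FTE is based on credit hours, a college with many part-time students would be expected to show a headcount well above its FTE.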
By Education [source]
Welcome to the U.S. News & World Report's 2017 National Universities Rankings, a comprehensive dataset of over 1,800 schools across the United States providing quality data on admissions criteria, cost of tuition and fees, enrollment numbers, and overall rankings! Here you'll find up-to-date information on institutes of higher learning from Princeton University at the top spot in Best National Universities to Williams College at No. 1 on the Best National Liberal Arts Colleges list.
This collection of data is all that's needed for potential students - parents, counselors and more - to evaluate their choices in selecting a college or university that perfectly meets their needs. For instance: what is the total tuition & fees cost? What are student enrollment numbers? How have students rated this school? Which universities have been recognized as top institutions in academics by U.S. News & World Report? What admissions criteria do these schools evaluate when considering an applicant's profile? The answers lie within this dataset!
Explore each category on its own or alongside others using visuals such as our scatter plot, which charts enrollment patterns against yearly expenses, including room and board charges. Crucial factors such as six-year graduation rates and freshman retention rates are also measured for the nation's universities included here, allowing for comparison and assessment beforehand so that you can find your own path ahead!
This dataset contains information on the quality, tuition, and enrollment data of 1,800 U.S.-based universities ranked by U.S. News & World Report from 2017. It includes rankings from the National University and Liberal Arts College lists in addition to relevant data points like tuition fees and undergraduate enrollments for each school.
Users can take advantage of this dataset to build models that predict ranking or cost-benefit results for students by using cost-related (tuition) metrics along with quality metrics (rankings). Alternatively, users can use it to analyze trends between investments in higher education and outcomes (ranking), or explore the relationship between enrollments for schools of varying rank tiers.
For more information on how rankings are calculated, please refer to the methodology explainer on the U.S. News website.
Here is an overview of all columns included in this dataset:
Columns:
- Name: institution name
- Location: city and state where located
- Rank: ranking according to U.S. News & World Report
- Description: snippet of text overview from U.S. News
- Tuition and fees: combined tuition and fees for out-of-state students
- In-state: tuition and fees for in-state students
- Undergraduate Enrollment: number of enrolled undergraduate students
Using these columns as a guide, we can answer questions such as ‘Which colleges give the highest ROI?’ or ‘Which college has the highest number of undergraduates?’ For statistical analysis such as correlation, visual representations such as scatter plots or bar graphs make it easier to analyze trends within the dataset and to explore relationships between factors such as tuition and rank.
- Developing a searchable database to help high school students identify colleges that match their criteria in terms of tuition, graduation rate, location, and rank.
- Identifying correlations between enrollment numbers and university rank in order to better understand how the number of enrolled students affects the overall ranking of a university.
- Comparing universities with similar rankings in order to highlight differences between programs’ tuition and fees as well as retention rates.
If you use this dataset in your research, please credit the original authors. Data Source
License: Dataset copyright by authors - You are free to: - Share - copy and redistribute the material in any medium or format for any purpose, even commercially. - Adapt - remix, ...
This dataset explores the residence and migration of all freshmen students in degree-granting institutions who graduated from high school in the previous 12 months, by state: Fall 2004. NOTE: Includes all first-time postsecondary students enrolled at reporting institutions. Degree-granting institutions grant associate's or higher degrees and participate in Title IV federal financial aid programs. SOURCE: U.S. Department of Education, National Center for Education Statistics, Integrated Postsecondary Education Data System (IPEDS), Spring 2005. (This table was prepared September 2005.) http://nces.ed.gov/programs/digest/d06/tables/dt06_208.asp Accessed on 12 November 2007.
A dataset for exploring how an applicant's different parameters (e.g. GRE score, SOP, CGPA) may affect the admission decision.
The dataset was collected from www.kaggle.com/mohansacharya/graduate-admissions
The National Survey of College Graduates is a repeated cross-sectional biennial survey that provides data on the nation's college graduates, with a focus on those in the science and engineering workforce. This survey is a unique source for examining the relationship of degree field and occupation in addition to other characteristics of college-educated individuals, including work activities, salary, and demographic information.
https://www.icpsr.umich.edu/web/ICPSR/studies/37932/terms
The Higher Education Randomized Controlled Trial (THE-RCT) study aims to capitalize on existing data from postsecondary education RCTs to foster substantive and methodological scholarship and encourage teaching and learning opportunities. The cornerstone of THE-RCT is a restricted access file (RAF). The initial version contains individual-participant data from more than 25 of MDRC's higher education RCTs covering 50 institutions and over 50,000 students. The data were originally collected as part of different randomized controlled trial evaluations of a variety of higher education interventions. The data were collected for different student samples, at different times, and in different locations for each study. The data were collected from four data sources: 1. Baseline: Baseline student demographic data (e.g., gender, race/ethnicity, age, etc.) were gathered, either via a survey administered to students upon joining the study (but prior to random assignment) or from study colleges' administrative records; 2. College Transcript: Student transcript data (e.g., enrollment, credits attempted, credits earned, GPA) were provided by the study colleges or state higher education agencies; 3. College Credential Attainment: Student credential attainment data were provided by the study colleges or state higher education agencies; 4. National Student Clearinghouse: Student enrollment and credential attainment data were provided by the National Student Clearinghouse via their StudentTracker database. This includes enrollment and credential attainment data at colleges beyond the colleges where the study took place. The RAF contains student-level data, including baseline demographics (e.g., gender, race/ethnicity), outcomes (e.g., enrollment, credits earned, credentials), an indicator of experimental group (e.g., program or control group), and study variables (e.g., a variable that allows users to link to the RCT-level database).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains anonymized responses from undergraduate students in ACBSP-accredited business schools in Mexico, Peru, Ecuador, Paraguay, and Colombia, collected between May 2024 and May 2025. Variables include psychological, cognitive, and contextual measures related to entrepreneurial intention and action.
This dataset contains college enrollment information, by U.S. Census Block Group, for the state of Michigan. College enrollment was defined as the number of public high school students who graduated in 2017, who enrolled in a college or university. This dataset includes enrollment in two-year and four-year institutions of higher education. Click here for metadata (descriptions of the fields).
New York City school level College Board SAT results for the graduating seniors of 2010. Records contain 2010 College-bound seniors mean SAT scores. Records with 5 or fewer students are suppressed (marked ‘s’). College-bound seniors are those students that complete the SAT Questionnaire when they register for the SAT and identify that they will graduate from high school in a specific year. For example, the 2010 college-bound seniors are those students that self-reported they would graduate in 2010. Students are not required to complete the SAT Questionnaire in order to register for the SAT. Students who do not indicate which year they will graduate from high school will not be included in any college-bound senior report. Students are linked to schools by identifying which school they attend when registering for a College Board exam. A student is only included in a school’s report if he/she self-reports being enrolled at that school. Data collected and processed by the College Board.
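When reading these records, suppressed values marked ‘s’ have to be handled before any arithmetic. A minimal sketch follows, assuming the scores arrive as strings; the function names and sample values are hypothetical.

```python
def parse_score(value):
    """Return the mean score as an int, or None if suppressed ('s')."""
    value = value.strip()
    if value.lower() == "s":
        return None  # 5 or fewer students: the value is not published
    return int(value)

def mean_of_reported(values):
    """Unweighted mean over schools whose score is not suppressed."""
    scores = [s for s in map(parse_score, values) if s is not None]
    return sum(scores) / len(scores) if scores else None

# Made-up example: three schools, one of them suppressed.
average = mean_of_reported(["400", "s", "500"])
```

Note that this averages school-level means without weighting by the number of test takers, so it is not a citywide student average.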
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset was collected as part of a cross-sectional study examining factors influencing student success at a private Christian university in the South-Central U.S. The dataset includes responses from 963 first-year students (ages 18-25) who met inclusion criteria, providing comprehensive data on academic performance, personality traits, lifestyle habits, physical health, and socioeconomic status.
Data Components:
- Academic Performance: GPA (university records)
- Self-Reported Surveys: Personality traits, lifestyle habits, stress levels
- Physical Health Indicators: BMI, resting heart rate, daily step counts
- Socioeconomic Status: Estimated via Expected Family Contribution (EFC)
This dataset offers valuable insights into the relationships between academic achievement, well-being, and socioeconomic factors in higher education.
Bytemine offers a comprehensive Alumni Contact Data solution covering all major colleges and universities across the United States. Designed for teams that need accurate, scalable, and multi-dimensional alumni information, our database includes both historical and current alumni records enriched with verified contact details and career insights.
Whether you’re building alumni engagement programs, fundraising campaigns, talent pipelines, or marketing outreach initiatives, our alumni dataset enables you to connect with graduates at every stage of their career. Each contact record is enriched with key data points such as name, graduation year, degree, university attended, current job title, employer, location, work email, personal email, mobile number, and more.
Bytemine’s alumni data includes:
- Coverage across all US colleges and universities
- Verified work and personal emails
- Direct mobile numbers
- Historical alumni data for past graduates
- Current data for recent graduates and active professionals
- Job title, employer, industry, seniority, and location data
- Education details including major, degree, and graduation year
This dataset is ideal for:
- University alumni relations and advancement teams
- Fundraising and donor development
- Career services and job placement teams
- Recruiting firms and HR tech platforms
- SaaS tools and marketplaces building alumni networks
- Marketing and sales teams targeting professionals by educational background
- EdTech and continuing education providers targeting past students
Access is available via a searchable web platform or API, allowing you to query alumni records, filter by graduation year, field of study, employer, location, or other attributes, and export lists for outreach or integration into your CRM or product.
Our data is sourced through direct licensing from educational databases, employment platforms, and verified third-party aggregators. All data is validated for accuracy and updated regularly to reflect current career paths and contact channels. Historical data allows you to connect with long-graduated professionals, while real-time updates ensure you’re reaching alumni in their most recent roles.
Bytemine’s Alumni Contact Data is trusted by educational institutions, nonprofit organizations, recruitment firms, and technology platforms looking to strengthen alumni engagement, drive fundraising success, and expand outreach to educated professionals.
Whether you need to build an alumni community, launch a fundraising campaign, or recruit from a targeted university background, Bytemine provides the data infrastructure to make it happen.
https://creativecommons.org/publicdomain/zero/1.0/
📌 Description
- The SAT is a standardized test administered by the College Board and widely used for college admissions in the United States.
- The source dataset gives the mean SAT math and verbal scores for males (M), for females (F), and for all students (A) for the years 1967 to 2001.
- I have added the last three columns for verbal+math averages: for males, for females, and for all students.
Column | Description |
---|---|
Year | The years 1967 to 2001. |
M_verbal | Verbal scores for males. |
F_verbal | Verbal scores for females. |
M_math | Math scores for males. |
F_math | Math scores for females. |
A_verbal | Verbal scores for all students. |
A_math | Math scores for all students. |
M_averages | Average [Verbal+Math] scores for males. |
F_averages | Average [Verbal+Math] scores for females. |
A_averages | Average [Verbal+Math] scores for all students. |
🎯 Objective:
- To compare scores by year.
- To compare scores by gender.
- To compare students' performance in verbal and math.
📦 Source: The College Board
📥 Download TSV source file: SATbyYear.tsv
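As a rough illustration of the "compare scores by gender" objective, the sketch below computes the male minus female gap in mean math score for each year. It assumes the TSV has a header row using the column names from the table above; the sample rows are made-up placeholders, not values from SATbyYear.tsv, and `math_gap_by_year` is a hypothetical helper name.

```python
import csv
import io

# Placeholder rows in the documented column layout.
# The scores are made up for illustration, not taken from the real file.
SAMPLE_TSV = (
    "Year\tM_verbal\tF_verbal\tM_math\tF_math\n"
    "1967\t500\t490\t520\t480\n"
    "1968\t498\t492\t518\t482\n"
)


def math_gap_by_year(tsv_text):
    """Return {year: M_math - F_math} for every row of the TSV text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        int(row["Year"]): float(row["M_math"]) - float(row["F_math"])
        for row in reader
    }


gaps = math_gap_by_year(SAMPLE_TSV)
for year, gap in sorted(gaps.items()):
    print(year, gap)
```

To run this against the real file, the contents of the downloaded TSV could be read into a string and passed in place of `SAMPLE_TSV`; the same pattern works for the verbal columns.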
We know that students at elite universities tend to be from high-income families, and that graduates are more likely to end up in high-status or high-income jobs. But very little public data has been available on university admissions practices. This dataset, collected by Opportunity Insights, gives extensive detail on college application and admission rates for 139 colleges and universities across the United States, including data on the incomes of students. How do admissions practices vary by institution, and are wealthy students overrepresented?
Education equality is one of the most contested topics in society today. It can be defined and explored in many ways, from making education accessible to disabled, low-income, or rural students to the cross-generational influence of doctorate degrees and tenure-track positions. One aspect of equality is the institutions students attend. Consider the “Ivy Plus” universities, which are all eight Ivy League schools plus MIT, Stanford, Duke, and Chicago. Although less than half of one percent of Americans attend Ivy-Plus colleges, they account for more than 10% of Fortune 500 CEOs, a quarter of U.S. Senators, half of all Rhodes scholars, and three-fourths of Supreme Court justices appointed in the last half-century.
A 2023 study (Chetty et al., 2023) tried to understand how these elite institutions affect educational equality:
Do highly selective private colleges amplify the persistence of privilege across generations by taking students from high-income families and helping them obtain high-status, high-paying leadership positions? Conversely, to what extent could such colleges diversify the socioeconomic backgrounds of society’s leaders by changing their admissions policies?
To answer these questions, they assembled a dataset documenting the admission and attendance rate for 13 different income bins for 139 selective universities around the country. They were able to access and link not only student SAT/ACT scores and high school grades, but also parents’ income through their tax records, students’ post-college graduate school enrollment or employment (including earnings, employers, and occupations), and also for some selected colleges, their internal admission ratings for each student. This dataset covers students in the entering classes of 2010–2015, or roughly 2.4 million domestic students.
They found that children from families in the top 1% (by income) are more than twice as likely to attend an Ivy-Plus college as those from middle-class families with comparable SAT/ACT scores. Two-thirds of this gap can be attributed to higher admission rates for applicants with similar scores; the remaining third is due to differences in rates of application and matriculation (enrollment conditional on admission). This is not a shocking conclusion, but we can use the data to explore elite college admissions by socioeconomic status further: to understand how admission practices differ between elite private colleges and public flagships, to reflect on the privilege we have here, and to envision what a fairer higher education system could look like.
The data has been aggregated by university and by parental income level, grouped into 13 income brackets. The brackets are defined by percentile relative to the US national income distribution, so for instance the 75.0 bin represents parents whose incomes are between the 70th and 80th percentiles. The top two bins overlap: the 99.4 bin represents parents between the 99th and 99.9th percentiles, while the 99.5 bin represents parents in the top 1%.
Each row represents students’ admission and matriculation outcomes from one income bracket at a given university. There are 139 colleges covered in this dataset.
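Because the 99.4 and 99.5 bins overlap, adding up quantities across all bins could count some families twice. The sketch below shows one way to guard against that. It is only an illustration: the field name `income_bin`, the `count` field, and the helper `non_overlapping` are hypothetical, the rows are made up, and only the three bins spelled out above are labeled.

```python
# Labels for the three bins described in the text; the others are
# deliberately left out rather than guessed.
KNOWN_BINS = {
    75.0: "70th to 80th percentile",
    99.4: "99th to 99.9th percentile",
    99.5: "top 1%",
}

# The top-1% bin overlaps the 99.4 bin, per the description above.
OVERLAPPING_BIN = 99.5


def non_overlapping(rows):
    """Drop top-1% rows so sums across bins do not double count."""
    return [row for row in rows if row["income_bin"] != OVERLAPPING_BIN]


# Made-up rows for one hypothetical college; "count" is an invented field.
rows = [
    {"income_bin": 75.0, "count": 10},
    {"income_bin": 99.4, "count": 4},
    {"income_bin": 99.5, "count": 5},
]

kept = non_overlapping(rows)
total = sum(row["count"] for row in kept)
print(total)
```

Whether to drop the 99.5 rows or the rows it overlaps depends on the question being asked; the point is only that both should not be summed together.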
The variables include a range of estimates reported for each college and income bin: attendance rate (both raw and reweighted by SAT/ACT scores), application rate, and relative attendance rate conditional on application. Some of these are also reported for specific test score bands at each college and separately for in-state and out-of-state students. Colleges are categorized into six tiers: Ivy Plus, other elite schools (public and private), highly selective public/private, and selective public/private, with selectivity generally in descending order. The data also notes whether a college is public and/or flagship, where “flagship” means public flagship universities. Furthermore, the relative application rate is reported for each income bin within specific test bands, which are the 50-point bands that had the most attendees in each school tier/category.
Several values are reported in “test-score-reweighted” form. These values control for SAT score: they are calculated separately for each SAT score value, then averaged with weights based on the distribution of SAT scores at the institution.
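The reweighting described above amounts to a weighted average, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual procedure: `reweighted_rate` is a hypothetical helper, the rates and weights are made-up numbers, and scores missing from either input are simply skipped.

```python
def reweighted_rate(rate_by_score, weight_by_score):
    """Average per-score rates, weighted by the institution's SAT distribution.

    rate_by_score:   {sat_score: rate computed for one income bin at that score}
    weight_by_score: {sat_score: share or count of students at the institution}
    """
    shared = [score for score in weight_by_score if score in rate_by_score]
    total_weight = sum(weight_by_score[score] for score in shared)
    if total_weight == 0:
        raise ValueError("no overlapping SAT scores with positive weight")
    weighted_sum = sum(
        rate_by_score[score] * weight_by_score[score] for score in shared
    )
    return weighted_sum / total_weight


# Made-up illustrative numbers, not values from the dataset.
rate_by_score = {1400: 0.10, 1500: 0.20, 1600: 0.40}
weight_by_score = {1400: 1, 1500: 2, 1600: 1}

reweighted = reweighted_rate(rate_by_score, weight_by_score)
print(round(reweighted, 3))
```

Using the same weights for every income bin is what makes the resulting rates comparable: differences between bins then reflect differences at equal test scores rather than differences in the score distributions.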
Note that since private schools typically don’t differentiate between in-...