MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides a small collection of information about universities and institutions from different countries. Each record includes the following details:
The dataset can be used for educational research, location-based analysis of educational institutions, or as part of a larger database of global universities.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is the perfect resource for researchers who need access to a comprehensive list of universities worldwide. The data includes each university's .edu domain name, making it easy to identify a college based on an email address. With this data, you can investigate trends in higher education, compare institutions across the globe, and more!
This dataset can be used to research a variety of topics related to universities around the world. Some possible topics of research include:
- The number of universities in each country
- The types of degrees offered by universities around the world
- The tuition cost of attending university in different countries
- The average SAT/ACT score requirements for admission to universities in different countries
- This dataset can be used to create a web application that allows users to search for universities by country code.
- This dataset can be used to create a script that automatically generates .edu email addresses for users.
This dataset was compiled from various sources, including the List of Universities in the World (https://en.wikipedia.org/wiki/List_of_universities_in_the_world) and the .edu website directory.
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: world-universities.csv
| Column name | Description |
|:--------------------------|:----------------------------------------|
| AD | The country code for Andorra. (String) |
| University of Andorra | The name of the university. (String) |
| http | The website of the university. (String) |
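The column names listed above look like the values of the file's first record, which suggests that world-universities.csv has no header row. The following is a minimal sketch under that assumption. The field names `country_code`, `name`, and `website` are hypothetical labels chosen for illustration, and the sample rows are placeholders rather than real records from the file. It counts universities per country, one of the research topics mentioned above.

```python
import csv
import io
from collections import Counter

# Hypothetical field names: the file itself is assumed to have no header row.
FIELDNAMES = ["country_code", "name", "website"]

# Placeholder rows in the assumed layout (not taken from the real file).
SAMPLE = """AD,University of Andorra,http://example.org/andorra
XX,Example University A,http://example.org/a
XX,Example University B,http://example.org/b
"""


def read_universities(handle):
    """Read headerless CSV rows into dicts keyed by FIELDNAMES."""
    return list(csv.DictReader(handle, fieldnames=FIELDNAMES))


def count_by_country(records):
    """Count how many records share each country code."""
    return Counter(record["country_code"] for record in records)


records = read_universities(io.StringIO(SAMPLE))
counts = count_by_country(records)
print(counts.most_common())
```

To run this against the real file, replace `io.StringIO(SAMPLE)` with an open file handle, for example `open("world-universities.csv", newline="", encoding="utf-8")`, after checking that the file really lacks a header row.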
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a compilation of processed data on citations and references for research papers, including their author, institution and open access information, for a selected sample of academics analysed using Microsoft Academic Graph (MAG) data and CORE. The data for this dataset was collected from December 2019 to January 2020.
Six countries (Austria, Brazil, Germany, India, Portugal, United Kingdom and United States) were the focus of the six questions which make up this dataset. There is one csv file per country and per question (36 files in total). More details about the creation of this dataset are available in the public ON-MERRIT D3.1 deliverable report.
The dataset is a combination of two different data sources: one part is a dataset created by analysing promotion policies across the target countries, while the second part is a set of data points available to understand the publishing behaviour. To facilitate the analysis the dataset is organised in the following seven folders:
PRT
The dataset with the file name "PRT_policies.csv" contains the related information as this was extracted from promotion, review and tenure (PRT) policies.
Q1: What % of papers coming from a university are Open Access?
- Dataset Name format: oa_status_countryname_papers.csv
- Dataset Contents: Open Access (OA) status of all papers of all the universities listed in Times Higher Education World University Rankings (THEWUR) for the given country. A paper is marked OA if there is at least one OA link available. OA links are collected using the CORE Discovery API.
- Important considerations about this dataset:
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to.
  - The service we used to recognise if a paper is OA, CORE Discovery, does not contain entries for all paperids in MAG. This implies that some of the records in the dataset extracted will not have either a true or false value for the _is_OA_ field.
  - Only those records marked as true for the _is_OA_ field can be said to be OA. Others with false or no value for the is_OA field are of unknown status (i.e. not necessarily closed access).
Q2: How are papers, published by the selected universities, distributed across the three scientific disciplines of our choice?
- Dataset Name format: fsid_countryname_papers.csv
- Dataset Contents: For the given country, all papers for all the universities listed in THEWUR with the information of the fieldofstudy they belong to.
- Important considerations about this dataset:
  - MAG can associate a paper to multiple fieldofstudyid. If a paper belongs to more than one of our fieldofstudyid, separate records were created for the paper with each of those fieldofstudyids.
  - MAG assigns fieldofstudyid to every paper with a score. We preserve only those records whose score is more than 0.5 for any fieldofstudyid it belongs to.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. Papers with authorship from multiple universities are counted once towards each of the universities concerned.
Q3: What is the gender distribution in authorship of papers published by the universities?
- Dataset Name format: author_gender_countryname_papers.csv
- Dataset Contents: All papers with their author names for all the universities listed in THEWUR.
- Important considerations about this dataset:
  - When there are multiple collaborators (authors) for the same paper, this dataset makes sure that only the records for collaborators from within selected universities are preserved.
  - An external script was executed to determine the gender of the authors. The script is available here.
Q4: Distribution of staff seniority (= number of years from their first publication until the last publication) in the given university.
- Dataset Name format: author_ids_countryname_papers.csv
- Dataset Contents: For a given country, all papers for authors with their publication year for all the universities listed in THEWUR.
- Important considerations about this work:
  - When there are multiple collaborators (authors) for the same paper, this dataset makes sure that only the records for collaborators from within selected universities are preserved.
  - Calculating staff seniority can be achieved in various ways. The most straightforward option is to calculate it as academic_age = MAX(year) - MIN(year) for each authorid.
Q5: Citation counts (incoming) for OA vs Non-OA papers published by the university.
- Dataset Name format: cc_oa_countryname_papers.csv
- Dataset Contents: OA status and OA links for all papers of all the universities listed in THEWUR and, for each of those papers, the count of incoming citations available in MAG.
- Important considerations about this dataset:
  - CORE Discovery was used to establish the OA status of papers.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to.
  - Only those records marked as true for the _is_OA_ field can be said to be OA. Others with false or no value for the is_OA field are of unknown status (i.e. not necessarily closed access).
Q6: Count of OA vs Non-OA references (outgoing) for all papers published by universities.
- Dataset Name format: rc_oa_countryname_-papers.csv
- Dataset Contents: Counts of all OA and unknown papers referenced by all papers published by all the universities listed in THEWUR.
- Important considerations about this dataset:
  - CORE Discovery was used to establish the OA status of papers being referenced.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. Papers with authorship from multiple universities are counted once towards each of the universities concerned.
Additional files:
- _fieldsofstudy_mag_.csv: this file contains a dump of the fieldsofstudy table of MAG, mapping each of the ids to their actual field of study name.
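Two of the calculations described above can be sketched in a few lines. The example below is a minimal illustration, not the ON-MERRIT project's own code. It assumes each CSV row is available as a dict with the keys `authorid`, `year`, and `is_OA` (names taken from the description, though the exact column layout of the files is an assumption), and it uses hypothetical sample rows. It computes academic_age = MAX(year) - MIN(year) per authorid, and it counts OA status so that false and missing values are both treated as unknown rather than closed.

```python
from collections import Counter


def academic_age(rows):
    """Return {authorid: MAX(year) - MIN(year)} over all rows."""
    first, last = {}, {}
    for row in rows:
        author = row["authorid"]
        year = int(row["year"])
        first[author] = min(first.get(author, year), year)
        last[author] = max(last.get(author, year), year)
    return {author: last[author] - first[author] for author in first}


def oa_status_counts(rows):
    """Count rows as 'oa' only when is_OA is true; everything else is 'unknown'."""
    counts = Counter()
    for row in rows:
        value = str(row.get("is_OA", "")).strip().lower()
        counts["oa" if value == "true" else "unknown"] += 1
    return counts


# Hypothetical sample rows; real files would be read with csv.DictReader.
author_rows = [
    {"authorid": "A1", "year": "2010"},
    {"authorid": "A1", "year": "2019"},
    {"authorid": "A2", "year": "2015"},
]
paper_rows = [
    {"paperid": "P1", "is_OA": "true"},
    {"paperid": "P2", "is_OA": "false"},
    {"paperid": "P3", "is_OA": ""},
]

ages = academic_age(author_rows)
status = oa_status_counts(paper_rows)
print(ages, dict(status))
```

Because false and empty values both mean "unknown", any OA percentage derived from these counts is a lower bound on the true OA share rather than an exact figure.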
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about universities. It has 2,021 rows. It features 3 columns, including country and international students. It is 80% filled with non-null values.
This dataset contains information on university rankings from around the world. It includes key details such as university name, country, and global rank.
Upvote if you find it interesting.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides data on the number of students enrolled in public colleges and universities in Qatar, categorized by nationality, country of origin, and gender. The dataset includes students from the Gulf Cooperation Council (G.C.C.) countries, such as Qatar, the United Arab Emirates, Bahrain, Kuwait, Saudi Arabia, and Oman, as well as students from other Arab countries, such as Iraq. This dataset helps in understanding the distribution of students from different countries and genders in Qatar's higher education institutions.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Cost of International Education dataset compiles detailed financial information for students pursuing higher education abroad. It covers multiple countries, cities, and universities around the world, capturing the full spectrum of tuition and living expenses alongside key ancillary costs. With standardized fields such as tuition in USD, living-cost indices, rent, visa fees, insurance, and up-to-date exchange rates, it enables comparative analysis across programs, degree levels, and geographies. Whether you're a prospective international student mapping out budgets, an educational consultant advising on affordability, or a researcher studying global education economics, this dataset offers a comprehensive foundation for data-driven insights.
| Column | Type | Description |
|---|---|---|
| Country | string | ISO country name where the university is located (e.g., "Germany", "Australia"). |
| City | string | City in which the institution sits (e.g., "Munich", "Melbourne"). |
| University | string | Official name of the higher-education institution (e.g., "Technical University of Munich"). |
| Program | string | Specific course or major (e.g., "Master of Computer Science", "MBA"). |
| Level | string | Degree level of the program: "Undergraduate", "Master's", "PhD", or other certifications. |
| Duration_Years | integer | Length of the program in years (e.g., 2 for a typical Master's). |
| Tuition_USD | numeric | Total program tuition cost, converted into U.S. dollars for ease of comparison. |
| Living_Cost_Index | numeric | A normalized index (often based on global city indices) reflecting relative day-to-day living expenses (food, transport, utilities). |
| Rent_USD | numeric | Average monthly student accommodation rent in U.S. dollars. |
| Visa_Fee_USD | numeric | One-time visa application fee payable by international students, in U.S. dollars. |
| Insurance_USD | numeric | Annual health or student insurance cost in U.S. dollars, as required by many host countries. |
| Exchange_Rate | numeric | Local currency units per U.S. dollar at the time of data collection; vital for currency conversion and trend analysis if rates fluctuate. |
Feel free to explore, visualize, and extend this dataset for deeper insights into the true cost of studying abroad!
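As one way to combine the columns above into a single comparable figure, the sketch below estimates a total program cost. The formula is an assumption made for illustration, not something the dataset defines: it treats Tuition_USD as the total program tuition, Rent_USD as a monthly amount, Insurance_USD as an annual amount, and Visa_Fee_USD as a one-time fee, as the column descriptions state. Living_Cost_Index is left out because it is a relative index rather than a dollar amount. The example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProgramCost:
    """One row of the table, reduced to the fields used in the estimate."""
    duration_years: float
    tuition_usd: float      # total program tuition
    rent_usd: float         # average monthly rent
    visa_fee_usd: float     # one-time fee
    insurance_usd: float    # annual cost


def estimated_total_usd(program: ProgramCost) -> float:
    """Tuition + rent for every month + one visa fee + insurance for every year."""
    rent_total = program.rent_usd * 12 * program.duration_years
    insurance_total = program.insurance_usd * program.duration_years
    return (
        program.tuition_usd
        + rent_total
        + program.visa_fee_usd
        + insurance_total
    )


# Hypothetical values, not taken from the dataset.
example = ProgramCost(
    duration_years=2,
    tuition_usd=20000,
    rent_usd=1000,
    visa_fee_usd=100,
    insurance_usd=500,
)
total = estimated_total_usd(example)
print(total)
```

Day-to-day expenses such as food and transport are not covered by this estimate, so it should be read as a partial figure for comparing programs rather than a full budget.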
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about universities in Stillwater. It has 1 row. It features 3 columns, including country and international students.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about universities in Stanford. It has 1 row. It features 3 columns, including country and international students.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The COKI Open Access Dataset measures open access performance for 225 countries and 50,000 institutions and is available in JSON Lines format. The data is visualised at the COKI Open Access Dashboard: https://open.coki.ac/.
The COKI Open Access Dataset is created with the COKI Academic Observatory data collection pipeline, which fetches data about research publications from multiple sources, synthesises the datasets and creates the open access calculations for each country and institution.
Each week a number of specialised research publication datasets are collected. The datasets that are used for the COKI Open Access Dataset release include Crossref Metadata, OpenAlex, Unpaywall and the Research Organization Registry.
After fetching the datasets, they are synthesised to produce aggregate time series statistics for each country and institution in the dataset. The aggregate timeseries statistics include publication count, open access status and citation count.
See https://open.coki.ac/data/ for the dataset schema. A new version of the dataset is deposited every week.
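Since the dataset is distributed as JSON Lines, each line of a file is one standalone JSON object. The sketch below shows how such a file could be parsed and an open access percentage derived per record. The field names `id`, `year`, `n_outputs`, and `n_outputs_open` are hypothetical placeholders and the sample records are invented; the real field names should be taken from the published schema.

```python
import json

# Invented sample records with hypothetical field names.
SAMPLE_JSONL = """\
{"id": "AAA", "year": 2020, "n_outputs": 200, "n_outputs_open": 80}
{"id": "AAA", "year": 2021, "n_outputs": 250, "n_outputs_open": 125}
"""


def read_json_lines(text):
    """Parse JSON Lines text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def open_access_percent(record):
    """Share of open outputs among all outputs, as a percentage."""
    total = record["n_outputs"]
    return 100.0 * record["n_outputs_open"] / total if total else 0.0


records = read_json_lines(SAMPLE_JSONL)
percentages = {record["year"]: open_access_percent(record) for record in records}
print(percentages)
```

For large files, iterating over an open file handle line by line avoids loading the whole dataset into memory.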
Code
The COKI Academic Observatory data collection pipeline is used to create the dataset.
The COKI OA Website Github project contains the code for the web app that visualises the dataset at open.coki.ac. It can be found on Zenodo here.
License
COKI Open Access Dataset © 2022 by Curtin University is licenced under CC BY 4.0.
Attributions
This work contains information from:
OpenAlex which is made available under the CC0 license.
Crossref Metadata via the Metadata Plus program. Bibliographic metadata is made available without copyright restriction and Crossref generated data under a CC0 licence. See metadata licence information for more details.
Unpaywall. The Unpaywall Data Feed is used under license. Data is freely available from Unpaywall via the API, data dumps and as a data feed.
Research Organization Registry which is made available under a CC0 licence.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about universities in Columbus. It has 1 row. It features 3 columns, including country and international students.
Also have a look at
UNIVERSITIES & Research INSTITUTIONS Rank - SCImagoIR
Scientific JOURNALS Indicators & Info - SCImagoJR
The entire dataset is obtained from public and open-access data of ScimagoJR (SCImago Journal & Country Rank)
ScimagoJR Country Rank
SCImagoJR About Us
Documents: Number of documents published during the selected year. It is usually called the country's scientific output.
Citable Documents: Selected year citable documents. Exclusively articles, reviews and conference papers are considered.
Citations: Number of citations received by the documents published during the source year, i.e. citations in years X, X+1, X+2, X+3... to documents published during year X. When referred to the period 1996-2021, all published documents during this period are considered.
Citations per Document: Average citations per document published during the source year, i.e. citations in years X, X+1, X+2, X+3... to documents published during year X. When referred to the period 1996-2021, all published documents during this period are considered.
Self Citations: Country self-citations. Number of self-citations of all dates received by the documents published during the source year, i.e. self-citations in years X, X+1, X+2, X+3... to documents published during year X. When referred to the period 1996-2021, all published documents during this period are considered.
H index: The h index is a country's number of articles (h) that have received at least h citations. It quantifies both a country's scientific productivity and its scientific impact, and it is also applicable to scientists, journals, etc.
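The h index and citations-per-document definitions above can be made concrete with a short example. This is a minimal sketch of the standard h-index calculation over a list of per-document citation counts, not SCImago's own implementation; the sample counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that at least h documents have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def citations_per_document(citations):
    """Average number of citations per document (0.0 for an empty list)."""
    return sum(citations) / len(citations) if citations else 0.0


# Hypothetical citation counts for five documents.
sample = [10, 8, 5, 4, 3]
h = h_index(sample)
average = citations_per_document(sample)
print(h, average)
```

With the sample counts, four documents have at least four citations each but there are not five with at least five, so the h index is 4.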
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents the number of teaching staff in public colleges and universities, categorized by country, university title, and gender. It supports higher education workforce analysis and planning.
https://www.ontario.ca/page/open-government-licence-ontario
Data from the Ministry of Colleges and Universities' College Enrolment Statistical Reporting system.
Provides aggregated key enrolment data for college students, such as:
To protect privacy, numbers are suppressed in categories with fewer than 10 students.
This dataset was created by Mikayel Grigoryan
Data from Johns Hopkins University
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains a set of daily time series representing the percentage changes of 6 aspects due to COVID-19: retail/recreation, grocery/pharmacy, parks, workplaces, residential and transit stations in a set of countries and regions. This file contains 559 daily time series which represent the average percentage changes of the above 6 aspects in 131 countries.
The original dataset contains missing values and they have been replaced by zeros.
license: apache-2.0
tags:
- africa
- sustainable-development-goals
- world-health-organization
- development
Schools with access to computers for pedagogical purposes (%)
Dataset Description
This dataset provides country-level data for the indicator "4.a.1 Schools with access to computers for pedagogical purposes (%)" across African nations, sourced from the World Health Organization's (WHO) data portal on Sustainable Development Goals (SDGs). The data is... See the full description on the dataset page: https://huggingface.co/datasets/electricsheepafrica/schools-with-access-to-computers-for-pedagogical-purposes-for-african-countries.
https://creativecommons.org/publicdomain/zero/1.0/
This list contains the names of almost all the universities in the world along with their:
1. Webpage(s)
2. Domain name(s)
3. Country of university as an alpha-2 code
4. Country of university
5. Name of the university
6. State/province
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about universities in New York. It has 10 rows. It features 2 columns including country.
http://opendatacommons.org/licenses/dbcl/1.0/
Dataset Name: World University Rankings 2023 - Cleaned
This dataset is a cleaned and preprocessed version of the "World University Rankings 2023" originally provided by Syed Ali Taqi on Kaggle. The original dataset included 13 features, covering information about universities worldwide.
This cleaned version of the dataset has undergone rigorous preprocessing, including handling missing values and encoding categorical features, resulting in a dataset with enhanced usability and cleanliness. It now consists of 2,341 rows and 2,361 columns, providing valuable insights for data analysis, machine learning, and research in the field of higher education.
The original version of the "World University Rankings 2023" dataset was a comprehensive collection of data on 1,799 universities across 104 countries and regions. While it provided valuable insights into higher education worldwide, it presented some challenges due to missing values, inconsistencies, and a mix of data types.
Original Dataset Source: World University Rankings 2023
In this cleaned version of the dataset, significant efforts have been made to enhance its quality and usability. The following improvements were made:
Handling Missing Values:
- All missing values, including NaN and Null values, have been addressed for every feature in the dataset.
- Specifically, missing values in the "Name of University" and "Location" columns have been replaced with meaningful placeholders: "Unknown University" and "Unknown Location," respectively.
Encoding and Transformation:
- One-hot encoding has been applied to the "Name of University" and "Location" columns, converting categorical data into a numerical format suitable for analysis and modeling.
- The "Female Ratio" and "Male Ratio" columns have been separated, allowing for more straightforward analysis of gender ratios.
- "OverAll Score" has been divided into "OverAll Score Min" and "OverAll Score Max" columns, providing insights into the range of scores.
- "International Student" values have been encoded as fractional values, making them easier to interpret and analyze.
- Several features, including "Female Ratio," "Male Ratio," "OverAll Score Min," "OverAll Score Max," "No of Student," and "International Student," have been encoded as numerical values, improving their compatibility with data analysis and modeling techniques.
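Two of the transformations listed above can be illustrated with a short sketch. This is not the preprocessing code from the referenced repository. It assumes that an "OverAll Score" value is either a single number or a range written as two numbers with a separator between them, and it shows a plain one-hot encoding of a categorical column. The function names and sample values are hypothetical.

```python
import re


def split_score_range(value):
    """Split a score such as '51.0-55.9' into (min, max); a single number gives (x, x)."""
    numbers = [float(match) for match in re.findall(r"\d+(?:\.\d+)?", value)]
    if not numbers:
        return (None, None)
    return (min(numbers), max(numbers))


def one_hot(values, prefix):
    """Return one dict per value with a 0/1 entry for every distinct category."""
    categories = sorted(set(values))
    return [
        {f"{prefix}_{category}": int(value == category) for category in categories}
        for value in values
    ]


# Hypothetical sample values in the assumed formats.
score_range = split_score_range("51.0-55.9")
score_single = split_score_range("96.4")
encoded = one_hot(["A", "B", "A"], prefix="Location")
print(score_range, score_single, encoded[0])
```

One-hot encoding creates one column per distinct value, so applying it to a column of university names adds roughly as many columns as there are universities. That would be consistent with the large column count reported for the cleaned dataset.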
These enhancements have transformed the dataset into a cleaned and well-structured resource for data analysis, machine learning, and research in the field of higher education. Researchers and data enthusiasts can now explore and gain valuable insights from this improved dataset with confidence.
Whether you are conducting exploratory data analysis, building predictive models, or conducting research, this cleaned version of the dataset provides a solid foundation for your analytical endeavors.
For more details on the data preprocessing steps and to access the cleaned dataset, you can visit the GitHub repository where the preprocessing was performed: GitHub Repository
If you find value in this "World University Rankings 2023 - Cleaned" dataset, please consider upvoting it on Kaggle to boost its visibility. Additionally, star our GitHub repository to show your support for the data preprocessing efforts. Your support is greatly appreciated!