CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
This repository contains a dataset of higher education institutions in the United States of America. The dataset was compiled for a cybersecurity study of the websites of American higher education institutions [1]. The data is being made publicly available to promote open science principles [2].
The data includes the following fields for each institution:
The dataset was obtained from the Integrated Postsecondary Education Data System (IPEDS) website [3], which is administered by the National Center for Education Statistics (NCES). NCES is the primary federal entity for collecting and analyzing education-related data in the United States. The data was collected on February 2, 2023.
The initial list of institutions was derived from the IPEDS database using the following criteria: (1) US institutions only, (2) degree-granting institutions, primarily bachelor's or higher, and (3) sector, which includes: public 4-year or above, private not-for-profit 4-year or above, private for-profit 4-year or above, public 2-year, private not-for-profit 2-year, private for-profit 2-year, public less than 2-year, private not-for-profit less than 2-year, and private for-profit less than 2-year.
The following variables were added to the list of institutions: control of the institution, state abbreviation, degree-granting status, status of the institution, and the institution's website address. This resulted in a report with 1,979 institutions.
The institution's status was labeled with the following values: A (Active), N (New), R (Restored), M (Closed in the current year), C (Combined with another institution), D (Deleted, out of business), I (Inactive due to hurricane-related issues), O (Outside IPEDS scope), P (Potential new/add institution), Q (Potential institution reestablishment), W (Potential addition outside IPEDS scope), X (Potential restoration outside the scope of IPEDS), and G (Perfect Children's Campus).
A filter was applied to the report to retain only institutions with an A, N, or R status, resulting in 1,978 institutions. Finally, a data cleaning process was applied, which involved removing the whitespace at the beginning and end of cell content and duplicate whitespace. The final data were compiled into the dataset included in this repository.
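The status filter and whitespace cleaning described above could be sketched roughly as follows. This is a minimal illustration, not the authors' actual script: the field names (`name`, `status`) and the function names are hypothetical, and rows are assumed to be dictionaries of strings, for example as produced by `csv.DictReader`.

```python
import re

# Status codes retained by the filter described in the text.
ACTIVE_STATUSES = {"A", "N", "R"}


def clean_cell(value):
    """Collapse runs of whitespace to one space and trim both ends."""
    return re.sub(r"\s+", " ", value).strip()


def clean_and_filter(rows, status_field="status"):
    """Clean every cell, then keep rows whose status is A, N, or R.

    `status_field` is a hypothetical column name; adjust it to the
    actual header of the IPEDS export.
    """
    cleaned = [{key: clean_cell(value) for key, value in row.items()} for row in rows]
    return [row for row in cleaned if row[status_field] in ACTIVE_STATUSES]
```

The text applies the filter first and the cleaning second. The sketch cleans first so that a status value padded with spaces is not dropped by accident; for already-clean status codes both orders give the same result.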
This data is available under the Creative Commons Zero (CC0) license and can be used for any purpose, including academic research purposes. We encourage the sharing of knowledge and the advancement of research in this field by adhering to open science principles [2].
If you use this data in your research, please cite the source and include a link to this repository. To properly attribute this data, please use the following DOI: 10.5281/zenodo.7614862
If you have any updates or corrections to the data, please feel free to open a pull request or contact us directly. Let's work together to keep this data accurate and up-to-date.
We would like to acknowledge the support of the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), within the project "Cybers SeC IP" (NORTE-01-0145-FEDER-000044). This study was also developed as part of the Master in Cybersecurity Program at the Instituto Politécnico de Viana do Castelo, Portugal.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This repository contains a dataset of 400 higher education institutions in Germany, including universities, universities of applied sciences, and higher institutes such as institutes of engineering and biotechnology, among a few others. The dataset was compiled for a cybersecurity investigation of the websites of German higher education institutions [1]. The data is being made publicly available to promote open science principles [2].
The data includes the following fields for each institution:
The methodology for creating the dataset involved obtaining data from two sources: the European Higher Education Sector Observatory (ETER) [3], and the Eurostat NUTS (Nomenclature of Territorial Units for Statistics) classifications for 2013-16 [4] and 2021 [5]. The data was collected on December 26, 2024.
This section outlines the methodology used to create the dataset of Higher Education Institutions (HEIs) in Germany. The dataset consolidates information from several sources, processes the data, and enriches it with regional details.
Data Sources
eter-export-2021-DE.xlsx
NUTS2013-NUTS2016.xlsx
NUTS2021.xlsx
Data Cleaning and Preprocessing
Column Renaming
Columns in the raw dataset were renamed for consistency and readability. Examples include:
ETER ID → ETER_ID
Institution Name → Name
Legal status → Category
Value Replacement
The Category column was cleaned, with government-dependent institutions classified as "public."
Handling Missing or Incorrect Data
ETER_ID. For instance:
DE0012 (updated to www.zeppelin-university.com)
FR0906 (updated to hmtm.de)
FR0104 (updated to www.dhfpg.de)
FR0466 (updated to fhf.brandenburg.de)
FR0907 (updated to hr-nord.niedersachsen.de)
FR0333 (updated to www.srh-university.de)
Regional Data Integration
Final Dataset
The final dataset was saved as a CSV file, germany-heis.csv, encoded in UTF-8 for compatibility. It includes detailed information about HEIs in Germany, their categories, regional affiliations, and membership in European alliances.
Summary
This methodology is intended to make the dataset accurate, consistent, and enriched with regional and institutional details. The final dataset is intended to serve as a reliable resource for analyzing German HEIs.
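The column renaming and value replacement steps described above could look roughly like the sketch below. It is an illustration under assumptions rather than the authors' code: the three renames come from the text, but the function names, the raw category label "Private government-dependent", and the example row are hypothetical.

```python
# Renames listed in the text (raw ETER header -> cleaned header).
COLUMN_RENAMES = {
    "ETER ID": "ETER_ID",
    "Institution Name": "Name",
    "Legal status": "Category",
}


def rename_columns(row, renames=COLUMN_RENAMES):
    """Return a copy of the row with known column names replaced."""
    return {renames.get(name, name): value for name, value in row.items()}


def normalize_category(value):
    """Classify government-dependent institutions as "public".

    Assumption: the raw label contains the phrase "government-dependent".
    Any other label is returned trimmed but otherwise unchanged.
    """
    label = value.strip()
    if "government-dependent" in label.lower():
        return "public"
    return label


# Hypothetical example row, not taken from the dataset.
raw = {
    "ETER ID": "XX0000",
    "Institution Name": "Example University",
    "Legal status": "Private government-dependent",
}
row = rename_columns(raw)
row["Category"] = normalize_category(row["Category"])
```

Keeping the renames in a single mapping makes it easy to compare the cleaned headers against the raw export when the source file changes.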
This data is available under the Creative Commons Zero (CC0) license and can be used for any purpose, including academic research purposes. We encourage the sharing of knowledge and the advancement of research in this field by adhering to open science principles [2].
If you use this data in your research, please cite the source and include a link to this repository. To properly attribute this data, please use the following DOI: 10.5281/zenodo.7614862
If you have any updates or corrections to the data, please feel free to open a pull request or contact us directly. Let's work together to keep this data accurate and up-to-date.
We would like to acknowledge the support of the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), within the project "Cybers SeC IP" (NORTE-01-0145-FEDER-000044). This study was also developed as part of the Master in Cybersecurity Program at the Instituto Politécnico de Viana do Castelo, Portugal.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This table includes repositories with at least 10 shared datasets on the site.
This dataset includes data files on the individual data of all respondents to international surveys conducted and published between 1995 and 2022 that include data on trust in institutions. It also includes answers to three questions about democracy, covering satisfaction, appreciation of the level of democracy, and support for democracy. There are several files:
1) a general file including all combined and harmonized data;
2) a file including the same data but only for respondents who answered at least one question on trust.
The next four files allow for a four-level multilevel analysis with HLM, with one file per level:
3) a file at the measurement level (level 1), including one line per trust-related response, per respondent;
4) a file including only information on respondents (level 2);
5) a file including information on surveys, and thus on country-year-data source (level 3);
6) a file including information on countries combined with data sources: region and characteristics of sources (level 4).
All files include identifiers for the country, year, and data source.
Attribution 1.0 (CC BY 1.0) https://creativecommons.org/licenses/by/1.0/
License information was derived automatically
Higher Education Institutions in Poland Dataset
This repository contains a dataset of higher education institutions in Poland. The dataset comprises 131 public higher education institutions and 216 private higher education institutions in Poland. The data was collected on 24/11/2022.
This dataset was compiled in response to a cybersecurity investigation of Poland's higher education institutions' websites [1]. The data is being made publicly available to promote open science principles [2].
Data
The data includes the following fields for each institution:
Methodology
The dataset was compiled using data from two primary sources:
For the international names in English, the following methodology was employed:
Both Polish and English names were retained for each institution. This decision was based on the fact that some universities do not have their English versions available in official sources.
English names were primarily sourced from:
In instances where English names were not readily available from the aforementioned sources, the GPT-3.5 model was employed to propose suitable names. These proposed names are distinctly marked in blue within the dataset file (hei_poland_en.xls).
Usage
This data is available under the Creative Commons Zero (CC0) license and can be used for academic research purposes. We encourage the sharing of knowledge and the advancement of research in this field by adhering to open science principles [2].
If you use this data in your research, please cite the source and include a link to this repository. To properly attribute this data, please use the following DOI:
10.5281/zenodo.8333573
Contribution
If you have any updates or corrections to the data, please feel free to open a pull request or contact us directly. Let's work together to keep this data accurate and up-to-date.
Acknowledgment
We would like to express our gratitude to the Ministry of Education and Science of Poland and the RAD-on system for providing the information used in this dataset.
We would like to acknowledge the support of the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), within the project "Cybers SeC IP" (NORTE-01-0145-FEDER-000044). This study was also developed as part of the Master in Cybersecurity Program at the Polytechnic University of Viana do Castelo, Portugal.
References
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The OpenAIRE Graph is exported as several files, so you can download only the parts you are interested in.
publication_[part].tar: metadata records about research literature (includes types of publications listed here)
dataset_[part].tar: metadata records about research data (includes the subtypes listed here)
software.tar: metadata records about research software (includes the subtypes listed here)
otherresearchproduct_[part].tar: metadata records about research products that cannot be classified as research literature, data or software (includes types of products listed here)
organization.tar: metadata records about organizations involved in the research life-cycle, such as universities, research organizations, funders.
datasource.tar: metadata records about data sources whose content is available in the OpenAIRE Graph. They include institutional and thematic repositories, journals, aggregators, funders' databases.
project.tar: metadata records about project grants.
relation_[part].tar: metadata records about relations between entities in the graph.
communities_infrastructures.tar: metadata records about research communities and research infrastructures
Each file is a tar archive containing gz files, each with one JSON record per line. Each record complies with the schema available at http://doi.org/10.5281/zenodo.14608526. The documentation for the model is available at https://graph.openaire.eu/docs/data-model/
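Given the layout described above (a tar archive of gz files, one JSON record per line), the records could be streamed without unpacking the archive to disk. The following is a minimal sketch using only the Python standard library; the function name `iter_records` is illustrative, and it assumes every member of interest has a name ending in `.gz` and is UTF-8 encoded.

```python
import gzip
import io
import json
import tarfile


def iter_records(tar_path):
    """Yield one parsed JSON object per line from every .gz member of a tar archive."""
    with tarfile.open(tar_path, mode="r") as archive:
        for member in archive:
            # Skip directories and anything that is not a gzip member.
            if not member.isfile() or not member.name.endswith(".gz"):
                continue
            raw = archive.extractfile(member)
            with gzip.open(raw, mode="rt", encoding="utf-8") as lines:
                for line in lines:
                    line = line.strip()
                    if line:
                        yield json.loads(line)
```

Because the function is a generator, a caller can stop after the first few records, which is convenient for inspecting the schema of a large part file before processing all of it.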
Learn more about the OpenAIRE Graph at https://graph.openaire.eu.
Discover the graph's content on OpenAIRE EXPLORE and our API for developers.
This deposition contains:
192,934,523 publications,
73,443,566 datasets,
596,316 software,
24,797,142 other research products,
141,568 datasources,
3,482,537 projects,
454,601 organizations,
34 communities,
7,241,517,003 relations
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This table includes websites with at least 10 shared datasets on the site.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset ‘Open Access-related Infrastructures and Services at German Universities’ (OARIS) was compiled in the context of the BMBF-funded project ‘OAUNI - Entwicklung und Einflussfaktoren des Open-Access-Publizierens an Universitäten in Deutschland’. The aim of the project is to analyze the uptake of Open Access (OA) at German universities and to identify the most important determinants. As suggested by its name, the dataset is restricted to German universities and collects information about OA infrastructures and services at these institutions. Because the dataset is the result of a project rather than a maintained OA information resource, there are no plans to update the data. OARIS includes structural data about German universities as well as data about institutional repositories, institutional OA policies, publication funds, university presses, and journals hosted by Open Journal Systems (OJS). The data collection and data cleaning took place between May 2020 and May 2021. Table 1 provides more detailed information about the sources of data used for OARIS.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Using Web of Science and Unpaywall data, we here provide an update of Open Access (OA) levels of Dutch universities, for 2016 and 2017.
Our previous analysis (10.5281/zenodo.1133759 and 10.7287/peerj.preprints.3520v1) looked at the OA classification included in Web of Science (gold and green OA, based on Unpaywall data) and supplemented it with a breakdown of gold OA into pure gold, hybrid and bronze, taken directly from Unpaywall (formerly OADOI). Here, we improve on this by running all DOIs retrieved from WoS through Unpaywall (using the web interface that allows batch checking of up to 10,000 DOIs at a time). Unlike WoS, Unpaywall includes author-submitted versions in its green OA classification, resulting in more complete green OA levels.
In addition, since our initial analysis of December 2017, Unpaywall has considerably expanded its coverage of institutional repositories (IRs) (see https://unpaywall.org/sources). This now includes the IRs of all Dutch universities.
Taken together, the current data show higher levels of green open access, including author-submitted versions, compared to our previous analysis.
In this update, we include output (articles and reviews) from 2016 and 2017 for all 14 universities in the Netherlands.
The following categories are distinguished (description taken from Piwowar et al., 2018, doi: 10.7717/peerj.4375)
Data for Dutch universities were collected from Web of Science using the organization-enhanced field. Only articles and reviews were included. DOIs were extracted from the Web of Science export and run through the Unpaywall Simple Query Tool. The resulting Unpaywall data were then classified by OA type using a simple formula in Excel (to be replaced by an R script in a future update). The Excel template used is included in this dataset, as are the OADOI API output for each Dutch university's article subset and the lists of DOIs derived from Web of Science. The dataset also includes summarized data and three charts generated from these data, showing levels of different types of OA for 2016, 2017 and the two years compared.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The database Cretan Institutional Inscriptions was created as part of the PhD research project in Ancient Heritage Studies Kretikai Politeiai: Cretan Institutions from VII to I century BC, carried out at the University of Venice Ca’ Foscari by Irene Vagionakis from 2016 to 2019, under the supervision of Claudia Antonetti and Gabriel Bodard. The research project aimed at collecting the epigraphic sources related to the institutional elements of the many political entities of Crete, with a view to highlighting the specificity of each context in the period between the rise of the poleis and the Roman conquest of the island. The main component of the database consists of the epigraphic collection of the 600 inscriptions constituting the core of the documentary base of the study, for each of which an XML edition compliant with the TEI EpiDoc international standard was created. Each EpiDoc edition includes a descriptive and a bibliographic lemma, the text of the inscription, a selective apparatus criticus and a commentary focused on the institutional data offered by the document. In addition to the epigraphic collection, the database includes a collection of the main related literary sources, a catalogue of the attested Cretan institutions (assemblies, boards, officials, associations, civic subdivisions, social statuses, age classes, months, festivities and other celebrations, institutional practices, institutional instruments, public spaces) and a catalogue of the political entities of Crete (poleis, koina, dependent communities, extra-urban sanctuaries, hegemonic alliances). Data and SW available at https://github.com/IreneVagionakis/CretanInscriptions
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract:
Soil spectroscopy has emerged as a solution to the limitations of traditional soil surveying and analysis methods, addressing the challenges of time and financial resources. Analyzing the soil's spectral reflectance makes it possible to observe the soil composition and evaluate several attributes simultaneously, because matter, when exposed to electromagnetic energy, leaves a "spectral signature" that enables such evaluations. The Soil Spectral Library (SSL) consolidates soil spectral patterns from a specific location, facilitating accurate modeling and reducing time, cost, chemical products, and waste in surveying and mapping processes. An open access SSL therefore benefits society by providing a curated collection of free data for research and commercial applications alike.
BSSL Description and Usefulness
The Brazilian Soil Spectral Library (BSSL), available at https://bibliotecaespectral.wixsite.com/english, is a comprehensive repository of soil spectral data. Coordinated by JAM Demattê and managed by the GeoCiS research group, the BSSL was initiated in 1995 and published by Demattê and collaborators in 2019. This initiative stands out due to its coverage of diverse soil types, given Brazil's significance in the agricultural and environmental domains and its status as the fifth largest territory in the world (IBGE, 2023). In addition, a Middle Infrared (MIR) dataset has been published (Mendes et al., 2022), part of which is included in this repository. The database covers 16,084 sites and includes harmonized physicochemical and spectral (Vis-NIR-SWIR and MIR range) soil data from various sources at 0-20 cm depth. All soil samples have Vis-NIR-SWIR data, but not all have MIR data.
The BSSL provides open and free access to curated data for the scientific community and interested individuals. Unrestricted access to the BSSL supports researchers in validating their results by comparing measured data with predicted values. This initiative also facilitates the development of new models and the improvement of existing ones. Moreover, users can employ the library to test new models and extract information about previously unknown soil properties. With its extensive coverage of tropical soil classes, the BSSL is considered one of the most significant soil spectral libraries worldwide, with 42 institutions and 61 researchers participating. Of these, 47 collaborators from 29 institutions have authorized the release of their data. Other researchers can also provide their data upon request through the coordinator of this initiative.
The data from the BSSL project can also help wet labs to improve their analytical capabilities, contributing to developing hybrid wet soil laboratory techniques and digital soil maps while informing decision-makers in formulating conservation and land use policies. The soil's capacity for different land uses promotes soil health and sustainability.
Coverage
The BSSL data covers all regions of Brazil, including 26 states and the Federal District. It is in .xlsx format and has a total size of 305 MB. The table is structured in sheets, with rows for observations and columns representing various soil attributes in the surface layer, from 0 to 20 cm depth. The database includes environmental and physicochemical properties (20 columns and 16,084 rows), Vis-NIR-SWIR spectral bands (2151 columns and 16,084 rows), and MIR channels (681 columns and 1783 rows). A unique ID column can be used to merge the sheets for each attribute or spectral range.
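Merging sheets on the unique ID column could be sketched as a left join, so that samples without MIR data are kept with their attribute columns only. This is a minimal illustration with hypothetical names: the function `left_join`, the columns `Clay` and `MIR_4000`, and the sample values are invented for the example, and rows are assumed to be dictionaries already loaded from the sheets.

```python
def left_join(left_rows, right_rows, key="ID_unique"):
    """Attach matching right-hand columns to each left-hand row.

    Rows on the left with no match on the right are kept unchanged,
    which mirrors the fact that not every sample has MIR data.
    """
    right_by_id = {row[key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        merged = dict(row)
        match = right_by_id.get(row[key], {})
        for name, value in match.items():
            if name != key:
                merged[name] = value
        joined.append(merged)
    return joined


# Hypothetical rows: two samples with attributes, one of them with MIR data.
attributes = [
    {"ID_unique": 1, "Clay": 300},
    {"ID_unique": 2, "Clay": 150},
]
mir = [{"ID_unique": 2, "MIR_4000": 0.4}]
merged = left_join(attributes, mir)
```

A left join from the attributes sheet keeps all 16,084 records; joining from the MIR sheet instead would restrict the result to the samples that have MIR spectra.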
Accessing original data source
Any use of these data must cite their source; failure to do so may constitute copyright infringement. Three mechanisms are available for users to reach the original data contributors:
a) Refer to sheet two for name and code-based searches;
b) Visit the website https://bibliotecaespectral.wixsite.com/english/lista-de-cedentes to locate the contributors' list by Brazilian state;
c) Visit the website of the Brazilian Soil Spectral Service (Braspecs), http://www.besbbr.com.br/, an online platform for soil analysis that uses part of the current SSL (Demattê et al., 2022). It was developed and is managed by GeoCiS, and data owners from all over the country can be found there.
Proceeding to data analysis
We registered and organized the samples at the ESALQ/USP Soil Laboratory. Some samples arrived without preliminary data analyses, so we analyzed them for soil organic matter (SOM), granulometry, cation exchange capacity (CEC), pH in water, and the presence of Ca, Mg, and Na, following the recommendations of Donagemma et al. (2011).
The GeoCiS research group performed spectral analyses following the procedures described by Bellinaso et al. (2010). Demattê et al. (2019) provide detailed methods for sampling, preparation, and soil analyses, including reflectance spectroscopy. Latitude and longitude data can be requested directly from the data owner. In summary, the following steps are involved in data acquisition.
a) We subjected the soil samples to a preliminary treatment, which involved drying them in an oven at 45°C for 48 hours, grinding them, and sieving them through a 2mm mesh;
b) We placed the samples in Petri dishes with a diameter of 9 cm and a height of 1.5 cm;
c) We homogenized and flattened the surface of the samples to reduce the shading caused by larger particles or foreign bodies, making them ready for spectral readings;
d) The spectral analyses took place in a darkened room to avoid interference from natural light. We used a computer to record the electromagnetic pulses through an optical fiber connected to the sensor, capturing the spectral response of the soil sample;
e) We obtained reflectance data in the Visible-Near Infrared-Shortwave Infrared (Vis-NIR-SWIR) range using a FieldSpec 3 spectroradiometer (Analytical Spectral Devices, ASD, Boulder, CO), which operates in the spectral range from 350 to 2500 nm;
f) The sensor had a spectral resolution of 3 nm from 350-700 nm and 10 nm from 700-2500 nm, automatically interpolated to 1 nm spectral resolution in the output data, resulting in 2151 channels (or bands); and
g) We positioned the lamps at 90° from each other and 35 cm away from the sample, with a zenith angle of 30°.
The sensor captured the light reflected through the fiber optic cable, which was positioned 8 cm from the sample's surface.
We used two 50W halogen lamps as the power source for the artificial light. It's important to note that we took three readings for each sample at different positions by rotating the Petri dish by 90°.
Each reading represents the average of 100 scans taken by the sensor. From these three readings, we calculated the final spectrum of the samples. Notably, the laboratory's equipment and procedures for soil sample spectral analyses followed the ASD's recommendations, particularly about sensor calibration using a white spectralon plate as a 100% reflectance standard.
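The arithmetic behind the 2151 channels and the combination of the three readings can be sketched as follows. The text does not state how the final spectrum was calculated from the three readings, so the arithmetic mean used here is an assumption, and both function names are illustrative.

```python
def wavelengths_nm(start=350, stop=2500, step=1):
    """Wavelength grid for the interpolated output, inclusive of both ends."""
    return list(range(start, stop + 1, step))


def mean_spectrum(readings):
    """Average several reflectance readings channel by channel.

    Assumption: the final spectrum is the arithmetic mean of the
    readings taken at different Petri dish positions.
    """
    if not readings:
        raise ValueError("at least one reading is required")
    n_channels = len(readings[0])
    if any(len(reading) != n_channels for reading in readings):
        raise ValueError("all readings must have the same number of channels")
    return [sum(values) / len(readings) for values in zip(*readings)]


# 350 to 2500 nm at 1 nm spacing gives 2500 - 350 + 1 channels.
channel_count = len(wavelengths_nm())
```

Checking that every reading has the same number of channels before averaging guards against mixing spectra exported with different interpolation settings.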
For the analysis in the Middle Infrared (MIR) spectral region, we followed the procedures outlined by Mendes et al. (2022). We milled the soil fraction smaller than 2 mm, sieved it to 0.149 mm, and scanned it using a Fourier Transform Infrared (FT-IR) alpha spectroradiometer (Bruker Optics Corporation, Billerica, MA 01821, USA) equipped with a DRIFT accessory.
The spectroradiometer measured the diffuse reflectance using Fourier transformation in the spectral range from 4000 cm-1 to 600 cm-1, with a resolution of 2 cm-1. We conducted these measurements in the Geotechnology Laboratory of the Department of Soil Science at Esalq-USP. We took the average of 32 successive readings to obtain a soil spectrum. Sensor calibration took place before each spectral acquisition of the sample set by standardizing it against the maximum reflectance of a gold plate.
Dataset characterization
The database, named BSSL_DB_Key_Soils, has five sheets containing the key soil attributes, the Vis-NIR-SWIR and MIR datasets, descriptions of the contributors, and the proximal sensing methods used for spectral soil analysis. The sheets can be linked by their "ID_Unique" columns, which match the corresponding rows across data types. Some cells are empty because that is how collaborators provided the data; we decided to keep these records in the database because they contain other key soil attributes. Each column in the data sheets is described as follows:
Sheet 1. BSSL_Soil_Attributes_Dataset
Column 1. ID_unique: Sequential code assigned to every record;
Column 2. Owner code: Acronym assigned to each contributor who allowed access to their proprietary data;
Column 3. Vis_NIR_SWIR_availability: availability of spectral data in visible, near-infrared, and shortwave infrared ranges;
Column 4. MIR_availability: availability of spectral data in the middle infrared range;
Column 5. Sampling: type of soil sampling;
Column 6. Depth_cm: soil surface layer depth in centimeters;
Column 7. Region: Brazilian geographical region of samples' source;
Column 8. Municipality: Brazilian municipality of samples' source;
Column 9. State: Brazilian Federation Unit of samples'
The repository of higher education libraries includes all libraries associated with a higher education institution in the Wallonia-Brussels Federation (FWB), along with a range of information about them.
Link
The library repository is associated with the repository of higher education institutions.
Granularity
The ID field uniquely identifies each row in the dataset.
Source
The data come from collections organized by the "Commission for Libraries and Collective Academic Services" (CBS). For more information on the CBS visit this page.
Change log
2022-02-24: Updating table structure and metadata
2022-04-19: Updating the table structure and metadata (1 line per library instead of 1 line per domain and per topic)
2022-04-20: Updating the map tooltip
2022-12-12: Updated Geographic Coverage Metadata (Correction: Error "zero or more than one hit")
2022-12-22: Addition of the data dictionary
2023-01-27: Data Correction (ULB)