56 datasets found
  1. Data from: Use of Linked Data principles for semantic management of scanned...

    • scielo.figshare.com
    jpeg
    Updated Jun 1, 2023
    Cite
    Luciane Lena Pessanha Monteiro; Mark Douglas de Azevedo Jacyntho (2023). Use of Linked Data principles for semantic management of scanned documents [Dataset]. http://doi.org/10.6084/m9.figshare.7512719.v1
    Explore at:
    jpeg
    Available download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    SciELO journals
    Authors
    Luciane Lena Pessanha Monteiro; Mark Douglas de Azevedo Jacyntho
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The study addresses the use of the Semantic Web and Linked Data principles proposed by the World Wide Web Consortium for the development of a Web application for the semantic management of scanned documents. The main goal is to record scanned documents, describing them in a way that machines can understand and process, filtering content and assisting in searching for such documents when a decision-making process is under way. To this end, machine-understandable metadata, created through the use of reference Linked Data ontologies, are associated with documents, creating a knowledge base. To further enrich the process, (semi)automatic mashup of these metadata with data from the Web of Linked Data is carried out, considerably increasing the scope of the knowledge base and making it possible to extract new data related to the content of stored documents from the Web and combine them, without the user making any effort or perceiving the complexity of the whole process.
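    The annotation workflow described above, in which machine-understandable metadata from reference ontologies are attached to scanned documents, can be sketched with plain triples. A minimal illustration follows; the document URI, the choice of Dublin Core terms, and the values are hypothetical, not taken from the study:

```python
# Minimal sketch: describing a scanned document with Dublin Core-style
# metadata as RDF triples in N-Triples syntax. The document URI and the
# property values below are hypothetical examples, not from the dataset.

DOC = "http://example.org/doc/42"          # hypothetical document URI
DCT = "http://purl.org/dc/terms/"          # Dublin Core terms namespace

triples = [
    (DOC, DCT + "title",   "Invoice 2023-07"),
    (DOC, DCT + "creator", "Accounting Dept."),
    (DOC, DCT + "format",  "image/jpeg"),
]

def to_ntriples(triples):
    """Serialize (subject, predicate, literal) tuples as N-Triples lines."""
    return "\n".join('<%s> <%s> "%s" .' % (s, p, o) for s, p, o in triples)

print(to_ntriples(triples))
```

    In a real system an RDF library and the project's actual reference ontologies would be used; the point here is only that each statement about the document becomes a machine-processable triple in the knowledge base.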

  2. Data from: The FAIR Assessment Conundrum: Reflections on Tools and Metrics -...

    • data.europa.eu
    • data.niaid.nih.gov
    • +1more
    unknown
    Updated Jul 3, 2025
    + more versions
    Cite
    Zenodo (2025). The FAIR Assessment Conundrum: Reflections on Tools and Metrics - Data Set [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-10986748?locale=cs
    Explore at:
    unknown (7186)
    Available download formats
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data sets accompanying the paper "The FAIR Assessment Conundrum: Reflections on Tools and Metrics", an analysis of a comprehensive set of FAIR assessment tools and of the metrics used by these tools for the assessment.

    The data set "metrics.csv" consists of the metrics collected from several sources linked to the analysed FAIR assessment tools. It is structured into 11 columns:

    • tool_id, tool_name: the identifier we assigned to each analysed tool and the tool's full name.
    • metric_discarded: the selection we applied to the collected metrics; metrics created for testing purposes or written in a language other than English were excluded. Boolean; TRUE means the metric was discarded.
    • metric_fairness_scope_declared, metric_fairness_scope_observed: the declared intent of the metric with respect to the FAIR principles assessed, and the intent we actually observed. Possible values: (a) a letter of the FAIR acronym (for metrics without a declared link to a specific FAIR principle), (b) one or more identifiers of FAIR principles (F1, F2, ...), (c) n/a, if no FAIR references were declared, or (d) none, if no FAIR references were observed.
    • metric_id, metric_text: the identifiers of the metrics and their textual, human-oriented content.
    • metric_technology: the technologies (a term used in its widest sense) mentioned or used by the metric for the assessment. These include very diverse typologies, ranging from (meta)data formats to standards, semantic technologies, protocols, and services. For tools implementing automated assessments, the listed technologies also take the available code and documentation into account, not just the metric text.
    • metric_approach: the type of implementation observed in the assessment. Implementation types were identified bottom-up on the metrics grouped by their metric_fairness_scope_declared values; consequently, while the labels used to build the implementation-type strings are the same, their combination and specialisation vary with the characteristics of the metric set analysed. The main labels are: (a) 3rd party service-based, (b) documentation-centred, (c) format-centred, (d) generic, (e) identifier-centred, (f) policy-centred, (g) protocol-centred, (h) metadata element-centred, (i) metadata schema-centred, (j) metadata value-centred, (k) service-centred, and (l) na.
    • provenance, last_accessed_date: the main source of information about each metric (at least with regard to its text) and the date we last accessed it.

    The data set "classified_technologies.csv" consists of the technologies mentioned or used by the metrics for the assessment. It is structured into 3 columns:

    • technology: the name of the technology mentioned or used by the metrics.
    • class: the type of technology. Possible values: (a) application programming interface, (b) format, (c) identifier, (d) library, (e) licence, (f) protocol, (g) query language, (h) registry, (i) repository, (j) search engine, (k) semantic artefact, and (l) service.
    • discarded: records the exclusion of the value 'linked data', which is too generic to count as a technology. Boolean; TRUE means the technology was discarded.
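    As a usage sketch, the metric_discarded flag described above can be applied when loading metrics.csv. The column names follow the description; the sample rows here are invented, and a real analysis would read the actual file:

```python
# Sketch of reading metrics.csv as described above and dropping the
# metrics flagged as discarded. Column names follow the data set
# description; the sample rows are invented for illustration.
import csv
import io

sample = """tool_id,tool_name,metric_discarded,metric_id
T1,ToolOne,FALSE,M1
T1,ToolOne,TRUE,M2
T2,ToolTwo,FALSE,M3
"""

rows = list(csv.DictReader(io.StringIO(sample)))
# Keep only the metrics that were not excluded from the study.
kept = [r for r in rows if r["metric_discarded"] != "TRUE"]
print([r["metric_id"] for r in kept])   # → ['M1', 'M3']
```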

  3. Data from: DATA QUALITY ON THE WEB: INTEGRATIVE REVIEW OF PUBLICATION...

    • scielo.figshare.com
    tiff
    Updated May 30, 2023
    Cite
    Morgana Carneiro de Andrade; Maria José Baños Moreno; Juan-Antonio Pastor-Sánchez (2023). DATA QUALITY ON THE WEB: INTEGRATIVE REVIEW OF PUBLICATION GUIDELINES [Dataset]. http://doi.org/10.6084/m9.figshare.22815541.v1
    Explore at:
    tiff
    Available download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    Morgana Carneiro de Andrade; Maria José Baños Moreno; Juan-Antonio Pastor-Sánchez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT The exponential increase in published data and the diversity of systems require the adoption of good practices to achieve quality indexes that enable discovery, access, and reuse. To identify good practices, an integrative review was used, together with procedures from the ProKnow-C methodology. After applying the ProKnow-C procedures to the documents retrieved from the Web of Science, Scopus and Library, Information Science & Technology Abstracts databases, an analysis of 31 items was performed. This analysis showed that over the last 20 years the guidelines for publishing open government data had a great impact on the implementation of the Linked Data model in several domains, and that currently the FAIR principles and the Data on the Web Best Practices are the most highlighted in the literature. These guidelines provide guidance on various aspects of data publication, helping to optimize quality regardless of the context in which they are applied. The CARE and FACT principles, on the other hand, although not formulated with the same objective as FAIR and the Best Practices, represent great challenges for information and technology scientists regarding ethics, responsibility, confidentiality, impartiality, security, and transparency of data.

  4. LinkedCT

    • neuinfo.org
    • dknet.org
    • +2more
    Updated Oct 17, 2024
    + more versions
    Cite
    (2024). LinkedCT [Dataset]. http://identifiers.org/RRID:SCR_004585
    Explore at:
    Dataset updated
    Oct 17, 2024
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE. Documented on January 11, 2023. The Linked Clinical Trials (LinkedCT) project aims at publishing the first open Semantic Web data source for clinical trials data. The data exposed by LinkedCT are generated by (1) transforming existing data sources of clinical trials into RDF, and (2) discovering links between the records in the trials data and several other data sources. You may download static data dumps. The LinkedCT data space is published according to the principles of publishing Linked Data, which greatly enhance the adaptability and usability of data on the web. Each entity in LinkedCT is identified by a unique, dereferenceable HTTP Uniform Resource Identifier (URI). When the URI is looked up, related RDF statements about the entity are returned in HTML or RDF/XML depending on the user's agent. Moreover, a SPARQL endpoint is provided as the standard access method for RDF data.
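    As an illustration of the dereferenceable-URI pattern described above, a Linked Data client requests RDF instead of HTML via content negotiation. The sketch below only constructs the request without sending it, since the resource is no longer in service; the entity URI is a hypothetical example of the LinkedCT URI pattern:

```python
# Sketch of how a Linked Data client would dereference an entity URI
# with content negotiation. The request is only built, not sent:
# LinkedCT is no longer in service, and the URI below is a hypothetical
# example of its URI pattern.
import urllib.request

entity_uri = "http://linkedct.org/resource/trials/NCT00000102"

req = urllib.request.Request(entity_uri)
# Ask the server for RDF/XML rather than the default HTML rendering.
req.add_header("Accept", "application/rdf+xml")

print(req.get_header("Accept"))
```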

  5. Linked Railway Data Project

    • data.wu.ac.at
    xhtml
    Updated Jul 30, 2016
    Cite
    Linking Open Data (2016). Linked Railway Data Project [Dataset]. https://data.wu.ac.at/schema/datahub_io/NmU5YjExYjAtM2YyMy00Yzk0LWE0ODEtZTFlYmE1MzgwNzUw
    Explore at:
    xhtml
    Available download formats
    Dataset updated
    Jul 30, 2016
    Dataset provided by
    Linking Open Data
    Description

    About

    Bringing together data on the United Kingdom's railway network under linked data principles.

  6. Towards FAIRer Biological Knowledge Networks Using a Hybrid Linked Data and...

    • ckan.grassroots.tools
    html, pdf
    Updated Aug 7, 2019
    Cite
    Rothamsted Research (2019). Towards FAIRer Biological Knowledge Networks Using a Hybrid Linked Data and Graph Database Approach [Dataset]. https://ckan.grassroots.tools/dataset/571131d4-08bf-41cc-ad4a-a6605bd05e37
    Explore at:
    html, pdf
    Available download formats
    Dataset updated
    Aug 7, 2019
    Dataset provided by
    Rothamsted Research
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Abstract: The speed and accuracy of new scientific discoveries, be it by humans or artificial intelligence, depend on the quality of the underlying data and on the technology to connect, search and share the data efficiently. In recent years, we have seen the rise of graph databases and semi-formal data models such as knowledge graphs to facilitate software approaches to scientific discovery. These approaches extend work based on formalised models, such as the Semantic Web. In this paper, we present our developments to connect, search and share data about genome-scale knowledge networks (GSKN). We have developed a simple application ontology based on OWL/RDF with mappings to standard schemas. We are employing the ontology to power data access services like resolvable URIs, SPARQL endpoints, JSON-LD web APIs and Neo4j-based knowledge graphs. We demonstrate how the proposed ontology and graph databases considerably improve search and access to interoperable and reusable biological knowledge (i.e. the FAIR data principles).
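    One of the access paths mentioned in the abstract is a JSON-LD web API. A minimal JSON-LD document for a hypothetical GSKN node might look as follows; the context mappings, URIs and values are invented for illustration and are not taken from the project's ontology:

```python
# Sketch of a JSON-LD document of the kind a GSKN web API might serve.
# The @context mappings, the node URI and the field values below are
# hypothetical, not taken from the actual application ontology.
import json

doc = {
    "@context": {
        "name": "http://schema.org/name",
        "gene": "http://example.org/gskn/isGene",  # hypothetical term
    },
    "@id": "http://example.org/gskn/node/1",       # hypothetical node URI
    "name": "TraesCS3B02G024900",
    "gene": True,
}

serialized = json.dumps(doc, indent=2)
print(serialized)
```

    The @context block is what turns plain JSON keys into globally resolvable RDF terms, which is how a JSON-LD API and a SPARQL endpoint can expose the same underlying graph.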

  7. Smarter open government data for Society 5.0: analysis of 51 OGD portals

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1more
    Updated Aug 4, 2021
    Cite
    Anastasija Nikiforova (2021). Smarter open government data for Society 5.0: analysis of 51 OGD portals [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5142244
    Explore at:
    Dataset updated
    Aug 4, 2021
    Dataset provided by
    University of Latvia
    Authors
    Anastasija Nikiforova
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains data collected during the study "Smarter open government data for Society 5.0: are your open data smart enough" (Sensors. 2021; 21(15):5204) conducted by Anastasija Nikiforova (University of Latvia). It is being made public both to act as supplementary data for the paper and to allow other researchers to use these data in their own work.

    The data in this dataset were collected by inspecting 60 countries and their OGD portals (a total of 51 OGD portals in May 2021) to find out whether they meet the trends of Society 5.0 and Industry 4.0.

    Each portal was studied starting with a search for data sets of interest, i.e. “real-time”, “sensor” and “covid-19”, followed by a list of additional questions. These questions were formulated on the basis of a combination of (1) crucial open (government) data-related aspects, including open data principles, success factors, recent studies on the topic, the PSI Directive, etc., (2) trends and features of Society 5.0 and Industry 4.0, and (3) elements of the Technology Acceptance Model (TAM) and the Unified Theory of Acceptance and Use of Technology (UTAUT).

    The method used belongs to the typical, daily tasks of open data portal users, sometimes called a “usability test”: keywords related to a research question are used to filter data sets, i.e. “real-time”, “real time”, “sensor”, “covid”, “covid-19”, “corona”, “coronavirus”, “virus”. In most cases, the “real-time”, “sensor” and “covid” keywords were sufficient. For less user-friendly portals, the examination of the respective aspects was adapted to the particular case based on portal or data set specifics, by checking:

    1. Are open data related to the topic in question ({sensor; real-time; Covid-19}) published, i.e. available?
    2. Are these data available in a machine-readable format?
    3. Are these data current, i.e. regularly updated? The currency criterion depends on the nature of the data: Covid-19 data on the number of cases per day is expected to be updated daily, which would not be sufficient for real-time data, as the title supposes.
    4. Is an API provided for these data? This matters most for real-time and sensor data.
    5. Have the data been published in a timely manner? This was verified mainly for Covid-19-related data; timeliness is assessed by comparing the date of the first case identified in a given country with the first release of open data on the topic.
    6. What is the total number of available data sets?
    7. Does the open government data portal provide use cases / showcases?
    8. Does the portal provide insight into the popularity of the data, i.e. statistics such as the number of views, downloads, reuses, rating, etc.?
    9. Is there an opportunity to provide feedback, a comment, a suggestion, or a complaint?
    10. (9a) Is that artifact, i.e. the feedback, comment, suggestion or complaint, visible to other users?
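    The keyword filtering at the core of the “usability test” described above can be sketched as follows; the portal dataset titles are invented, while the keywords come from the methodology:

```python
# Sketch of the keyword-based "usability test" described above: filter a
# portal's dataset titles by the study's search terms. The titles below
# are invented examples; the keywords come from the methodology.
keywords = ["real-time", "real time", "sensor", "covid"]

titles = [
    "Air quality sensor measurements",
    "COVID-19 daily case counts",
    "Annual budget 2020",
    "Real-time river levels",
]

def matches(title, keywords):
    """Case-insensitive substring match against any search keyword."""
    t = title.lower()
    return any(k in t for k in keywords)

hits = [t for t in titles if matches(t, keywords)]
print(hits)   # the three topical datasets are found; the budget is not
```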

    Format of the file .xls, .ods, .csv (for the first spreadsheet only)

    Licenses or restrictions CC-BY

    For more info, see README.txt

  8. USPTO patents data

    • figshare.com
    zip
    Updated Oct 28, 2018
    Cite
    Mofeed Hassan (2018). USPTO patents data [Dataset]. http://doi.org/10.6084/m9.figshare.5970925.v9
    Explore at:
    zip
    Available download formats
    Dataset updated
    Oct 28, 2018
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Mofeed Hassan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A patent is a set of exclusive rights granted to an inventor by a sovereign state for a solution, be it a product or a process, to a particular technological problem. The United States Patent and Trademark Office (USPTO) is the part of the US Department of Commerce that grants patents to businesses and inventors for their inventions, in addition to registering products and identifying intellectual property. Each year, the USPTO grants over 150,000 patents to individuals and companies all over the world. As of December 2011, 8,743,423 patents had been issued and 16,020,302 applications had been received. USPTO patents are accepted in electronic form and are filed as PDF documents. However, the indexing is not perfect and it is cumbersome to search through the PDF documents. Additionally, Google has made all the patents available for download in XML format, albeit only for the years 2002 to 2015. Thus, we converted this bulk of data (spanning 13 years) from XML to RDF to conform to the Linked Data principles.
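    The XML-to-RDF conversion step mentioned above can be sketched as follows; the XML snippet, element names and URI scheme are simplified stand-ins, as the real USPTO bulk files use a much richer schema:

```python
# Sketch of an XML-to-RDF conversion step like the one described above.
# The XML snippet, element names and URI scheme are simplified,
# hypothetical stand-ins for the much richer USPTO bulk-file schema.
import xml.etree.ElementTree as ET

xml_doc = """
<patent>
  <doc-number>US1234567</doc-number>
  <invention-title>Improved widget</invention-title>
</patent>
"""

root = ET.fromstring(xml_doc)
number = root.findtext("doc-number")
title = root.findtext("invention-title")

subject = "http://example.org/patent/" + number   # hypothetical URI scheme
triples = [
    '<%s> <http://purl.org/dc/terms/identifier> "%s" .' % (subject, number),
    '<%s> <http://purl.org/dc/terms/title> "%s" .' % (subject, title),
]
print("\n".join(triples))
```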

  9. Data from: Knowledge graphs in BERD and in NFDI

    • meta4ds.fokus.fraunhofer.de
    pdf, unknown
    Updated Nov 28, 2022
    Cite
    Zenodo (2022). Knowledge graphs in BERD and in NFDI [Dataset]. https://meta4ds.fokus.fraunhofer.de/datasets/oai-zenodo-org-7373258?locale=en
    Explore at:
    pdf (5348053), unknown
    Available download formats
    Dataset updated
    Nov 28, 2022
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Knowledge graphs are able to capture, enrich and disseminate research data objects so that the FAIR and Linked Data principles are fulfilled. How can knowledge graphs improve the domain-specific (BERD) and cross-domain (NFDI) research data infrastructures? The answer is based on the use cases in BERD@NFDI and on the activities of the NFDI working group “Knowledge Graphs”. First, we describe the architecture, knowledge graphs and use cases in BERD@NFDI. Then, we present the NFDI working group “Knowledge Graphs”, its work plan and potential base services.

  10. ckanext-data-depositario

    • catalog.civicdataecosystem.org
    Updated Aug 24, 2025
    Cite
    (2025). ckanext-data-depositario [Dataset]. https://catalog.civicdataecosystem.org/dataset/ckanext-data-depositario
    Explore at:
    Dataset updated
    Aug 24, 2025
    Description

    The ckanext-data-depositario extension customizes CKAN specifically for the depositar research data repository. It contains most of the instance-specific modifications, providing a tailored user experience. Functioning alongside other extensions like ckanext-depositartheme, ckanext-wikidatakeyword, and ckanext-citation, this central extension manages the core customizations required by the depositar instance. Key Features: Core Depositar Customizations: Centralizes the major site-specific modifications for the depositar CKAN instance. This implies handling unique data structures, workflows, or validation rules tailored to the repository's research data focus. Extension Dependency: Operates in conjunction with other specialized extensions, indicating a modular design. This layering enables focused development and maintenance of related features. Theming Support (via ckanext-depositartheme): Integration with a dedicated theming extension allows consistent branding and user interface customization specific to depositar. This ensures the visual identity aligns with the repository's goals. Wikidata Integration (via ckanext-wikidatakeyword): Enables the use of Wikidata for keyword management, which enriches metadata with linked data principles and improves discoverability by linking datasets to Wikidata concepts. Citation Management (via ckanext-citation): Facilitates the display and export of dataset citations, acknowledging the research effort in creating and sharing data. This feature supports academic standards and ensures proper data attribution. Technical Integration: While detailed integration steps are available in the linked documentation, the extension likely uses CKAN's plugin architecture to modify various aspects of the platform. This includes:

  11. Data from: Eagle I

    • dknet.org
    • neuinfo.org
    • +2more
    Updated Jan 29, 2022
    Cite
    (2022). Eagle I [Dataset]. http://identifiers.org/RRID:SCR_013153
    Explore at:
    Dataset updated
    Jan 29, 2022
    Description

    Web application to discover resources available at participating networked universities. This distributed platform for creating and sharing semantically rich data is built around semantic web technologies and follows linked open data principles.

  12. Data from: Supporting Scientometric Studies with Linked Open Data

    • scielo.figshare.com
    jpeg
    Updated May 30, 2023
    Cite
    Sandro Rautenberg; Edgard Marx; Antonio Costa Gomes Filho; Sören Auer (2023). Supporting Scientometric Studies with Linked Open Data [Dataset]. http://doi.org/10.6084/m9.figshare.5931487.v1
    Explore at:
    jpeg
    Available download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    Sandro Rautenberg; Edgard Marx; Antonio Costa Gomes Filho; Sören Auer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT In scientometric studies, measuring scientific indicators is a complex task due to the challenges associated with data collection, organization and linking, especially on the web, where data are distributed across various sources and incompatible formats. These problems can be tackled with technological and methodological techniques based on the Linked Open Data principles. These principles cover a set of best practices from the fields of the Semantic Web and Open Data for organizing, publishing and interlinking data on the Web. With the use of these best practices, the data can be accessed and consumed without restrictions in many applications. This paper addresses the availability of a Qualis historical dataset according to the mentioned principles. In scientometric studies, this effort is important for data reuse, taking into account: measuring the evolution of scientific journals; assisting the production of qualitative and quantitative measures of scientific publications; or obtaining relevant information by interlinking and exploring other scientific indicators. The availability of the Qualis dataset is verified through three use cases. As a result, the Qualis index (historical series 2005-2013) is shared through a web interface for: (i) furthering data reuse and integration; and (ii) supporting the interoperability and computational processability of the available resources.

  13. Data from: INGRIDKG: A FAIR Knowledge Graph of Graffiti

    • data.niaid.nih.gov
    • ris.uni-paderborn.de
    • +1more
    Updated Mar 22, 2023
    + more versions
    Cite
    Ahmed Sherif, Mohamed; Morim da Silva, Ana Alexandra; Pestryakova, Svetlana; Fathi Ahmed, Abdullah; Niemann, Sven; Ngonga Ngomo, Axel-Cyrille (2023). INGRIDKG: A FAIR Knowledge Graph of Graffiti [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7559895
    Explore at:
    Dataset updated
    Mar 22, 2023
    Dataset provided by
    Paderborn University
    Authors
    Ahmed Sherif, Mohamed; Morim da Silva, Ana Alexandra; Pestryakova, Svetlana; Fathi Ahmed, Abdullah; Niemann, Sven; Ngonga Ngomo, Axel-Cyrille
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Graffiti is an urban phenomenon that is increasingly attracting the interest of the sciences. To the best of our knowledge, no suitable data corpora have been available for systematic research until now. The Information System Graffiti in Germany project (INGRID) closes this gap by dealing with graffiti image collections that have been made available to the project for public use. Within INGRID, the graffiti images are collected, digitized and annotated. With this work, we aim to support rapid access to a comprehensive data source on INGRID, targeted especially at researchers. In particular, we present INGRIDKG, an RDF knowledge graph of annotated graffiti that abides by the Linked Data and FAIR principles. We update INGRIDKG weekly by adding the newly annotated graffiti to our knowledge graph. Our generation pipeline applies RDF data conversion, link discovery and data fusion approaches to the original data. The current version of INGRIDKG contains 460,640,154 triples and is linked to 3 other knowledge graphs by over 200,000 links. In our use case studies, we demonstrate the usefulness of our knowledge graph for different applications. INGRIDKG is publicly available under the Creative Commons Attribution 4.0 International license.
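    The link-discovery step mentioned above can be sketched, in a deliberately simplified form, as matching entity labels between two sources and emitting owl:sameAs links. Real pipelines such as INGRIDKG's use far more sophisticated similarity measures; the entities and labels below are invented:

```python
# Deliberately simplified sketch of link discovery between two data
# sources: normalize labels, compare, and emit owl:sameAs triples.
# The URIs and labels are invented; real pipelines use richer
# similarity measures than exact normalized-label equality.
def norm(label):
    """Lowercase and collapse whitespace for a crude label comparison."""
    return " ".join(label.lower().split())

source_a = {"http://example.org/a/1": "Berlin Wall Graffiti"}
source_b = {"http://example.org/b/9": "berlin  wall graffiti"}

links = [
    '<%s> <http://www.w3.org/2002/07/owl#sameAs> <%s> .' % (ua, ub)
    for ua, la in source_a.items()
    for ub, lb in source_b.items()
    if norm(la) == norm(lb)
]
print(links)
```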

  14. Higher Education Institutions in the USA

    • kaggle.com
    zip
    Updated Apr 8, 2023
    Cite
    Jackson Júnior (2023). Higher Education Institutions in the USA [Dataset]. https://www.kaggle.com/datasets/jacksonbarreto/higher-education-institutions-in-the-usa/data
    Explore at:
    zip (35907 bytes)
    Available download formats
    Dataset updated
    Apr 8, 2023
    Authors
    Jackson Júnior
    License

    Public Domain (CC0 1.0): https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United States
    Description

    Higher Education Institutions in the United States of America Dataset

    This repository contains a dataset of higher education institutions in the United States of America. The dataset was compiled in the course of cybersecurity research on American higher education institutions' websites [1]. The data are being made publicly available to promote open science principles [2].

    Data

    The data includes the following fields for each institution:

    • Id: A unique identifier assigned to each institution.
    • Region: The federal state in which the institution is located.
    • Name: The full name of the institution.
    • Category: Indicates whether the institution is public or private.
    • Url: The website of the institution.

    Methodology

    The dataset was obtained from the Integrated Postsecondary Education Data System (IPEDS) website [3], which is administered by the National Center for Education Statistics (NCES). NCES serves as the primary federal entity for collecting and analyzing education-related data in the United States. The data were collected on February 2, 2023.

    The initial list of institutions was derived from the IPEDS database using the following criteria: (1) US institutions only, (2) degree-granting institutions, primarily bachelor's or higher, and (3) industry classification, which includes: public 4-year or above, private not-for-profit 4-year or above, private for-profit 4-year or above, public 2-year, private not-for-profit 2-year, private for-profit 2-year, public less-than-2-year, private not-for-profit less-than-2-year, and private for-profit less-than-2-year.

    The following variables have been added to the list of institutions: Control of the institution, state abbreviation, degree-granting status, Status of the institution, and Institution's internet website address. This resulted in a report with 1,979 institutions.

    The institution's status was labeled with the following values: A (Active), N (New), R (Restored), M (Closed in the current year), C (Combined with another institution), D (Deleted out of business), I (Inactive due to hurricane-related issues), O (Outside IPEDS scope), P (Potential new/add institution), Q (Potential institution reestablishment), W (Potential addition outside IPEDS scope), X (Potential restoration outside the scope of IPEDS) and G (Perfect Children's Campus).

    A filter was applied to the report to retain only institutions with an A, N, or R status, resulting in 1,978 institutions. Finally, a data cleaning process was applied, which involved removing the whitespace at the beginning and end of cell content and duplicate whitespace. The final data were compiled into the dataset included in this repository.
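    The cleaning step described above (trimming leading and trailing whitespace and collapsing duplicate internal whitespace) can be sketched as:

```python
# Sketch of the data cleaning step described above: strip whitespace at
# the beginning and end of each cell and collapse duplicate internal
# whitespace. The row values are invented examples.
def clean_cell(value: str) -> str:
    # str.split() with no argument splits on any whitespace run,
    # so joining with a single space normalizes the cell in one pass.
    return " ".join(value.split())

row = ["  Example   State University ", " Public  ", "https://example.edu "]
cleaned = [clean_cell(c) for c in row]
print(cleaned)   # → ['Example State University', 'Public', 'https://example.edu']
```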

    Usage

    This data is available under the Creative Commons Zero (CC0) license and can be used for any purpose, including academic research purposes. We encourage the sharing of knowledge and the advancement of research in this field by adhering to open science principles [2].

    If you use this data in your research, please cite the source and include a link to this repository. To properly attribute this data, please use the following DOI: 10.5281/zenodo.7614862


    Contribution

    If you have any updates or corrections to the data, please feel free to open a pull request or contact us directly. Let's work together to keep this data accurate and up-to-date.

    Acknowledgment

    We would like to acknowledge the support of the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), within the project "Cybers SeC IP" (NORTE-01-0145-FEDER-000044). This study was also developed as part of the Master in Cybersecurity Program at the Instituto Politécnico de Viana do Castelo, Portugal.

    References

    1. Pending.
    2. S. Bezjak, A. Clyburne-Sherin, P. Conzett, P. Fernandes, E. Görögh, K. Helbig, B. Kramer, I. Labastida, K. Niemeyer, F. Psomopoulos, T. Ross-Hellauer, R. Schneider, J. Tennant, E. Verbakel, H. Brinken, and L. Heller, Open Science Training Handbook. Zenodo, Apr. 2018. [Online]. Available: [https://doi.org/10.5281/zenodo.1212496]
    3. Integrated Postsecondary Education Data System, "Compare Institutions", Feb. 2023. [Online]. Available: https://nces.ed.gov/ipeds/use-the-data
  15.

    Data from: Water Data Explorer

    • hydroshare.org
    • search.dataone.org
    • +1more
    zip
    Updated Nov 13, 2025
    Cite
    Elkin Romero (2025). Water Data Explorer [Dataset]. https://www.hydroshare.org/resource/651e066ac1de434eb7949723143ec154
    Explore at:
    Available download formats: zip (30 bytes)
    Dataset updated
    Nov 13, 2025
    Dataset provided by
    HydroShare
    Authors
    Elkin Romero
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 19, 2025
    Description

    The World Hydrological Observing System (WHOS), operating under the World Meteorological Organization (WMO) Data Policy, serves as a global gateway for the standardized exchange of hydrological, meteorological, and climate-related environmental data. Designed to uphold principles of open access and transparency, WHOS eliminates the need for centralized data storage by dynamically linking users to original data providers—such as national hydrometeorological agencies, research institutions, and monitoring networks—through its advanced Discovery and Access Broker (DAB) technology. This middleware framework harmonizes disparate data formats and protocols (e.g., OGC WaterML 2.0, ISO metadata standards), enabling seamless interoperability across geographic and institutional boundaries. Users gain real-time access to critical datasets, including river discharge, groundwater levels, and precipitation trends, while adhering to strict Terms of Use that prohibit unauthorized commercial exploitation, mandate attribution to source agencies in publications or downstream services, and require acknowledgment of inherent risks (e.g., data latency, sensor inaccuracies).

    The WMO explicitly disclaims liability for decisions or damages arising from data use, emphasizing user responsibility to verify data quality and applicability. Terms are subject to change, potentially altering access permissions or usage rights, necessitating regular policy reviews by stakeholders. By prioritizing decentralized governance and FAIR (Findable, Accessible, Interoperable, Reusable) data principles, WHOS empowers global collaboration in addressing water-related challenges, from transboundary basin management to climate adaptation strategies, while safeguarding data sovereignty and intellectual property rights of contributing entities.

  16.

    Data from: The mechanics of predator-prey interactions: first principles of...

    • datadryad.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Dec 11, 2018
    Cite
    Sebastien Portalier; Gregor Fussmann; Michel Loreau; Mehdi Cherif (2018). The mechanics of predator-prey interactions: first principles of physics predict predator-prey size ratios [Dataset]. http://doi.org/10.5061/dryad.8c40mb0
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 11, 2018
    Dataset provided by
    Dryad
    Authors
    Sebastien Portalier; Gregor Fussmann; Michel Loreau; Mehdi Cherif
    Time period covered
    Nov 22, 2018
    Description

    README
    This file explains all the variables and provides full references for the data in each of the datasets that accompany: Portalier S., Fussmann G. F., Loreau M. & Cherif M., 2018ms, The mechanics of predator-prey interactions: first principles of physics predict predator-prey size ratios.

    Predator-prey species-based data (Portalier_etal_2018_Predator_Prey_Species_Based_Data.csv)
    The file provides average body masses for predators and prey, across a wide range of sizes and different life media.

    Predator-prey individual-based data (Portalier_etal_2018_Predator_Prey_Individual_Based_Data.csv)
    The file provides individual body masses of predators and prey in marine food webs.

  17. Keywords to identify general-purpose databases.

    • plos.figshare.com
    csv
    Updated Nov 18, 2024
    Cite
    Ahmad Sofi-Mahmudi; Eero Raittio; Yeganeh Khazaei; Javed Ashraf; Falk Schwendicke; Sergio E. Uribe; David Moher (2024). Keywords to identify general-purpose databases. [Dataset]. http://doi.org/10.1371/journal.pone.0313991.s004
    Explore at:
    Available download formats: csv
    Dataset updated
    Nov 18, 2024
    Dataset provided by
    PLOShttp://plos.org/
    Authors
    Ahmad Sofi-Mahmudi; Eero Raittio; Yeganeh Khazaei; Javed Ashraf; Falk Schwendicke; Sergio E. Uribe; David Moher
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background
    According to the FAIR principles (Findable, Accessible, Interoperable, and Reusable), scientific research data should be findable, accessible, interoperable, and reusable. The COVID-19 pandemic has led to massive research activities and an unprecedented number of topical publications in a short time. However, no evaluation has assessed whether this COVID-19-related research data has complied with FAIR principles (or FAIRness).

    Objective
    Our objective was to investigate the availability of open data in COVID-19-related research and to assess compliance with FAIRness.

    Methods
    We conducted a comprehensive search and retrieved all open-access articles related to COVID-19 from journals indexed in PubMed, available in the Europe PubMed Central database, published from January 2020 through June 2023, using the metareadr package. Using rtransparent, a validated automated tool, we identified articles with links to their raw data hosted in a public repository. We then screened the links and included only those repositories that contained data specifically for the pertaining paper. Subsequently, we automatically assessed the adherence of the repositories to the FAIR principles using the FAIRsFAIR Research Data Object Assessment Service (F-UJI) and the rfuji package. The FAIR scores ranged from 1–22 and had four components. We reported descriptive analyses for each article type, journal category, and repository. We used linear regression models to find the most influential factors on the FAIRness of the data.

    Results
    5,700 URLs were included in the final analysis, sharing their data in a general-purpose repository. The mean (standard deviation, SD) level of compliance with FAIR metrics was 9.4 (4.88). The percentages of moderate or advanced compliance were as follows: Findability: 100.0%, Accessibility: 21.5%, Interoperability: 46.7%, and Reusability: 61.3%. The overall and component-wise monthly trends were consistent over the follow-up. Reviews (9.80, SD = 5.06, n = 160), articles in dental journals (13.67, SD = 3.51, n = 3), and Harvard Dataverse (15.79, SD = 3.65, n = 244) had the highest mean FAIRness scores, whereas letters (7.83, SD = 4.30, n = 55), articles in neuroscience journals (8.16, SD = 3.73, n = 63), and those deposited in GitHub (4.50, SD = 0.13, n = 2,152) showed the lowest scores. Regression models showed that the repository was the most influential factor on FAIRness scores (R² = 0.809).

    Conclusion
    This paper underscored the potential for improvement across all facets of the FAIR principles, specifically emphasizing Interoperability and Reusability in the data shared within general repositories during the COVID-19 pandemic.
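    The per-repository summary statistics reported above (mean and SD of FAIR scores grouped by repository) can be sketched as follows. This is an illustrative example only: the score values and record structure are invented for demonstration and are not taken from the study's data.

```python
import statistics
from collections import defaultdict

# Hypothetical (invented) FAIR score records; real scores range 1-22.
records = [
    {"repository": "Harvard Dataverse", "fair_score": 14},
    {"repository": "Harvard Dataverse", "fair_score": 18},
    {"repository": "GitHub", "fair_score": 4},
    {"repository": "GitHub", "fair_score": 5},
]

def scores_by_repository(records):
    """Group FAIR scores by repository and report mean, population SD, and n."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["repository"]].append(rec["fair_score"])
    return {
        repo: {
            "mean": statistics.mean(vals),
            "sd": statistics.pstdev(vals),
            "n": len(vals),
        }
        for repo, vals in groups.items()
    }

print(scores_by_repository(records))
```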

  18.

    US EPA WATERS Geoviewer Map Services

    • oregonwaterdata.org
    • hub.arcgis.com
    Updated Feb 14, 2025
    Cite
    Oregon ArcGIS Online (2025). US EPA WATERS Geoviewer Map Services [Dataset]. https://www.oregonwaterdata.org/maps/0aba6097a8c54c26beb6427871bb4753
    Explore at:
    Dataset updated
    Feb 14, 2025
    Dataset authored and provided by
    Oregon ArcGIS Online
    Area covered
    Description

    The EPA Office of Water’s Watershed Assessment, Tracking and Environmental Results system (WATERS) integrates water-related information by linking it to the NHDPlus stream network. The National Hydrography Dataset Plus (NHDPlus) provides the underlying geospatial hydrologic framework that supports a variety of network-based capabilities, including upstream/downstream search and watershed delineation. The WATERS GeoViewer provides easy access to these data and capabilities via the Internet on any desktop or mobile device. It implements the concepts and principles of the Open Water Data Initiative, including the hydrologic Network Linked Data Index.

  19.

    KEES Ontology

    • liveschema.eu
    csv, rdf, ttl
    Updated Dec 17, 2020
    Cite
    Linked Open Vocabulary (2020). KEES Ontology [Dataset]. http://liveschema.eu/dataset/cue/lov_kees
    Explore at:
    Available download formats: ttl, rdf, csv
    Dataset updated
    Dec 17, 2020
    Dataset provided by
    Linked Open Vocabulary
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The KEES (Knowledge Exchange Engine Schema) ontology describes a knowledge base configuration in terms of ABox and TBox statements, together with their accrual and reasoning policies. This vocabulary is designed to drive automatic data ingestion into a graph database according to KEES and Linked (Open) Data principles.

  20.

    Data from: Metadata Standard

    • fairsharing.org
    Updated Jun 28, 2017
    Cite
    University of Oxford, Dept. of Engineering Science, Data Readiness Group (2017). Metadata Standard [Dataset]. https://fairsharing.org/
    Explore at:
    Dataset updated
    Jun 28, 2017
    Dataset authored and provided by
    University of Oxford, Dept. of Engineering Science, Data Readiness Group
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    A manually curated registry of standards, split into three types - Terminology Artifacts (ontologies, e.g. Gene Ontology), Models and Formats (conceptual schema, formats, data models, e.g. FASTA), and Reporting Guidelines (e.g. the ARRIVE guidelines for in vivo animal testing). These are linked to the databases that implement them and the funder and journal publisher data policies that recommend or endorse their use.
