56 datasets found
  1. Data from: Use of Linked Data principles for semantic management of scanned...

    • scielo.figshare.com
    jpeg
    Updated Jun 1, 2023
    Cite
    Luciane Lena Pessanha Monteiro; Mark Douglas de Azevedo Jacyntho (2023). Use of Linked Data principles for semantic management of scanned documents [Dataset]. http://doi.org/10.6084/m9.figshare.7512719.v1
    Explore at:
    jpeg
    Available download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    SciELO journals
    Authors
    Luciane Lena Pessanha Monteiro; Mark Douglas de Azevedo Jacyntho
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The study addresses the use of the Semantic Web and Linked Data principles proposed by the World Wide Web Consortium for the development of a Web application for the semantic management of scanned documents. The main goal is to record scanned documents, describing them in a way that machines can understand and process, filtering content and assisting in searching for such documents when a decision-making process is under way. To this end, machine-understandable metadata, created through the use of reference Linked Data ontologies, are associated with documents, creating a knowledge base. To further enrich the process, (semi)automatic mashup of these metadata with data from the Web of Linked Data is carried out, considerably increasing the scope of the knowledge base and making it possible to extract new data related to the content of stored documents from the Web and combine them, without the user making any effort or perceiving the complexity of the whole process.
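    The annotation workflow described above, in which machine-understandable metadata from reference ontologies are attached to scanned documents, can be sketched with plain triples. A minimal illustration follows; the document URI, the choice of Dublin Core terms, and the values are hypothetical, not taken from the study:

```python
# Minimal sketch: describing a scanned document with Dublin Core-style
# metadata as RDF triples in N-Triples syntax. The document URI and the
# property values below are hypothetical examples, not from the dataset.

DOC = "http://example.org/doc/42"          # hypothetical document URI
DCT = "http://purl.org/dc/terms/"          # Dublin Core terms namespace

triples = [
    (DOC, DCT + "title",   "Invoice 2023-07"),
    (DOC, DCT + "creator", "Accounting Dept."),
    (DOC, DCT + "format",  "image/jpeg"),
]

def to_ntriples(triples):
    """Serialize (subject, predicate, literal) tuples as N-Triples lines."""
    return "\n".join('<%s> <%s> "%s" .' % (s, p, o) for s, p, o in triples)

print(to_ntriples(triples))
```

    In a real system an RDF library and the project's actual reference ontologies would be used; the point here is only that each statement about the document becomes a machine-processable triple in the knowledge base.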

  2. Data from: The FAIR Assessment Conundrum: Reflections on Tools and Metrics -...

    • data.europa.eu
    • data.niaid.nih.gov
    • +1more
    unknown
    Updated Jul 3, 2025
    + more versions
    Cite
    Zenodo (2025). The FAIR Assessment Conundrum: Reflections on Tools and Metrics - Data Set [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-10986748?locale=cs
    Explore at:
    unknown (7186)
    Available download formats
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data sets accompanying the paper "The FAIR Assessment Conundrum: Reflections on Tools and Metrics", an analysis of a comprehensive set of FAIR assessment tools and of the metrics used by these tools for the assessment.

    The data set "metrics.csv" consists of the metrics collected from several sources linked to the analysed FAIR assessment tools. It is structured into 11 columns:

    • tool_id, tool_name: the identifier we assigned to each analysed tool and the tool's full name.
    • metric_discarded: the selection we applied to the collected metrics; metrics created for testing purposes or written in a language other than English were excluded. Boolean; TRUE means the metric was discarded.
    • metric_fairness_scope_declared, metric_fairness_scope_observed: the declared intent of the metric with respect to the FAIR principles assessed, and the intent we actually observed. Possible values: (a) a letter of the FAIR acronym (for metrics without a declared link to a specific FAIR principle), (b) one or more identifiers of FAIR principles (F1, F2, ...), (c) n/a, if no FAIR references were declared, or (d) none, if no FAIR references were observed.
    • metric_id, metric_text: the identifiers of the metrics and their textual, human-oriented content.
    • metric_technology: the technologies (a term used in its widest sense) mentioned or used by the metric for the assessment. These include very diverse typologies, ranging from (meta)data formats to standards, semantic technologies, protocols, and services. For tools implementing automated assessments, the listed technologies also take the available code and documentation into account, not just the metric text.
    • metric_approach: the type of implementation observed in the assessment. Implementation types were identified bottom-up on the metrics grouped by their metric_fairness_scope_declared values; consequently, while the labels used to build the implementation-type strings are the same, their combination and specialisation vary with the characteristics of the metric set analysed. The main labels are: (a) 3rd party service-based, (b) documentation-centred, (c) format-centred, (d) generic, (e) identifier-centred, (f) policy-centred, (g) protocol-centred, (h) metadata element-centred, (i) metadata schema-centred, (j) metadata value-centred, (k) service-centred, and (l) na.
    • provenance, last_accessed_date: the main source of information about each metric (at least with regard to its text) and the date we last accessed it.

    The data set "classified_technologies.csv" consists of the technologies mentioned or used by the metrics for the assessment. It is structured into 3 columns:

    • technology: the name of the technology mentioned or used by the metrics.
    • class: the type of technology. Possible values: (a) application programming interface, (b) format, (c) identifier, (d) library, (e) licence, (f) protocol, (g) query language, (h) registry, (i) repository, (j) search engine, (k) semantic artefact, and (l) service.
    • discarded: records the exclusion of the value 'linked data', which is too generic to count as a technology. Boolean; TRUE means the technology was discarded.
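    As a usage sketch, the metric_discarded flag described above can be applied when loading metrics.csv. The column names follow the description; the sample rows here are invented, and a real analysis would read the actual file:

```python
# Sketch of reading metrics.csv as described above and dropping the
# metrics flagged as discarded. Column names follow the data set
# description; the sample rows are invented for illustration.
import csv
import io

sample = """tool_id,tool_name,metric_discarded,metric_id
T1,ToolOne,FALSE,M1
T1,ToolOne,TRUE,M2
T2,ToolTwo,FALSE,M3
"""

rows = list(csv.DictReader(io.StringIO(sample)))
# Keep only the metrics that were not excluded from the study.
kept = [r for r in rows if r["metric_discarded"] != "TRUE"]
print([r["metric_id"] for r in kept])   # → ['M1', 'M3']
```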

  3. Data from: DATA QUALITY ON THE WEB: INTEGRATIVE REVIEW OF PUBLICATION...

    • scielo.figshare.com
    tiff
    Updated May 30, 2023
    Cite
    Morgana Carneiro de Andrade; Maria José Baños Moreno; Juan-Antonio Pastor-Sánchez (2023). DATA QUALITY ON THE WEB: INTEGRATIVE REVIEW OF PUBLICATION GUIDELINES [Dataset]. http://doi.org/10.6084/m9.figshare.22815541.v1
    Explore at:
    tiff
    Available download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    Morgana Carneiro de Andrade; Maria José Baños Moreno; Juan-Antonio Pastor-Sánchez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT The exponential increase in published data and the diversity of systems require the adoption of good practices to achieve quality indexes that enable discovery, access, and reuse. To identify good practices, an integrative review was used, together with procedures from the ProKnow-C methodology. After applying the ProKnow-C procedures to the documents retrieved from the Web of Science, Scopus and Library, Information Science & Technology Abstracts databases, an analysis of 31 items was performed. This analysis showed that over the last 20 years the guidelines for publishing open government data had a great impact on the implementation of the Linked Data model in several domains, and that currently the FAIR principles and the Data on the Web Best Practices are the most highlighted in the literature. These guidelines provide guidance on various aspects of data publication, helping to optimize quality regardless of the context in which they are applied. The CARE and FACT principles, on the other hand, although not formulated with the same objective as FAIR and the Best Practices, represent great challenges for information and technology scientists regarding ethics, responsibility, confidentiality, impartiality, security, and transparency of data.

  4. LinkedCT

    • neuinfo.org
    • dknet.org
    • +2more
    Updated Oct 17, 2024
    + more versions
    Cite
    (2024). LinkedCT [Dataset]. http://identifiers.org/RRID:SCR_004585
    Explore at:
    Dataset updated
    Oct 17, 2024
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE. Documented on January 11, 2023. The Linked Clinical Trials (LinkedCT) project aims at publishing the first open Semantic Web data source for clinical trials data. The data exposed by LinkedCT are generated by (1) transforming existing data sources of clinical trials into RDF, and (2) discovering links between the records in the trials data and several other data sources. You may download static data dumps. The LinkedCT data space is published according to the principles of publishing Linked Data, which greatly enhance the adaptability and usability of data on the web. Each entity in LinkedCT is identified by a unique, dereferenceable HTTP Uniform Resource Identifier (URI). When the URI is looked up, related RDF statements about the entity are returned in HTML or RDF/XML depending on the user's agent. Moreover, a SPARQL endpoint is provided as the standard access method for RDF data.
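    As an illustration of the dereferenceable-URI pattern described above, a Linked Data client requests RDF instead of HTML via content negotiation. The sketch below only constructs the request without sending it, since the resource is no longer in service; the entity URI is a hypothetical example of the LinkedCT URI pattern:

```python
# Sketch of how a Linked Data client would dereference an entity URI
# with content negotiation. The request is only built, not sent:
# LinkedCT is no longer in service, and the URI below is a hypothetical
# example of its URI pattern.
import urllib.request

entity_uri = "http://linkedct.org/resource/trials/NCT00000102"

req = urllib.request.Request(entity_uri)
# Ask the server for RDF/XML rather than the default HTML rendering.
req.add_header("Accept", "application/rdf+xml")

print(req.get_header("Accept"))
```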

  5. Linked Railway Data Project

    • data.wu.ac.at
    xhtml
    Updated Jul 30, 2016
    Cite
    Linking Open Data (2016). Linked Railway Data Project [Dataset]. https://data.wu.ac.at/schema/datahub_io/NmU5YjExYjAtM2YyMy00Yzk0LWE0ODEtZTFlYmE1MzgwNzUw
    Explore at:
    xhtml
    Available download formats
    Dataset updated
    Jul 30, 2016
    Dataset provided by
    Linking Open Data
    Description

    About

    Bringing together data on the United Kingdom's railway network under linked data principles.

  6. Towards FAIRer Biological Knowledge Networks Using a Hybrid Linked Data and...

    • ckan.grassroots.tools
    html, pdf
    Updated Aug 7, 2019
    Cite
    Rothamsted Research (2019). Towards FAIRer Biological Knowledge Networks Using a Hybrid Linked Data and Graph Database Approach [Dataset]. https://ckan.grassroots.tools/dataset/571131d4-08bf-41cc-ad4a-a6605bd05e37
    Explore at:
    html, pdf
    Available download formats
    Dataset updated
    Aug 7, 2019
    Dataset provided by
    Rothamsted Research
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Abstract: The speed and accuracy of new scientific discoveries, be it by humans or artificial intelligence, depend on the quality of the underlying data and on the technology to connect, search and share the data efficiently. In recent years, we have seen the rise of graph databases and semi-formal data models such as knowledge graphs to facilitate software approaches to scientific discovery. These approaches extend work based on formalised models, such as the Semantic Web. In this paper, we present our developments to connect, search and share data about genome-scale knowledge networks (GSKN). We have developed a simple application ontology based on OWL/RDF with mappings to standard schemas. We are employing the ontology to power data access services like resolvable URIs, SPARQL endpoints, JSON-LD web APIs and Neo4j-based knowledge graphs. We demonstrate how the proposed ontology and graph databases considerably improve search and access to interoperable and reusable biological knowledge (i.e. the FAIR data principles).
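    One of the access paths mentioned in the abstract is a JSON-LD web API. A minimal JSON-LD document for a hypothetical GSKN node might look as follows; the context mappings, URIs and values are invented for illustration and are not taken from the project's ontology:

```python
# Sketch of a JSON-LD document of the kind a GSKN web API might serve.
# The @context mappings, the node URI and the field values below are
# hypothetical, not taken from the actual application ontology.
import json

doc = {
    "@context": {
        "name": "http://schema.org/name",
        "gene": "http://example.org/gskn/isGene",  # hypothetical term
    },
    "@id": "http://example.org/gskn/node/1",       # hypothetical node URI
    "name": "TraesCS3B02G024900",
    "gene": True,
}

serialized = json.dumps(doc, indent=2)
print(serialized)
```

    The @context block is what turns plain JSON keys into globally resolvable RDF terms, which is how a JSON-LD API and a SPARQL endpoint can expose the same underlying graph.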

  7. Smarter open government data for Society 5.0: analysis of 51 OGD portals

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1more
    Updated Aug 4, 2021
    Cite
    Anastasija Nikiforova (2021). Smarter open government data for Society 5.0: analysis of 51 OGD portals [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5142244
    Explore at:
    Dataset updated
    Aug 4, 2021
    Dataset provided by
    University of Latvia
    Authors
    Anastasija Nikiforova
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains data collected during the study "Smarter open government data for Society 5.0: are your open data smart enough" (Sensors. 2021; 21(15):5204) conducted by Anastasija Nikiforova (University of Latvia). It is being made public both to act as supplementary data for the paper and to allow other researchers to use these data in their own work.

    The data in this dataset were collected by inspecting 60 countries and their OGD portals (a total of 51 OGD portals in May 2021) to find out whether they meet the trends of Society 5.0 and Industry 4.0.

    Each portal was studied starting with a search for data sets of interest, i.e. “real-time”, “sensor” and “covid-19”, followed by a list of additional questions. These questions were formulated on the basis of a combination of (1) crucial open (government) data-related aspects, including open data principles, success factors, recent studies on the topic, the PSI Directive, etc., (2) trends and features of Society 5.0 and Industry 4.0, and (3) elements of the Technology Acceptance Model (TAM) and the Unified Theory of Acceptance and Use of Technology (UTAUT).

    The method used belongs to the typical, daily tasks of open data portal users, sometimes called a “usability test”: keywords related to a research question are used to filter data sets, i.e. “real-time”, “real time”, “sensor”, “covid”, “covid-19”, “corona”, “coronavirus”, “virus”. In most cases, the “real-time”, “sensor” and “covid” keywords were sufficient. For less user-friendly portals, the examination of the respective aspects was adapted to the particular case based on portal or data set specifics, by checking:

    1. Are open data related to the topic in question ({sensor; real-time; Covid-19}) published, i.e. available?
    2. Are these data available in a machine-readable format?
    3. Are these data current, i.e. regularly updated? The currency criterion depends on the nature of the data: Covid-19 data on the number of cases per day is expected to be updated daily, which would not be sufficient for real-time data, as the title supposes.
    4. Is an API provided for these data? This matters most for real-time and sensor data.
    5. Have the data been published in a timely manner? This was verified mainly for Covid-19-related data; timeliness is assessed by comparing the date of the first case identified in a given country with the first release of open data on the topic.
    6. What is the total number of available data sets?
    7. Does the open government data portal provide use cases / showcases?
    8. Does the portal provide insight into the popularity of the data, i.e. statistics such as the number of views, downloads, reuses, rating, etc.?
    9. Is there an opportunity to provide feedback, a comment, a suggestion, or a complaint?
    10. (9a) Is that artifact, i.e. the feedback, comment, suggestion or complaint, visible to other users?
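    The keyword filtering at the core of the “usability test” described above can be sketched as follows; the portal dataset titles are invented, while the keywords come from the methodology:

```python
# Sketch of the keyword-based "usability test" described above: filter a
# portal's dataset titles by the study's search terms. The titles below
# are invented examples; the keywords come from the methodology.
keywords = ["real-time", "real time", "sensor", "covid"]

titles = [
    "Air quality sensor measurements",
    "COVID-19 daily case counts",
    "Annual budget 2020",
    "Real-time river levels",
]

def matches(title, keywords):
    """Case-insensitive substring match against any search keyword."""
    t = title.lower()
    return any(k in t for k in keywords)

hits = [t for t in titles if matches(t, keywords)]
print(hits)   # the three topical datasets are found; the budget is not
```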

    Format of the file .xls, .ods, .csv (for the first spreadsheet only)

    Licenses or restrictions CC-BY

    For more info, see README.txt

  8. USPTO patents data

    • figshare.com
    zip
    Updated Oct 28, 2018
    Cite
    Mofeed Hassan (2018). USPTO patents data [Dataset]. http://doi.org/10.6084/m9.figshare.5970925.v9
    Explore at:
    zip
    Available download formats
    Dataset updated
    Oct 28, 2018
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Mofeed Hassan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A patent is a set of exclusive rights granted to an inventor by a sovereign state for a solution, be it a product or a process, to a particular technological problem. The United States Patent and Trademark Office (USPTO) is the part of the US Department of Commerce that grants patents to businesses and inventors for their inventions, in addition to registering products and identifying intellectual property. Each year, the USPTO grants over 150,000 patents to individuals and companies all over the world. As of December 2011, 8,743,423 patents had been issued and 16,020,302 applications had been received. USPTO patents are accepted in electronic form and are filed as PDF documents. However, the indexing is not perfect and it is cumbersome to search through the PDF documents. Additionally, Google has made all the patents available for download in XML format, albeit only for the years 2002 to 2015. Thus, we converted this bulk of data (spanning 13 years) from XML to RDF to conform to the Linked Data principles.
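    The XML-to-RDF conversion step mentioned above can be sketched as follows; the XML snippet, element names and URI scheme are simplified stand-ins, as the real USPTO bulk files use a much richer schema:

```python
# Sketch of an XML-to-RDF conversion step like the one described above.
# The XML snippet, element names and URI scheme are simplified,
# hypothetical stand-ins for the much richer USPTO bulk-file schema.
import xml.etree.ElementTree as ET

xml_doc = """
<patent>
  <doc-number>US1234567</doc-number>
  <invention-title>Improved widget</invention-title>
</patent>
"""

root = ET.fromstring(xml_doc)
number = root.findtext("doc-number")
title = root.findtext("invention-title")

subject = "http://example.org/patent/" + number   # hypothetical URI scheme
triples = [
    '<%s> <http://purl.org/dc/terms/identifier> "%s" .' % (subject, number),
    '<%s> <http://purl.org/dc/terms/title> "%s" .' % (subject, title),
]
print("\n".join(triples))
```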

  9. Data from: Knowledge graphs in BERD and in NFDI

    • meta4ds.fokus.fraunhofer.de
    pdf, unknown
    Updated Nov 28, 2022
    Cite
    Zenodo (2022). Knowledge graphs in BERD and in NFDI [Dataset]. https://meta4ds.fokus.fraunhofer.de/datasets/oai-zenodo-org-7373258?locale=en
    Explore at:
    pdf (5348053), unknown
    Available download formats
    Dataset updated
    Nov 28, 2022
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Knowledge graphs are able to capture, enrich and disseminate research data objects so that the FAIR and Linked Data principles are fulfilled. How can knowledge graphs improve the domain-specific (BERD) and cross-domain (NFDI) research data infrastructures? The answer is based on the use cases in BERD@NFDI and on the activities of the NFDI working group “Knowledge Graphs”. First, we describe the architecture, knowledge graphs and use cases in BERD@NFDI. Then, we present the NFDI working group “Knowledge Graphs”, its work plan and potential base services.

  10. ckanext-data-depositario

    • catalog.civicdataecosystem.org
    Updated Aug 24, 2025
    Cite
    (2025). ckanext-data-depositario [Dataset]. https://catalog.civicdataecosystem.org/dataset/ckanext-data-depositario
    Explore at:
    Dataset updated
    Aug 24, 2025
    Description

    The ckanext-data-depositario extension customizes CKAN specifically for the depositar research data repository. It contains most of the instance-specific modifications, providing a tailored user experience. Functioning alongside other extensions like ckanext-depositartheme, ckanext-wikidatakeyword, and ckanext-citation, this central extension manages the core customizations required by the depositar instance. Key Features: Core Depositar Customizations: Centralizes the major site-specific modifications for the depositar CKAN instance. This implies handling unique data structures, workflows, or validation rules tailored to the repository's research data focus. Extension Dependency: Operates in conjunction with other specialized extensions, indicating a modular design. This layering enables focused development and maintenance of related features. Theming Support (via ckanext-depositartheme): Integration with a dedicated theming extension allows consistent branding and user interface customization specific to depositar. This ensures the visual identity aligns with the repository's goals. Wikidata Integration (via ckanext-wikidatakeyword): Enables the use of Wikidata for keyword management, which enriches metadata with linked data principles and improves discoverability by linking datasets to Wikidata concepts. Citation Management (via ckanext-citation): Facilitates the display and export of dataset citations, acknowledging the research effort in creating and sharing data. This feature supports academic standards and ensures proper data attribution. Technical Integration: While detailed integration steps are available in the linked documentation, the extension likely uses CKAN's plugin architecture to modify various aspects of the platform. This includes:

  11. Data from: Eagle I

    • dknet.org
    • neuinfo.org
    • +2more
    Updated Jan 29, 2022
    Cite
    (2022). Eagle I [Dataset]. http://identifiers.org/RRID:SCR_013153
    Explore at:
    Dataset updated
    Jan 29, 2022
    Description

    Web application to discover resources available at participating networked universities. This distributed platform for creating and sharing semantically rich data is built around semantic web technologies and follows linked open data principles.

  12. Data from: Supporting Scientometric Studies with Linked Open Data

    • scielo.figshare.com
    jpeg
    Updated May 30, 2023
    Cite
    Sandro Rautenberg; Edgard Marx; Antonio Costa Gomes Filho; Sören Auer (2023). Supporting Scientometric Studies with Linked Open Data [Dataset]. http://doi.org/10.6084/m9.figshare.5931487.v1
    Explore at:
    jpeg
    Available download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    Sandro Rautenberg; Edgard Marx; Antonio Costa Gomes Filho; Sören Auer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT In scientometric studies, measuring scientific indicators is a complex task due to the challenges associated with data collection, organization and linking, especially on the web, where data are distributed across various sources and incompatible formats. These problems can be tackled with technological and methodological techniques based on the Linked Open Data principles. These principles cover a set of best practices from the fields of the Semantic Web and Open Data for organizing, publishing and interlinking data on the Web. With the use of these best practices, the data can be accessed and consumed without restrictions in many applications. This paper addresses the availability of a Qualis historical dataset according to the mentioned principles. In scientometric studies, this effort is important for data reuse, taking into account: measuring the evolution of scientific journals; assisting the production of qualitative and quantitative measures of scientific publications; or obtaining relevant information by interlinking and exploring other scientific indicators. The availability of the Qualis dataset is verified through three use cases. As a result, the Qualis index (historical series 2005-2013) is shared through a web interface for: (i) furthering data reuse and integration; and (ii) supporting the interoperability and computational processability of the available resources.

  13. Data from: INGRIDKG: A FAIR Knowledge Graph of Graffiti

    • data.niaid.nih.gov
    • ris.uni-paderborn.de
    • +1more
    Updated Mar 22, 2023
    + more versions
    Cite
    Ahmed Sherif, Mohamed; Morim da Silva, Ana Alexandra; Pestryakova, Svetlana; Fathi Ahmed, Abdullah; Niemann, Sven; Ngonga Ngomo, Axel-Cyrille (2023). INGRIDKG: A FAIR Knowledge Graph of Graffiti [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7559895
    Explore at:
    Dataset updated
    Mar 22, 2023
    Dataset provided by
    Paderborn University
    Authors
    Ahmed Sherif, Mohamed; Morim da Silva, Ana Alexandra; Pestryakova, Svetlana; Fathi Ahmed, Abdullah; Niemann, Sven; Ngonga Ngomo, Axel-Cyrille
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Graffiti is an urban phenomenon that is increasingly attracting the interest of the sciences. To the best of our knowledge, no suitable data corpora have been available for systematic research until now. The Information System Graffiti in Germany project (INGRID) closes this gap by dealing with graffiti image collections that have been made available to the project for public use. Within INGRID, the graffiti images are collected, digitized and annotated. With this work, we aim to support rapid access to a comprehensive data source on INGRID, targeted especially at researchers. In particular, we present INGRIDKG, an RDF knowledge graph of annotated graffiti that abides by the Linked Data and FAIR principles. We update INGRIDKG weekly by adding the newly annotated graffiti to our knowledge graph. Our generation pipeline applies RDF data conversion, link discovery and data fusion approaches to the original data. The current version of INGRIDKG contains 460,640,154 triples and is linked to 3 other knowledge graphs by over 200,000 links. In our use case studies, we demonstrate the usefulness of our knowledge graph for different applications. INGRIDKG is publicly available under the Creative Commons Attribution 4.0 International license.
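    The link-discovery step mentioned above can be sketched, in a deliberately simplified form, as matching entity labels between two sources and emitting owl:sameAs links. Real pipelines such as INGRIDKG's use far more sophisticated similarity measures; the entities and labels below are invented:

```python
# Deliberately simplified sketch of link discovery between two data
# sources: normalize labels, compare, and emit owl:sameAs triples.
# The URIs and labels are invented; real pipelines use richer
# similarity measures than exact normalized-label equality.
def norm(label):
    """Lowercase and collapse whitespace for a crude label comparison."""
    return " ".join(label.lower().split())

source_a = {"http://example.org/a/1": "Berlin Wall Graffiti"}
source_b = {"http://example.org/b/9": "berlin  wall graffiti"}

links = [
    '<%s> <http://www.w3.org/2002/07/owl#sameAs> <%s> .' % (ua, ub)
    for ua, la in source_a.items()
    for ub, lb in source_b.items()
    if norm(la) == norm(lb)
]
print(links)
```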

  14. Higher Education Institutions in the USA

    • kaggle.com
    zip
    Updated Apr 8, 2023
    Cite
    Jackson Júnior (2023). Higher Education Institutions in the USA [Dataset]. https://www.kaggle.com/datasets/jacksonbarreto/higher-education-institutions-in-the-usa/data
    Explore at:
    zip (35907 bytes)
    Available download formats
    Dataset updated
    Apr 8, 2023
    Authors
    Jackson Júnior
    License

    Public Domain (CC0 1.0): https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United States
    Description

    Higher Education Institutions in the United States of America Dataset

    This repository contains a dataset of higher education institutions in the United States of America. The dataset was compiled in the course of cybersecurity research on American higher education institutions' websites [1]. The data are being made publicly available to promote open science principles [2].

    Data

    The data includes the following fields for each institution:

    • Id: A unique identifier assigned to each institution.
    • Region: The federal state in which the institution is located.
    • Name: The full name of the institution.
    • Category: Indicates whether the institution is public or private.
    • Url: The website of the institution.

    Methodology

    The dataset was obtained from the Integrated Postsecondary Education Data System (IPEDS) website [3], which is administered by the National Center for Education Statistics (NCES). NCES serves as the primary federal entity for collecting and analyzing education-related data in the United States. The data were collected on February 2, 2023.

    The initial list of institutions was derived from the IPEDS database using the following criteria: (1) US institutions only, (2) degree-granting institutions, primarily bachelor's or higher, and (3) industry classification, which includes: public 4-year or above, private not-for-profit 4-year or above, private for-profit 4-year or above, public 2-year, private not-for-profit 2-year, private for-profit 2-year, public less-than-2-year, private not-for-profit less-than-2-year, and private for-profit less-than-2-year.

    The following variables have been added to the list of institutions: Control of the institution, state abbreviation, degree-granting status, Status of the institution, and Institution's internet website address. This resulted in a report with 1,979 institutions.

    The institution's status was labeled with the following values: A (Active), N (New), R (Restored), M (Closed in the current year), C (Combined with another institution), D (Deleted out of business), I (Inactive due to hurricane-related issues), O (Outside IPEDS scope), P (Potential new/add institution), Q (Potential institution reestablishment), W (Potential addition outside IPEDS scope), X (Potential restoration outside the scope of IPEDS) and G (Perfect Children's Campus).

    A filter was applied to the report to retain only institutions with an A, N, or R status, resulting in 1,978 institutions. Finally, a data cleaning process was applied, which involved removing the whitespace at the beginning and end of cell content and duplicate whitespace. The final data were compiled into the dataset included in this repository.
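    The cleaning step described above (trimming leading and trailing whitespace and collapsing duplicate internal whitespace) can be sketched as:

```python
# Sketch of the data cleaning step described above: strip whitespace at
# the beginning and end of each cell and collapse duplicate internal
# whitespace. The row values are invented examples.
def clean_cell(value: str) -> str:
    # str.split() with no argument splits on any whitespace run,
    # so joining with a single space normalizes the cell in one pass.
    return " ".join(value.split())

row = ["  Example   State University ", " Public  ", "https://example.edu "]
cleaned = [clean_cell(c) for c in row]
print(cleaned)   # → ['Example State University', 'Public', 'https://example.edu']
```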

    Usage

    This data is available under the Creative Commons Zero (CC0) license and can be used for any purpose, including academic research purposes. We encourage the sharing of knowledge and the advancement of research in this field by adhering to open science principles [2].

    If you use this data in your research, please cite the source and include a link to this repository. To properly attribute this data, please use the following DOI: 10.5281/zenodo.7614862


    Contribution

    If you have any updates or corrections to the data, please feel free to open a pull request or contact us directly. Let's work together to keep this data accurate and up-to-date.

    Acknowledgment

    We would like to acknowledge the support of the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), within the project "Cybers SeC IP" (NORTE-01-0145-FEDER-000044). This study was also developed as part of the Master in Cybersecurity Program at the Instituto Politécnico de Viana do Castelo, Portugal.

    References

    1. Pending.
    2. S. Bezjak, A. Clyburne-Sherin, P. Conzett, P. Fernandes, E. Görögh, K. Helbig, B. Kramer, I. Labastida, K. Niemeyer, F. Psomopoulos, T. Ross-Hellauer, R. Schneider, J. Tennant, E. Verbakel, H. Brinken, and L. Heller, Open Science Training Handbook. Zenodo, Apr. 2018. [Online]. Available: [https://doi.org/10.5281/zenodo.1212496]
    3. Integrated Postsecondary Education Data System, "Compare Institutions", Feb. 2023. [Online]. Available: https://nces.ed.gov/ipeds/use-the-data
  15.

    Data from: Water Data Explorer

    • hydroshare.org
    • search.dataone.org
    • +1more
    zip
    Updated Nov 13, 2025
    Cite
    Elkin Romero (2025). Water Data Explorer [Dataset]. https://www.hydroshare.org/resource/651e066ac1de434eb7949723143ec154
    Explore at:
    Available download formats: zip (30 bytes)
    Dataset updated
    Nov 13, 2025
    Dataset provided by
    HydroShare
    Authors
    Elkin Romero
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 19, 2025
    Description

    The World Hydrological Observing System (WHOS), operating under the World Meteorological Organization (WMO) Data Policy, serves as a global gateway for the standardized exchange of hydrological, meteorological, and climate-related environmental data. Designed to uphold principles of open access and transparency, WHOS eliminates the need for centralized data storage by dynamically linking users to original data providers—such as national hydrometeorological agencies, research institutions, and monitoring networks—through its advanced Discovery and Access Broker (DAB) technology. This middleware framework harmonizes disparate data formats and protocols (e.g., OGC WaterML 2.0, ISO metadata standards), enabling seamless interoperability across geographic and institutional boundaries. Users gain real-time access to critical datasets, including river discharge, groundwater levels, and precipitation trends, while adhering to strict Terms of Use that prohibit unauthorized commercial exploitation, mandate attribution to source agencies in publications or downstream services, and require acknowledgment of inherent risks (e.g., data latency, sensor inaccuracies).

    The WMO explicitly disclaims liability for decisions or damages arising from data use, emphasizing user responsibility to verify data quality and applicability. Terms are subject to change, potentially altering access permissions or usage rights, necessitating regular policy reviews by stakeholders. By prioritizing decentralized governance and FAIR (Findable, Accessible, Interoperable, Reusable) data principles, WHOS empowers global collaboration in addressing water-related challenges, from transboundary basin management to climate adaptation strategies, while safeguarding data sovereignty and intellectual property rights of contributing entities.

  16.

    Data from: The mechanics of predator-prey interactions: first principles of...

    • datadryad.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Dec 11, 2018
    Cite
    Sebastien Portalier; Gregor Fussmann; Michel Loreau; Mehdi Cherif (2018). The mechanics of predator-prey interactions: first principles of physics predict predator-prey size ratios [Dataset]. http://doi.org/10.5061/dryad.8c40mb0
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 11, 2018
    Dataset provided by
    Dryad
    Authors
    Sebastien Portalier; Gregor Fussmann; Michel Loreau; Mehdi Cherif
    Time period covered
    Nov 22, 2018
    Description

    README
    This file explains all the variables and provides full references for the data in each of the datasets that accompany: Portalier S., Fussmann G. F., Loreau M. & Cherif M., 2018ms, The mechanics of predator-prey interactions: first principles of physics predict predator-prey size ratios.

    Predator-prey species-based data (Portalier_etal_2018_Predator_Prey_Species_Based_Data.csv)
    The file provides average body masses for predators and prey, across a wide range of sizes and different life media.

    Predator-prey individual-based data (Portalier_etal_2018_Predator_Prey_Individual_Based_Data.csv)
    The file provides individual body masses of predators and prey in marine food webs.

  17. Keywords to identify general-purpose databases.

    • plos.figshare.com
    csv
    Updated Nov 18, 2024
    Cite
    Ahmad Sofi-Mahmudi; Eero Raittio; Yeganeh Khazaei; Javed Ashraf; Falk Schwendicke; Sergio E. Uribe; David Moher (2024). Keywords to identify general-purpose databases. [Dataset]. http://doi.org/10.1371/journal.pone.0313991.s004
    Explore at:
    Available download formats: csv
    Dataset updated
    Nov 18, 2024
    Dataset provided by
    PLOShttp://plos.org/
    Authors
    Ahmad Sofi-Mahmudi; Eero Raittio; Yeganeh Khazaei; Javed Ashraf; Falk Schwendicke; Sergio E. Uribe; David Moher
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background
    According to the FAIR principles (Findable, Accessible, Interoperable, and Reusable), scientific research data should be findable, accessible, interoperable, and reusable. The COVID-19 pandemic has led to massive research activities and an unprecedented number of topical publications in a short time. However, no evaluation has assessed whether this COVID-19-related research data has complied with FAIR principles (or FAIRness).

    Objective
    Our objective was to investigate the availability of open data in COVID-19-related research and to assess compliance with FAIRness.

    Methods
    We conducted a comprehensive search and retrieved all open-access articles related to COVID-19 from journals indexed in PubMed, available in the Europe PubMed Central database, published from January 2020 through June 2023, using the metareadr package. Using rtransparent, a validated automated tool, we identified articles with links to their raw data hosted in a public repository. We then screened the links and included only those repositories that contained data specifically for the pertaining paper. Subsequently, we automatically assessed the adherence of the repositories to the FAIR principles using the FAIRsFAIR Research Data Object Assessment Service (F-UJI) and the rfuji package. The FAIR scores ranged from 1–22 and had four components. We reported descriptive analyses for each article type, journal category, and repository. We used linear regression models to find the most influential factors on the FAIRness of the data.

    Results
    5,700 URLs were included in the final analysis, sharing their data in a general-purpose repository. The mean (standard deviation, SD) level of compliance with FAIR metrics was 9.4 (4.88). The percentages of moderate or advanced compliance were as follows: Findability: 100.0%, Accessibility: 21.5%, Interoperability: 46.7%, and Reusability: 61.3%. The overall and component-wise monthly trends were consistent over the follow-up. Reviews (9.80, SD = 5.06, n = 160), articles in dental journals (13.67, SD = 3.51, n = 3), and Harvard Dataverse (15.79, SD = 3.65, n = 244) had the highest mean FAIRness scores, whereas letters (7.83, SD = 4.30, n = 55), articles in neuroscience journals (8.16, SD = 3.73, n = 63), and those deposited in GitHub (4.50, SD = 0.13, n = 2,152) showed the lowest scores. Regression models showed that the repository was the most influential factor on FAIRness scores (R² = 0.809).

    Conclusion
    This paper underscored the potential for improvement across all facets of the FAIR principles, specifically emphasizing Interoperability and Reusability in the data shared within general repositories during the COVID-19 pandemic.
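    The per-repository summary statistics reported above (mean and SD of FAIR scores grouped by repository) can be sketched as follows. This is an illustrative example only: the score values and record structure are invented for demonstration and are not taken from the study's data.

```python
import statistics
from collections import defaultdict

# Hypothetical (invented) FAIR score records; real scores range 1-22.
records = [
    {"repository": "Harvard Dataverse", "fair_score": 14},
    {"repository": "Harvard Dataverse", "fair_score": 18},
    {"repository": "GitHub", "fair_score": 4},
    {"repository": "GitHub", "fair_score": 5},
]

def scores_by_repository(records):
    """Group FAIR scores by repository and report mean, population SD, and n."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["repository"]].append(rec["fair_score"])
    return {
        repo: {
            "mean": statistics.mean(vals),
            "sd": statistics.pstdev(vals),
            "n": len(vals),
        }
        for repo, vals in groups.items()
    }

print(scores_by_repository(records))
```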

  18.

    US EPA WATERS Geoviewer Map Services

    • oregonwaterdata.org
    • hub.arcgis.com
    Updated Feb 14, 2025
    Cite
    Oregon ArcGIS Online (2025). US EPA WATERS Geoviewer Map Services [Dataset]. https://www.oregonwaterdata.org/maps/0aba6097a8c54c26beb6427871bb4753
    Explore at:
    Dataset updated
    Feb 14, 2025
    Dataset authored and provided by
    Oregon ArcGIS Online
    Area covered
    Description

    The EPA Office of Water’s Watershed Assessment, Tracking and Environmental Results system (WATERS) integrates water-related information by linking it to the NHDPlus stream network. The National Hydrography Dataset Plus (NHDPlus) provides the underlying geospatial hydrologic framework that supports a variety of network-based capabilities, including upstream/downstream search and watershed delineation. The WATERS GeoViewer provides easy access to these data and capabilities via the Internet on any desktop or mobile device. It implements the concepts and principles of the Open Water Data Initiative, including the hydrologic Network Linked Data Index.

  19.

    KEES Ontology

    • liveschema.eu
    csv, rdf, ttl
    Updated Dec 17, 2020
    Cite
    Linked Open Vocabulary (2020). KEES Ontology [Dataset]. http://liveschema.eu/dataset/cue/lov_kees
    Explore at:
    Available download formats: ttl, rdf, csv
    Dataset updated
    Dec 17, 2020
    Dataset provided by
    Linked Open Vocabulary
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The KEES (Knowledge Exchange Engine Schema) ontology describes a knowledge base configuration in terms of ABox and TBox statements, together with their accrual and reasoning policies. This vocabulary is designed to drive automatic data ingestion into a graph database according to KEES and Linked (Open) Data principles.

  20.

    Data from: Metadata Standard

    • fairsharing.org
    Updated Jun 28, 2017
    Cite
    University of Oxford, Dept. of Engineering Science, Data Readiness Group (2017). Metadata Standard [Dataset]. https://fairsharing.org/
    Explore at:
    Dataset updated
    Jun 28, 2017
    Dataset authored and provided by
    University of Oxford, Dept. of Engineering Science, Data Readiness Group
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    A manually curated registry of standards, split into three types - Terminology Artifacts (ontologies, e.g. Gene Ontology), Models and Formats (conceptual schema, formats, data models, e.g. FASTA), and Reporting Guidelines (e.g. the ARRIVE guidelines for in vivo animal testing). These are linked to the databases that implement them and the funder and journal publisher data policies that recommend or endorse their use.
