CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
EASY metadata export to LOD in the RDF 1.1 Turtle format. Tested with the Timbuctoo repository (v5.0.0-pre8) developed and maintained by Huygens ING.
https://borealisdata.ca/api/datasets/:persistentId/versions/2.1/customlicense?persistentId=doi:10.7939/DVN/URXSGC
This is a set of RDF/XML metadata for cultural heritage resources from institutions that participated in the Out of the Trenches proof-of-concept project. More details on the project can be found at http://www.canadiana.ca/pcdhn-lod. PCDHN-LOD-2012.7z contains the entire set of data, including descriptive data for the resources as well as RDF expressions of entities (events, concepts, organizations, persons, and works).
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Current approaches to identifying drug-drug interactions (DDIs), which involve clinical evaluation of drugs and post-marketing surveillance, are unable to provide complete, accurate information, nor do they alert the public to potentially dangerous DDIs before the drugs reach the market. Predicting potential drug-drug interactions helps reduce unanticipated drug interactions, lowers drug development costs, and optimizes the drug design process. Many bioinformatics databases have begun to publish their data as Linked Open Data (LOD), a graph data model, using Semantic Web technologies. Such knowledge graphs provide a powerful model for defining the data and also make it possible to use the underlying graph structure to extract meaningful information. In this work, we applied Knowledge Graph (KG) embedding approaches to extract feature vector representations of drugs from LOD in order to predict potential drug-drug interactions. We investigated the effect of different embedding methods on DDI prediction and showed that knowledge graph embeddings are powerful predictors, comparable to current state-of-the-art methods for inferring new DDIs. We applied Logistic Regression, Naive Bayes, and Random Forest to the DrugBank KG with traditional 10-fold cross-validation (CV), using RDF2Vec, TransE, and TransD embeddings. RDF2Vec with uniform weighting surpasses the other embedding methods.
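The abstract above does not specify exactly how two drug embeddings are combined into one classifier input, so the following is only an illustrative sketch: it concatenates two hypothetical embedding vectors into a pair feature (a common featurization for candidate drug pairs) and computes their cosine similarity. The drug identifiers and embedding values are invented; real vectors would come from RDF2Vec, TransE, or TransD trained on the DrugBank knowledge graph.

```python
import math

# Hypothetical 4-dimensional embeddings for two drugs (values made up).
embeddings = {
    "drugbank:DB00001": [0.12, -0.40, 0.33, 0.08],
    "drugbank:DB00002": [-0.05, 0.21, 0.47, -0.19],
}

def pair_features(drug_a, drug_b):
    """Concatenate two drug embeddings into one feature vector,
    one common way to featurize a candidate drug pair for a
    binary DDI classifier."""
    return embeddings[drug_a] + embeddings[drug_b]

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm

features = pair_features("drugbank:DB00001", "drugbank:DB00002")
print(len(features))  # 8: two 4-dimensional embeddings concatenated
```

The resulting pair vectors would then be fed to a classifier such as Logistic Regression under the cross-validation protocol described above.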
http://dcat-ap.ch/vocabulary/licenses/terms_open
Swissbib gathers metadata from more than 900 libraries in Switzerland and therefore contains descriptions of more than 25 million documents. The metadata describing these documents has been transformed into a semantic format based on the RDF model (Resource Description Framework). The data is linked with other datasets on the web (VIAF, GND, DBpedia).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LetterSampo correspSearch Knowledge Graph
The documentation of LetterSampo correspSearch dataset is available on Linked Data Finland page and explained in more detail on the project page.
This dataset is available on a public SPARQL endpoint (https://ldf.fi/corresp/sparql).
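The endpoint URL above comes from the dataset description; the query below is just a generic example, not one specific to the LetterSampo data model. A minimal sketch of building an HTTP GET request for that SPARQL endpoint:

```python
from urllib.parse import urlencode

# Public SPARQL endpoint from the dataset description.
ENDPOINT = "https://ldf.fi/corresp/sparql"

# A generic example query (not specific to this dataset's schema).
query = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"

# Encode the query and the desired result format as GET parameters.
params = urlencode({"query": query, "format": "application/sparql-results+json"})
request_url = f"{ENDPOINT}?{params}"
print(request_url)
```

The resulting URL can then be fetched with `urllib.request.urlopen(request_url)` or any HTTP client; SPARQL endpoints typically return JSON, XML, or CSV result serializations.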
This is a collection item referencing the following EPA Linked Data resources:
- EPA Facility Registry Service (FRS)
- EPA Substance Registry Service (SRS)
- Resource Conservation and Recovery Act Handlers (RCRA)
- Toxic Substance Control Act Chemical Data Reporting (CDR)
- Toxics Release Inventory (TRI)
All of these resources are available at the following location: https://gaftp.epa.gov/EPADataCommons/OEI/LinkedData/US%20EPA%20Linked%20Data%20-%20Posted%2020170425/
MD5 checksums are available here: https://gaftp.epa.gov/EPADataCommons/OEI/LinkedData/US%20EPA%20Linked%20Data%20-%20Posted%2020170425/md5-checksums.txt
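Since the collection publishes MD5 checksums alongside the files, a downloaded file can be verified locally. The sketch below assumes the checksum file uses the usual `md5sum` layout (hex digest, whitespace, filename); the filename in the test data is illustrative.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """MD5 hex digest of an in-memory byte string."""
    return hashlib.md5(data).hexdigest()

def md5_of_file(path, chunk_size=1 << 20):
    """MD5 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def parse_checksums(text):
    """Parse md5sum-style lines ('<digest> <filename>') into a dict."""
    checksums = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            checksums[parts[1]] = parts[0]
    return checksums
```

Comparing `md5_of_file(local_path)` against the entry for that filename in the parsed checksum list confirms the download was not corrupted.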
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains strictly quality-controlled rich information on events, actors and places related to the First World War. As such, it is meant to be used as a reference dataset to which other datasets (e.g. museum or library collections dealing with WW1 topics) can be linked.
https://joinup.ec.europa.eu/software/page/eupl/licence-eupl
The URIHandler is the component of the opendata euskadi Linked Open Data infrastructure responsible for:
ITS JPO's Connected Vehicle Pilot Deployment Program integrates connected vehicle research concepts into practical and effective elements to enhance existing operational capabilities. Data were collected throughout each pilot to facilitate independent evaluations of the use of connected vehicle technology on real roadways. To encourage additional study and reuse of these data, ITS DataHub has partnered with each pilot site to make sanitized and anonymized tabular and non-tabular data from these projects available to the public. This article gives you a brief overview of what each pilot focused on and what types of CV Pilot data and tools are available on ITS DataHub.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For each of the 50 questions in the accompanying benchmark, this gives the similarity scores between each pair of answers in that question's answer set.
This resource consists of the Chemical Data Reporting database that supports the Toxic Substances Control Act (TSCA) of 1976, which provides EPA with the authority to impose reporting, record-keeping, and testing requirements, as well as restrictions, relating to chemical substances and/or mixtures. Certain substances are generally excluded from TSCA, including, among others, food, drugs, cosmetics, and pesticides. TSCA addresses the production, importation, use, and disposal of specific chemicals including polychlorinated biphenyls (PCBs), asbestos, radon, and lead-based paint. Additional information about the Chemical Data Reporting program is available here: https://www.epa.gov/chemical-data-reporting. This data resource consists of a collection of triple files, compressed via gzip (.gz), along with schema and vocabulary documents (.ttl).
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
A complete catalogue of RDF data for all previous legislatures, from the first legislature of the Kingdom of Sardinia to the current legislature.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents the NGSI-LD entity of the Madrid Weather Station, adapted to the NGSI-LD standard, together with automatically generated metadata compliant with DCAT-AP v2, produced by the NGSIToCKAN and UpdateCKANMetadata NiFi processors and the ckanext-dcatapedp CKAN extension.
@misc{conde2024fostering,
title={Fostering the integration of European Open Data into Data Spaces through High-Quality Metadata},
author={Javier Conde and Alejandro Pozo and Andrés Munoz-Arcentales and Johnny Choque and Álvaro Alonso},
year={2024},
eprint={2402.06693},
archivePrefix={arXiv},
primaryClass={cs.DB}
}
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This resource consists of four folders:
The gold_standard folder provides the files of manually evaluated triples. The files were exported from ANNit with the following columns: LEFT* for the source URI of the triple; RIGHT* for the target URI of the triple; UserChioce for the user's choice during manual evaluation; Decision* for the actual decision made by the annotator, which can only be unknown, remove, or remain; and Comment, if any.
The only three columns that matter for the evaluation of removed triples in this project are those labelled with *.
The graph_file folder includes the unweighted graphs as well as two sets of weighted graphs: the graphs with counted weights and the graphs with inferred weights (in the counted_weights and inferred_weights subdirectories, respectively). The files are compressed in *.gz format. Each file consists of two columns of integers giving the source and the target of each edge. The integers correspond to URIs; the corresponding mapping files are in the mapping directory.
Finally, the corresponding files (of the unweighted graphs) in WebGraph format are provided. These files were used when evaluating our algorithm against existing web-scale feedback-arc-set algorithms.
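Given the file layout described above (gzipped text files with two integer columns per line), the graphs can be loaded with the standard library alone. This is a sketch based on that description, not code shipped with the dataset; resolving the integers back to URIs would additionally require the files in the mapping directory.

```python
import gzip

def parse_edges(lines):
    """Yield (source, target) integer pairs from lines of 'src dst' text,
    skipping lines that do not hold at least two fields."""
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            yield int(parts[0]), int(parts[1])

def read_edges(path):
    """Read a gzipped edge-list file into a list of (source, target) pairs."""
    with gzip.open(path, "rt") as f:
        return list(parse_edges(f))
```

For example, `read_edges("graph.gz")` (a hypothetical filename) returns the edge list of one graph, with each node given as its integer identifier.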
Should there be any problem with these datasets, please report it to us at the following email address: shuai.wang@vu.nl.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The catalogue of the Biblioteca Virtual Miguel de Cervantes contains about 200,000 records which were originally created in compliance with the MARC21 standard. The entries in the catalogue have been recently migrated to a new relational database whose data model adheres to the conceptual models promoted by the International Federation of Library Associations and Institutions (IFLA), in particular, to the FRBR and FRAD specifications.
The database content was later mapped, by means of an automated procedure, to RDF triples that mainly employ the RDA (Resource Description and Access) vocabulary to describe the entities as well as their properties and relationships. In contrast to a direct transformation, the intermediate relational model provides tighter control over the process, for example through referential integrity, and therefore enhanced validation of the output. This RDF-based semantic description of the catalogue is now accessible online.
http://data.europa.eu/eli/dec/2011/833/oj
The EU Open Data Portal can also be accessed through a machine-readable SPARQL endpoint. SPARQL is an RDF query language, that is, a semantic query language for databases. A SPARQL endpoint is a conformant SPARQL protocol service that enables users (human or machine) to query a knowledge base via the SPARQL language. Results are typically returned in one or more machine-processable formats.
It is used to manipulate data stored in the Resource Description Framework (RDF) format. SPARQL queries RDF graphs; an RDF graph is a set of triples. For example:
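To make the "set of triples" idea concrete, here is a toy illustration: an RDF graph modeled as a Python set of (subject, predicate, object) tuples, plus a tiny matcher that mimics a single SPARQL triple pattern. The URIs and property names are invented for the example; real queries would go to the portal's SPARQL endpoint.

```python
# A toy RDF graph: a set of (subject, predicate, object) triples.
graph = {
    ("ex:dataset1", "dcat:theme", "ex:environment"),
    ("ex:dataset1", "dct:title", "Air quality measurements"),
    ("ex:dataset2", "dcat:theme", "ex:transport"),
}

def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None behaves like a SPARQL
    variable, matching any value in that position."""
    return [
        (ts, tp, to)
        for ts, tp, to in graph
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]

# Analogous to: SELECT ?s WHERE { ?s dcat:theme ex:environment }
print(match(graph, p="dcat:theme", o="ex:environment"))
```

A real SPARQL engine evaluates many such patterns jointly, with variable bindings shared across them, but the basic operation is this kind of triple-pattern match over a set of triples.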
You can find specifications of SPARQL on the W3C web site: http://www.w3.org/TR/rdf-sparql-query/. The models used to describe datasets catalogued on the EU Open Data Portal are described on the ‘Linked data’ page under ‘Metadata vocabulary’.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
NARCIS public metadata export to LOD in the RDF 1.1 Turtle format. Tested with the Timbuctoo repository (v5.0.0-pre8) developed and maintained by Huygens ING.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The SURE-KG RDF dataset provides a knowledge graph built from a real dataset to represent real estate and uncertain spatial data from advertisements. It relies on natural language processing and machine learning methods for information extraction, and on Semantic Web frameworks for representation and integration. It describes more than 100K real estate ads and 6K place-names extracted from French real estate advertisements published by various online advertisers and located in the French Riviera. It can be exploited by real estate search engines, real estate professionals, or geographers wishing to analyze local place-names.
Homepage: https://github.com/Wimmics/sure
This dataset contains information about populated places and other geographical entities, ready to use in the VIVO current research information system. It was created at the University of Applied Sciences and Arts Hannover, Germany, and has been successfully tested with VIVO 1.6 and 1.7.
The main contribution of the survey is to condense and aggregate the experience and knowledge of Linked Data experts and practitioners regarding the most common strategies to reuse vocabularies when modeling Linked Open Data in a real-world scenario.