Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
940034 nanopublications. These nanopubs were automatically extracted from the DisGeNET dataset. See also the main DisGeNET data on Datahub.
Download the content of this set of nanopublications from the server network using nanopub-java:
$ np get -c -o nanopubs.trig RAXy332hxqHPKpmvPc-wqJA7kgWiWa-QA0DIpr29LIG0Q
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
PKT Human Disease Knowledge Graph Benchmark Builds (v1.0.0)
Build Date: September 03, 2019
The KG Benchmark Builds can also be downloaded from Zenodo:
👉 KGs: https://doi.org/10.5281/zenodo.7030200
👉 Embeddings: https://zenodo.org/record/7030189
Required Input Documents
- resource_info.txt
- class_source_list.txt
- instance_source_list.txt
- ontology_source_list.txt
Data
Data Download Date: November 30, 2018
Ontologies
- Gene Ontology
- Human Phenotype Ontology
Classes
- Human Disease Ontology
- Gene Ontology: gene associations
- Reactome: gene associations
- Human Phenotype Ontology: all source annotations - genes to phenotypes
- Human Phenotype Ontology: all source annotations - diseases to genes to phenotypes
Instances
- CTD: chemicals-genes
- CTD: chemicals-pathways
- CTD: chemicals-diseases
- CTD: genes-pathways
- CTD: diseases-pathways
- STRING DB: proteins
- STRING DB: Entrez gene mappings
Knowledge Graphs
Knowledge Representation
We worked with a PhD-level biologist to develop a knowledge representation (see the figure below) that models mechanisms underlying human disease.
To do this, we manually mapped all possible combinations of the following six node types:
- Human Diseases
- Human Phenotypes
- Human Genes
- Gene Ontology concepts
- Reactome Pathways
- Chemicals
As shown in the figure above, the Basic Formal Ontology and the Relation Ontology were then used to create edges between the node types.
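To make "all possible combinations" concrete, the following minimal Python sketch enumerates candidate node-type pairs. It assumes pairs are unordered and that a node type may pair with itself (the edge types include Genes-Genes). The short labels are illustrative, not identifiers from the build, and the sketch does not claim that every enumerated pair ended up as an edge type.

```python
from itertools import combinations_with_replacement

# Illustrative labels for the six node types (not identifiers from the build).
NODE_TYPES = ["Diseases", "Phenotypes", "Genes", "GO concepts", "Pathways", "Chemicals"]


def candidate_edge_types(node_types):
    """Return every unordered pair of node types, including self-pairs."""
    return list(combinations_with_replacement(node_types, 2))


pairs = candidate_edge_types(NODE_TYPES)
print(len(pairs))  # 21 = 15 cross-type pairs + 6 self-pairs
```

Only a subset of these candidates has a supporting data source, which is why the list of created edge types is shorter than the full enumeration.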
As shown in this figure, the following edge types were created:
- Phenotypes-Genes: The Human Phenotype Ontology (HP) provides phenotype-Entrez gene annotations that were used to map 6,651 HP classes to 120,288 Entrez genes.
- Phenotypes-Diseases: The HP provides HP-DOID-Gene annotations that were used to map 5,438 HP concepts to 43,817 DOID concepts.
- Biological Processes, Molecular Functions, and Cellular Locations-Genes: The Gene Ontology (GO) provides GO-Gene annotations that were used to map 17,505 GO concepts to 265,002 Entrez genes.
- Biological Processes, Molecular Functions, and Cellular Locations-Pathways: Reactome provides GO-Gene links that were used to map 17,906 pathways to 1,910 biological processes, molecular functions, and cellular locations.
- Chemicals-Pathways: The Comparative Toxicogenomics Database (CTD) provides Chemical-Pathway links that were used to map 8,886 MeSH concepts to 711,043 Reactome pathways.
- Chemicals-Genes: CTD provides Chemical-Gene links that were used to map 8,881 MeSH concepts to 410,379 Entrez genes.
- Chemicals-Diseases: CTD provides Chemical-Disease links that were used to map 14,238 MeSH concepts to 1,216,900 DOID concepts.
- Genes-Genes: The STRING Database provides Gene-Gene links that were used to create 594,100 gene-gene interactions. When generating these mappings, only the inferred protein-protein relationships considered to be high confidence (score of 700 or higher) were used.
- Genes-Disease: Mappings between genes and diseases were retrieved from DisGeNET via its SPARQL endpoint and used to map 6,051 Entrez genes to 20,452 DOID concepts.
- Genes-Pathways: CTD provides Gene-Pathway links that were used to map 110,370 Entrez genes to 107,029 Reactome pathways.
- Pathways-Disease: CTD provides Pathway-Disease links that were used to map 1,818 Reactome pathways to 106,727 DOID concepts.
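The high-confidence cut-off used for the Genes-Genes edges can be sketched as a simple filter. This is a minimal illustration, not the build's actual code: the row layout (gene_a, gene_b, score) and the sample identifiers are hypothetical, and real STRING downloads use their own column names and protein identifiers.

```python
HIGH_CONFIDENCE = 700  # threshold stated in the text


def high_confidence_edges(rows, threshold=HIGH_CONFIDENCE):
    """Keep (gene_a, gene_b) pairs whose score is at or above the threshold."""
    return [(a, b) for a, b, score in rows if int(score) >= threshold]


# Hypothetical rows in the assumed (gene_a, gene_b, score) layout.
sample = [
    ("1017", "1019", "950"),
    ("1017", "7157", "699"),
    ("7157", "4193", "700"),
]
edges = high_confidence_edges(sample)
print(edges)  # [('1017', '1019'), ('7157', '4193')]
```

The comparison is inclusive, so a score of exactly 700 is retained, matching "700 or better".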
Knowledge Graph
The knowledge graph represented above was built using the following steps:
1. Merge Ontologies: Merge ontologies using the OWL Tools API.
2. Express New Ontology Concept Annotations: Create new ontology annotations by asserting a relation between the instance and an instance of the ontology class. For example, to assert the following relation:
   Morphine --> is substance that treats --> Migraine
   we would need to create two axioms:
   isSubstanceThatTreats(Morphine, x1)
   instanceOf(x1, Migraine)
   While the instance of the HP class hemiplegic migraines can be treated as an anonymous node in the knowledge graph, we generate a new Internationalized Resource Identifier (IRI) for each newly generated instance.
3. Deductively Close Knowledge Graph: The knowledge graph is deductively closed using the OWL 2 EL reasoner ELK via Protégé v5.1.1. ELK is able to classify instances and supports inference over class hierarchies and object properties, as well as over disjointness, intersection, and existential quantification (ontology class hierarchies).
4. Generate Edge List: The final step before exporting the edge list is to remove any nodes that are not biologically meaningful or that would otherwise reduce the performance of machine learning algorithms and the algorithm used to generate embeddings.
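The two-axiom pattern for expressing a new annotation can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the build's implementation: the base namespace, the string predicate names, and the use of a random UUID to name the new instance are all hypothetical choices made for the example.

```python
import uuid

# Hypothetical namespace and predicate names, for illustration only.
BASE_IRI = "https://example.org/kg/instance/"
TREATS = "isSubstanceThatTreats"
INSTANCE_OF = "instanceOf"


def assert_class_relation(subject, relation, target_class):
    """Express subject --relation--> target_class as two triples.

    A new, explicitly named instance of target_class is created
    (instead of an anonymous node) and used as the object of the relation.
    """
    instance = BASE_IRI + uuid.uuid4().hex  # assumed naming scheme
    return [
        (subject, relation, instance),
        (instance, INSTANCE_OF, target_class),
    ]


triples = assert_class_relation("Morphine", TREATS, "Migraine")
for triple in triples:
    print(triple)
```

Giving each instance its own identifier keeps it addressable in the exported edge list, which an anonymous node would not be.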
🚨 AVAILABLE FILES 🚨
Available KG benchmark files are zipped and listed below. For additional details on what each file contains, please see the associated Wiki page.