An interactive, visual database containing more than 350 small molecule pathways found in humans. More than 2/3 of these pathways (>280) are not found in any other pathway database. SMPDB is designed specifically to support pathway elucidation and pathway discovery in metabolomics, transcriptomics, proteomics and systems biology. It is able to do so, in part, by providing exquisitely detailed, fully searchable, hyperlinked diagrams of human metabolic pathways, metabolic disease pathways, metabolite signaling pathways and drug-action pathways. All SMPDB pathways include information on the relevant organs, subcellular compartments, protein cofactors, protein locations, metabolite locations, chemical structures and protein quaternary structures. Each small molecule is hyperlinked to detailed descriptions contained in the HMDB or DrugBank, and each protein or enzyme complex is hyperlinked to UniProt. All SMPDB pathways are accompanied by detailed descriptions and references, providing an overview of the pathway, condition or processes depicted in each diagram. The database is easily browsed and supports full text, sequence and chemical structure searching. Users may query SMPDB with lists of metabolite names, drug names, gene/protein names, SwissProt IDs, GenBank IDs, Affymetrix IDs or Agilent microarray IDs. These queries produce lists of matching pathways and highlight the matching molecules on each of the pathway diagrams. Gene, metabolite and protein concentration data can also be visualized through SMPDB's mapping interface. All of SMPDB's images, image maps, descriptions and tables are downloadable.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A comprehensive list of databases can be found in Pathguide (http://www.pathguide.org). A, automated curation; B, both manual and automated curation; BIND, Biomolecular Interaction Network Database; BioPP, Biological Pathway Publisher; DIP, Database of Interacting Proteins; EcoCyc, Encyclopaedia of E. coli Genes and Metabolism; GNPV, Genome Network Platform Viewer; HPRD, Human Protein Reference Database; KEGG, Kyoto Encyclopedia of Genes and Genomes; M, manual curation; MetaCyc, a Metabolic Pathway database; MINT, Molecular Interaction Database; MIPS, Munich Information Center for Protein Sequences; N, no; OPHID, Online Predicted Human Interaction Database; PANTHER, Protein Analysis through Evolutionary Relationship Database; PID, The Pathway Interaction Database; STKE, Signal Transduction Knowledge Environment; UNIHI, Unified Human Interactome; Y, yes. (61 KB DOC)
GON is a software platform for biological pathway modeling and simulation. It is built on two technologies: the hybrid functional Petri net (HFPN) architecture and XML. HFPN pathway models are also explained in detail. Petri nets provide a method for describing concurrent systems such as manufacturing systems and communication protocols, and for representing biological pathways graphically. Petri Net Pathways includes IL-1, G-protein and TPO signaling pathways as well as a new pathway model of p53 and related genes.
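To make the Petri net idea concrete, here is a minimal sketch of a plain discrete Petri net: places hold tokens, and a transition fires only when each of its input places holds enough tokens. This is not HFPN (which also has continuous elements) and not GON's model format; the place names, the `binding` transition and the dictionary layout are hypothetical choices made only for illustration.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in transition["in"].items())


def fire(marking, transition):
    """Return the marking after firing; the input marking is left unchanged."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, n in transition["in"].items():
        new_marking[place] = new_marking.get(place, 0) - n  # consume input tokens
    for place, n in transition["out"].items():
        new_marking[place] = new_marking.get(place, 0) + n  # produce output tokens
    return new_marking


# Hypothetical toy reaction: one ligand + one receptor -> one complex.
binding = {"in": {"ligand": 1, "receptor": 1}, "out": {"complex": 1}}
marking = {"ligand": 2, "receptor": 1, "complex": 0}
after = fire(marking, binding)
```

After one firing the receptor place is empty, so the same transition is no longer enabled, which is how token counts constrain concurrent events in this kind of model.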
THIS RESOURCE IS NO LONGER IN SERVICE, documented August 23, 2016. KinasePathwayDatabase is an integrated database covering the major completely sequenced eukaryotes. It contains the classification of protein kinases, their functional conservation and orthologous tables among species, protein-protein interaction data, domain information, structural information, and an automatic pathway graph image interface. The protein-protein interactions are extracted from abstracts by natural language processing (NLP) using basic word patterns and the protein name dictionary GENA, developed by our group. In this system, pathways are easily compared among species using more than 47,000 protein interaction records and the orthologous tables.
Database for molecular interaction information integrated with various other bio-entity information, including pathways, diseases, gene ontology (GO) terms, species and molecular types. The information is obtained from several manually curated databases and automatic extraction from literature. There are protein-protein interaction, gene/protein regulation and protein-small molecule interaction information stored in the database. The interaction information is linked with relevant GO terms, pathway, disease and species names. Interactions are also linked to the PubMed IDs of the corresponding abstracts the interactions were obtained from. Manually curated molecular interaction information was obtained from BioGRID, IntAct, NCBI Gene, and STITCH database. Pathway related information was obtained from KEGG database, Pathway Interaction database and Reactome. Disease information was obtained from PharmGKB and KEGG database. Gene ontology terms and related information was obtained from Gene Ontology database and GOA database.
It is divided into four categories based on the extracellular signal molecules (Growth factor, Cytokine, and Hormone) and stress that initiate the intracellular signaling pathway. SPAD is compiled to describe information on protein-protein and protein-DNA interactions as well as information on DNA and protein sequences. There are multiple signal transduction pathways: cascades of information from the plasma membrane to the nucleus in response to an extracellular stimulus in living organisms. An extracellular signal molecule binds a specific intracellular receptor and initiates the signaling pathway. A large amount of information is now available about the signaling pathways that control gene expression and cellular proliferation. We have developed an integrated database, SPAD, to provide an overview of signal transduction.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
RaMP Relational Database of Metabolite Pathways. This is the sqlite version of the database, which is what is used by the current version of the R package and the website. Going forward, the sqlite version will be the only version distributed (we are no longer creating MySQL database dumps). The RaMP DB contains metabolite, gene and protein information and mappings from these analyte types to ontologies (HMDB), reactions (Reactome, Wikipedia, and a portion of KEGG) and reactions from rhea-db.org.
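Because the database is distributed as a single sqlite file, its contents can be inspected without any server. The sketch below uses only Python's standard `sqlite3` module and makes no assumption about RaMP's table names: it reads them from SQLite's own `sqlite_master` catalog. The function name `list_tables` and the file name in the usage comment are hypothetical.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names come from the catalog itself, so quoting them is sufficient.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


# Usage (hypothetical file name; substitute the path of the downloaded file):
# for table, rows in list_tables("ramp.sqlite").items():
#     print(table, rows)
```

Note that `sqlite3.connect` silently creates an empty file if the path does not exist, so an empty result usually means the path is wrong.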
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
RaMP Relational Database of Metabolite Pathways, MySQL database dump version 2.5.4. The RaMP DB contains metabolite, gene and protein information and mappings from these analyte types to ontologies (HMDB), reactions (Reactome, Wikipedia, and a portion of KEGG) and reactions from rhea-db.org.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
RELEASE V2.1.0 KNOWLEDGE GRAPH: ORIGINAL DATA SOURCES
Release: v2.1.0
The goal of this build was to create a knowledge graph that represented human disease mechanisms and included the central dogma. The data sources utilized in this release include many of the sources used in the initial release, as well as some new data made available by the Comparative Toxicogenomics Database and experimental data from the Human Protein Atlas.
Data sources are listed by type (Ontology and Data not represented in an ontology [Database Sources]). Additional details are provided for each data source below. Please see documentation on the primary release (https://github.com/callahantiff/PheKnowLator/wiki/v2-Data-Sources) for additional details on each data source as well as citation information.
Data Access:
ONTOLOGIES
Cell Ontology
Cell Line Ontology
Chemical Entities of Biological Interest (ChEBI) Ontology
Gene Ontology
Human Phenotype Ontology
Mondo Disease Ontology
Pathway Ontology
Protein Ontology
Relations Ontology
Sequence Ontology
Uber-Anatomy Ontology
Vaccine Ontology
Cell Ontology (CL)
Homepage: GitHub Citation:
Bard J, Rhee SY, Ashburner M. An ontology for cell types. Genome Biology. 2005;6(2):R21
Usage: Utilized to connect transcripts and proteins to cells. Additionally, the edges between this ontology and its dependencies are utilized:
ChEBI
GO
PATO
PRO
RO
UBERON
Cell Line Ontology (CLO)
Homepage: http://www.clo-ontology.org/ Citation:
Sarntivijai S, Lin Y, Xiang Z, Meehan TF, Diehl AD, Vempati UD, Schürer SC, Pang C, Malone J, Parkinson H, Liu Y. CLO: the cell line ontology. Journal of Biomedical Semantics. 2014;5(1):37
Usage: Utilized this ontology to map cell lines to transcripts and proteins. Additionally, the edges between this ontology and its dependencies are utilized:
CL
DOID
NCBITaxon
UBERON
Chemical Entities of Biological Interest (ChEBI)
Homepage: https://www.ebi.ac.uk/chebi/ Citation:
Hastings J, Owen G, Dekker A, Ennis M, Kale N, Muthukrishnan V, Turner S, Swainston N, Mendes P, Steinbeck C. ChEBI in 2016: Improved services and an expanding collection of metabolites. Nucleic Acids Research. 2015;44(D1):D1214-9
Usage: Utilized to connect chemicals to complexes, diseases, genes, GO biological processes, GO cellular components, GO molecular functions, pathways, phenotypes, reactions, and transcripts.
Gene Ontology (GO)
Homepage: http://geneontology.org/ Citations:
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA. Gene ontology: tool for the unification of biology. Nature Genetics. 2000;25(1):25
The Gene Ontology Consortium. The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Research. 2018;47(D1):D330-8
Usage: Utilized to connect biological processes, cellular components, and molecular functions to chemicals, pathways, and proteins. Additionally, the edges between this ontology and its dependencies are utilized:
CL
NCBITaxon
RO
UBERON
Other Gene Ontology Data Used: goa_human.gaf.gz
Human Phenotype Ontology (HPO)
Homepage: https://hpo.jax.org/ Citation:
Köhler S, Carmody L, Vasilevsky N, Jacobsen JO, Danis D, Gourdine JP, Gargano M, Harris NL, Matentzoglu N, McMurry JA, Osumi-Sutherland D. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Research. 2018;47(D1):D1018-27
Usage: Utilized to connect phenotypes to chemicals, diseases, genes, and variants. Additionally, the edges between this ontology and its dependencies are utilized:
CL
ChEBI
GO
UBERON
Files
Other Human Phenotype Ontology Data Used: phenotype.hpoa
Mondo Disease Ontology (Mondo)
Homepage: https://mondo.monarchinitiative.org/ Citation:
Mungall CJ, McMurry JA, Köhler S, Balhoff JP, Borromeo C, Brush M, Carbon S, Conlin T, Dunn N, Engelstad M, Foster E. The Monarch Initiative: an integrative data and analytic platform connecting phenotypes to genotypes across species. Nucleic Acids Research. 2017;45(D1):D712-22
Usage: Utilized to connect diseases to chemicals, phenotypes, genes, and variants. Additionally, the edges between this ontology and its dependencies are utilized:
CL
NCBITaxon
GO
HPO
UBERON
Pathway Ontology (PW)
Homepage: rgd.mcw.edu Citation:
Petri V, Jayaraman P, Tutaj M, Hayman GT, Smith JR, De Pons J, Laulederkind SJ, Lowry TF, Nigam R, Wang SJ, Shimoyama M. The pathway ontology–updates and applications. Journal of Biomedical Semantics. 2014;5(1):7.
Usage: Utilized to connect pathways to GO biological processes, GO cellular components, GO molecular functions, Reactome pathways. Several steps are taken in order to connect Pathway Ontology identifiers to Reactome pathways and GO biological processes. To connect Pathway Ontology identifiers to Reactome pathways, we use ComPath Pathway Database Mappings developed by Daniel Domingo-Fernández (PMID:30564458).
Files
Downloaded Mapping Data
curated_mappings.txt
kegg_reactome.csv
Generated Mapping Data
REACTOME_PW_GO_MAPPINGS.txt
Protein Ontology (PRO)
Homepage: https://proconsortium.org/ Citation:
Natale DA, Arighi CN, Barker WC, Blake JA, Bult CJ, Caudy M, Drabkin HJ, D’Eustachio P, Evsikov AV, Huang H, Nchoutmboube J. The Protein Ontology: a structured representation of protein forms and complexes. Nucleic Acids Research. 2010;39(suppl_1):D539-45
Usage: Utilized to connect proteins to chemicals, genes, anatomy, catalysts, cell lines, cofactors, complexes, GO biological processes, GO cellular components, GO molecular functions, pathways, proteins, reactions, and transcripts. Additionally, the edges between this ontology and its dependencies are utilized:
ChEBI
DOID
GO
Notes: A partial, human-only version of this ontology was used. Details on how this version of the ontology was generated can be found under the Protein Ontology section of the Data_Preparation.ipynb Jupyter Notebook.
Files
Generated Human Version Protein Ontology (PRO)
human_pro.owl (closed with hermit reasoner)
Other PRO Data Used: promapping.txt
Generated Mapping Data
Merged Gene, RNA, Protein Map: Merged_gene_rna_protein_identifiers.pkl
Ensembl Transcript-PRO Identifier Mapping: ENSEMBL_TRANSCRIPT_PROTEIN_ONTOLOGY_MAP.txt
Entrez Gene-PRO Identifier Mapping: ENTREZ_GENE_PRO_ONTOLOGY_MAP.txt
UniProt Accession-PRO Identifier Mapping: UNIPROT_ACCESSION_PRO_ONTOLOGY_MAP.txt
STRING-PRO Identifier Mapping: STRING_PRO_ONTOLOGY_MAP.txt
Relations Ontology (RO)
Homepage: GitHub Citation:
Smith B, Ceusters W, Klagges B, Köhler J, Kumar A, Lomax J, Mungall C, Neuhaus F, Rector AL, Rosse C. Relations in biomedical ontologies. Genome Biology. 2005;6(5):R46.
Usage: Utilized to connect all data sources in the knowledge graph. Additionally, the ontology is queried prior to building the knowledge graph to identify all relations, their inverse properties, and their labels.
Files
Generated RO Data
INVERSE_RELATIONS.txt
RELATIONS_LABELS.txt
Sequence Ontology (SO)
Homepage: GitHub Citation:
Eilbeck K, Lewis SE, Mungall CJ, Yandell M, Stein L, Durbin R, Ashburner M. The Sequence Ontology: a tool for the unification of genome annotations. Genome Biology. 2005;6(5):R44
Usage: Utilized to connect transcripts and other genomic material like genes and variants.
Files
Generated Mapping Data
genomic_sequence_ontology_mappings.xlsx
SO_GENE_TRANSCRIPT_VARIANT_TYPE_MAPPING.txt
Uber-Anatomy Ontology (Uberon)
Homepage: GitHub Citation:
Mungall CJ, Torniai C, Gkoutos GV, Lewis SE, Haendel MA. Uberon, an integrative multi-species anatomy ontology. Genome Biology. 2012;13(1):R5
Usage: Utilized to connect tissues, fluids, and cells to proteins and transcripts. Additionally, the edges between this ontology and its dependencies are utilized:
ChEBI
CL
GO
PRO
Vaccine Ontology (VO)
Homepage: http://www.violinet.org/vaccineontology/ Citations:
He Y, Racz R, Sayers S, Lin Y, Todd T, Hur J, Li X, Patel M, Zhao B, Chung M, Ostrow J. Updates on the web-based VIOLIN vaccine database and analysis system. Nucleic Acids Research. 2013;42(D1):D1124-32
Xiang Z, Todd T, Ku KP, Kovacic BL, Larson CB, Chen F, Hodges AP, Tian Y, Olenzek EA, Zhao B, Colby LA. VIOLIN: vaccine investigation and online information network. Nucleic Acids Research. 2007;36(suppl_1):D923-8
Usage: Utilized the edges between this ontology and its dependencies:
ChEBI
DOID
GO
PRO
UBERON
DATABASE SOURCES
BioPortal
ClinVar
Comparative Toxicogenomics Database
DisGeNET
Ensembl
GeneMANIA
Genotype-Tissue Expression Project
Human Genome Organisation Gene Nomenclature Committee
Human Protein Atlas
National Center for Biotechnology Information Gene
Reactome Pathway Database
Search Tool for Recurring Instances of Neighbouring Genes Database
Universal Protein Resource Knowledgebase
BioPortal
Homepage: BioPortal Citation:
BioPortal. Lexical OWL Ontology Matcher (LOOM)
Ghazvinian A, Noy NF, Musen MA. Creating mappings for ontologies in biomedicine: simple methods work. In AMIA Annual Symposium Proceedings 2009 (Vol. 2009, p. 198). American Medical Informatics Association
Usage: BioPortal was utilized to obtain mappings between MeSH identifiers and ChEBI identifiers for chemicals-diseases, chemicals-genes, chemical-GO biological processes, chemicals-GO cellular components, chemicals-GO molecular functions, chemicals-phenotypes, chemicals-proteins, and chemicals-transcripts. Additional information on how this data was processed can be obtained
Database that provides access to biological systems and their component genes, proteins, and small molecules, as well as literature describing those biosystems and other related data throughout Entrez. A biosystem, or biological system, is a group of molecules that interact directly or indirectly, where the grouping is relevant to the characterization of living matter. BioSystem records list and categorize components, such as the genes, proteins, and small molecules involved in a biological system. The companion FLink tool, in turn, allows you to input a list of proteins, genes, or small molecules and retrieve a ranked list of biosystems. A number of databases provide diagrams showing the components and products of biological pathways along with corresponding annotations and links to literature. This database was developed as a complementary project to (1) serve as a centralized repository of data; (2) connect the biosystem records with associated literature, molecular, and chemical data throughout the Entrez system; and (3) facilitate computation on biosystems data. The NCBI BioSystems Database currently contains records from several source databases: KEGG, BioCyc (including its Tier 1 EcoCyc and MetaCyc databases, and its Tier 2 databases), Reactome, the National Cancer Institute's Pathway Interaction Database, WikiPathways, and Gene Ontology (GO). It includes several types of records such as pathways, structural complexes, and functional sets, and is designed to accommodate other record types, such as diseases, as data become available. Through these collaborations, the BioSystems database facilitates access to, and provides the ability to compute on, a wide range of biosystems data. If you are interested in depositing data into the BioSystems database, please contact them.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background and Objective: Hepatocellular carcinoma (HCC) is a highly aggressive malignant tumor of the digestive system worldwide. Chronic hepatitis B virus (HBV) infection and aflatoxin exposure are predominant causes of HCC in China, whereas hepatitis C virus (HCV) infection and alcohol intake are likely the main risk factors in other countries. Understanding the underlying molecular mechanisms of HCC in China remains an unmet need.
Methods: In this study, microarray datasets (GSE84005, GSE84402, GSE101685, and GSE115018) derived from the Gene Expression Omnibus (GEO) database were analyzed with R software to obtain the common differentially expressed genes (DEGs). Gene ontology (GO) functional annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID). The protein-protein interaction (PPI) network was constructed with the Search Tool for the Retrieval of Interacting Genes (STRING), and hub genes were identified with Cytoscape. The hub genes were verified on the TCGA HCC dataset using the Gene Expression Profiling Interactive Analysis (GEPIA), UALCAN, and Kaplan-Meier Plotter online databases. The Human Protein Atlas (HPA) database was used to verify the protein expression levels of candidate genes.
Results: A total of 293 common DEGs were screened, including 103 up-regulated genes and 190 down-regulated genes. GO analysis implied that the common DEGs were mainly involved in the oxidation-reduction process, cytosol, and protein binding. KEGG pathway enrichment analysis showed that the common DEGs were mainly enriched in metabolic pathways, complement and coagulation cascades, cell cycle, p53 signaling pathway, and tryptophan metabolism. In the PPI network, three subnetworks with high scores were detected using the Molecular Complex Detection (MCODE) plugin. The top 10 hub genes identified were CDK1, CCNB1, AURKA, CCNA2, KIF11, BUB1B, TOP2A, TPX2, HMMR and CDC45. The other public databases confirmed that high expression of these genes was associated with poor overall survival among patients with HCC.
Conclusion: This study identified candidate genes and pathways involved in the underlying mechanisms of HCC in China, which may provide new targets for the diagnosis and treatment of HCC in China.
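The "common DEGs" step amounts to intersecting the per-dataset DEG lists. The study did this in R; the Python sketch below only illustrates the set logic. The helper name `common_degs` is hypothetical, and the gene memberships in `toy` are made up for the example (they are not the study's per-dataset results). A real analysis would typically also keep up- and down-regulated genes separate before intersecting.

```python
def common_degs(deg_sets):
    """Return the sorted genes present in every per-dataset DEG collection."""
    sets = [set(s) for s in deg_sets]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Made-up memberships, keyed by the GEO accessions named in the abstract.
toy = {
    "GSE84005": {"CDK1", "CCNB1", "TOP2A"},
    "GSE84402": {"CDK1", "TOP2A", "AURKA"},
    "GSE101685": {"TOP2A", "CDK1", "KIF11"},
}
shared = common_degs(toy.values())
```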
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 16, 2013. LOC3d is a database of predicted subcellular localization for eukaryotic proteins of known 3-D structure taken from the Protein Databank. Subcellular localization is currently predicted using four different methods: predictNLS (nuclear localization signal), LOChom (using homology), LOCkey (using keywords) and LOC3d (neural network based prediction). The reported localization is based on the method which predicts localization of a given protein with the highest confidence. LOCtree is a novel system of support vector machines (SVMs) that predict the subcellular localization of proteins, and DNA-binding propensity for nuclear proteins, by incorporating a hierarchical ontology of localization classes modeled onto biological processing pathways. Biological similarities are incorporated from the description of cellular components provided by the gene ontology consortium (GO). GO definitions have been simplified and tailored to the problem of protein sorting. Technically the ontology has been implemented using a decision tree with SVMs as the nodes. LOCtree, was extremely successful at learning evolutionary similarities among subcellular localization classes and was significantly more accurate than other traditional networks at predicting subcellular localization. Whenever available, LOCtree also reports predictions based on the following: 1) Nuclear localization signals found by PredictNLS, 2) Localization inferred using Prosite motifs and Pfam domains found in the protein, and 3) SWISS-PROT keywords associated with a protein. Localization is inferred in the last two cases using the entropy-based LOCkey algorithm. Additional information can be found in the LOCtree manuscript and associated PredictNLS and LOCkey publications.
The PPI view displays H-InvDB human protein-protein interaction (PPI) information. It is constructed by assigning interaction data to H-InvDB proteins which were originally predicted from transcriptional products generated by the H-Invitational project. The PPI view is now providing 32,198 human PPIs comprised of 9,268 H-InvDB proteins. H-Invitational Database (H-InvDB) is an integrated database of human genes and transcripts. By extensive analyses of all human transcripts, we provide curated annotations of human genes and transcripts that include gene structures, alternative splicing isoforms, non-coding functional RNAs, protein functions, functional domains, sub-cellular localizations, metabolic pathways, protein 3D structure, genetic polymorphisms (SNPs, indels and microsatellite repeats) , relation with diseases, gene expression profiling, molecular evolutionary features, protein-protein interactions (PPIs) and gene families/groups. Sponsors: This research is financially supported by the Ministry of Economy, Trade and Industry of Japan (METI), the Ministry of Education, Culture, Sports, Science and Technology of Japan (MEXT) and the Japan Biological Informatics Consortium (JBIC). Also, this work is partly supported by the Research Grant for the RIKEN Genome Exploration Research Project from MEXT to Y.H. and the Grant for the RIKEN Frontier Research System, Functional RNA research program.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Of the roughly 1000 human mitochondrial proteins only 13 proteins, all of them hydrophobic inner membrane proteins that are components of the oxidative phosphorylation apparatus, are encoded in the mitochondrial genome and translated by mitoribosomes at the matrix face of the inner membrane (reviewed in Herrmann et al. 2012, Hallberg and Larsson 2014, Lightowlers et al. 2014). The remainder, including all proteins of the mitochondrial translation system, are encoded in the nucleus and imported from the cytosol into the mitochondrion. Translation in the mitochondrion reflects both the bacterial origin of the organelle and subsequent divergent evolution during symbiosis (reviewed in Huot et al. 2014, Richman et al. 2014). Human mitochondrial ribosomes have a low sedimentation coefficient of only 55S, but at 2.71 MDa they retain a similar mass to E. coli 70S particles. The 55S particles are protein-rich compared to both cytosolic ribosomes and eubacterial ribosomes. This is due to shorter mt-rRNAs, mitochondria-specific proteins, and numerous rearrangements in individual protein positions within the two ribosome subunits (inferred from bovine ribosomes in Sharma et al. 2003, Greber et al. 2014, Kaushal et al. 2014, reviewed in Agrawal and Sharma 2012).
Mitochondrial mRNAs have either no untranslated leader or short leaders of 1-3 nucleotides, with the exception of the 2 bicistronic transcripts, RNA7 and RNA14, which have overlapping orfs that encode ND4L/ND4 and ATP8/ATP6 respectively. Translation is believed to initiate with the mRNA binding the 28S subunit:MTIF3 (28S subunit:IF-3Mt, 28S subunit:IF2mt) complex together with MTIF2:GTP (IF-2Mt:GTP, IF2mt:GTP) at the matrix face of the inner membrane (reviewed in Christian and Spremulli 2012). MTIF3 can dissociate 55S particles in preparation for initiation, enhances formation of initiation complexes, and inhibits N-formylmethionine-tRNA (fMet-tRNA) binding to 28S subunits in the absence of mRNA. Binding of fMet-tRNA to the start codon of the mRNA results in a stable complex while absence of a start codon at the 5' end of the mRNA causes eventual dissociation of the mRNA from the 28S subunit. After recognition of a start codon, the 39S subunit then binds the stable complex, GTP is hydrolyzed, and the initiation factors MTIF3 and MTIF2:GDP dissociate.
Translation elongation then proceeds by cycles of aminoacyl-tRNAs binding, peptide bond formation, and displacement of deacylated tRNAs. In each cycle an aminoacyl-tRNA in a complex with TUFM:GTP (EF-Tu:GTP) binds at the A-site of the ribosome, GTP is hydrolyzed, and TUFM:GDP dissociates. The elongating polypeptide bonded to the tRNA at the P-site is transferred to the aminoacyl group at the A-site by peptide bond formation at the peptidyl transferase center, leaving a deacylated tRNA at the P-site and the elongating polypeptide attached to the tRNA at the A-site. The polypeptide is co-translationally inserted into the inner mitochondrial membrane via an interaction with OXA1L (Haque et al. 2010, reviewed in Ott and Hermann 2010). After peptide bond formation, GFM1:GTP (EF-Gmt:GTP) then binds the ribosome complex, GTP is hydrolyzed, GFM1:GDP dissociates, and the ribosome translocates 3 nucleotides in the 3' direction along the mRNA, relocating the polypeptide-tRNA to the P-site and allowing another cycle to begin. TUFM:GDP is regenerated to TUFM:GTP by the guanine nucleotide exchange factor TSFM (EF-Ts, EF-TsMt).
Translation is terminated when MTRF1L:GTP (MTRF1a:GTP) recognizes a UAA or UAG termination codon at the A-site of the ribosome (Tsuboi et al. 2009). GTP hydrolysis does not appear to be required. The tRNA-aminoacyl bond between the translated polypeptide and the final tRNA at the P-site is hydrolyzed by the 39S subunit, facilitating release of the polypeptide. MRRF (RRF) and GFM2:GTP (EF-G2mt:GTP) then act to release the remaining tRNA and mRNA from the ribosome and dissociate the 55S ribosome into 28S and 39S subunits.
Mutations have been identified in genes encoding mitochondrial ribosomal proteins and translation factors. These have been shown to be pathogenic, causing neurological and other diseases (reviewed in Koopman et al. 2013, Pearce et al. 2013).
Translation can stall due to lack of tRNAs or due to defects in mRNAs and tRNAs (reviewed in Yip and Shao 2021, Nadler et al. 2022). In the case of a defective tRNA, the 28S and 39S ribosomal subunits are separated and the MTRFR-MTRES1 complex binds the peptidyl-tRNA and the empty A-site of the ribosome, resulting in ejection and hydrolysis of the peptidyl-tRNA and regeneration of the tRNA and the 39S ribosomal subunit (Desai et al. 2020). In the case of an mRNA lacking a stop codon (non-stop mRNA), ICT1 recognizes the empty mRNA channel in the ribosome and causes ejection and hydrolysis of the peptidyl-tRNA (Kummer et al. 2021, Richter et al. 2010, Feaga et al. 2016).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Immune recognition of pathogen-associated molecular patterns (PAMPs) by pattern recognition receptors (PRR) often activates proinflammatory nuclear factor kappa B (NF-κB) signalling. Lipopolysaccharide (LPS) is a well-known PAMP produced by gram-negative bacteria. LPS is recognized by toll like receptor 4 (TLR4) and is a strong activator of NF-κB inflammatory responses (Akashi S et al. 2003). LPS is also recognized in the cytosol by mouse caspase-11 and related human caspase-4 and caspase-5, which stimulate pyroptosis, a proinflammatory form of cell death (Kayagaki N et al. 2011; Shi J et al. 2015). Key metabolic intermediates in LPS biosynthesis, d-glycero-β-d-manno-heptose 1,7-bisphosphate (HBP) and ADP L-glycero-β-d-manno-heptose (ADP-heptose) were reported to activate the NF-κB pathway and trigger the innate immune responses (Milivojevic M et al. 2017; Zimmermann S et al. 2017; Zhou P et al. 2018; García-Weber D et al. 2018). ADP-heptose but not HBP can enter host cells autonomously (Zhou P et al. 2018). During infection, ADP-heptose or HBP translocate into the host cytosol where their presence is sensed by alpha-protein kinase 1 (ALPK1) (Zimmermann S et al. 2017; Zhou P et al. 2018). ADP-heptose directly binds and activates ALPK1 (García-Weber D et al. 2018; Zhou P et al. 2018); instead, HBP is converted by host-derived adenylyltransferases, such as nicotinamide nucleotide adenylyltransferases, to ADP-heptose 7-P, a substrate which can then activate ALPK1 (Zhou P et al. 2018). The ADP-heptose binding to ALPK1 is thought to trigger conformational changes and stimulate the kinase domain of ALPK1 (Zhou P et al. 2018). ALPK1 kinase activity in turn leads to the phosphorylation-dependent oligomerization of the tumor necrosis factor (TNF-α) receptor–associated factor (TRAF)–interacting protein with the forkhead-associated domain (TIFA) (Zimmermann S et al. 2017; Zhou P et al. 2018).
This process activates TRAF6 oligomerization and ubiquitination, and the recruitment of transforming growth factor β-activated kinase 1 (TAK1)-binding protein 2 (TAB2), a component of the TAK1 (MAP3K7) complex (Ea CK et al. 2004; Gaudet RG et al. 2017). This TIFA oligomer signaling platform was given the term: TIFAsome. TIFAsome-activated TAK1 induces NF-κB nuclear translocation and proinflammatory gene expression. The ALPK1-TIFA signaling pathway has been identified in human embryonic kidney cells, intestinal epithelial cells, gastric cells and cervical cancer cells (Gaudet RG et al. 2015, 2017; Stein SC et al. 2017; Gall A et al. 2017; Zimmermann S et al. 2017; Milivojevic M et al. 2017; Zhou P et al. 2018). In vivo studies demonstrate that ADP-heptose and Burkholderia cenocepacia trigger massive inflammatory responses with increased production of several NF-κB-dependent cytokines and chemokines in wild type (WT), but not in Alpk1-/- mice (Zhou P et al. 2018).
This Reactome module describes ALPK1 as a cytosolic innate immune receptor for bacterial ADP-heptose.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
PKT Human Disease Knowledge Graph Benchmark Builds (v1.0.0)
Build Date: September 03, 2019
The KG Benchmark Builds can also be downloaded from Zenodo:
* KGs: https://doi.org/10.5281/zenodo.7030200
* Embeddings: https://zenodo.org/record/7030189
Required Input Documents
* resource_info.txt
* class_source_list.txt
* instance_source_list.txt
* ontology_source_list.txt
Data
Data Download Date: November 30, 2018
Ontologies
* Gene Ontology
* Human Phenotype Ontology
Classes
* Human Disease Ontology
* Gene Ontology: gene associations
* Reactome: gene associations
* Human Phenotype Ontology: all source annotations - genes to phenotypes
* Human Phenotype Ontology: all source annotations - diseases to genes to phenotypes
Instances
* CTD: chemicals-genes
* CTD: chemicals-pathways
* CTD: chemicals-diseases
* CTD: genes-pathways
* CTD: diseases-pathways
* STRING DB: proteins
* STRING DB: Entrez gene mappings
Knowledge Graphs
Knowledge Representation
We worked with a PhD-level biologist to develop a knowledge representation (see the figure below) that models mechanisms underlying human disease.
To do this, we manually mapped all possible combinations of the following six node types:
* Human Diseases
* Human Phenotypes
* Human Genes
* Gene Ontology concepts
* Reactome Pathways
* Chemicals
As shown in the figure above, the Basic Formal Ontology and the Relation Ontology were then used to create edges between the node types.
As shown in this figure, the following edge-types were created:
* Phenotypes-Genes: The Human Phenotype Ontology (HP) provides phenotype-Entrez gene annotations that were used to map 6,651 HP classes to 120,288 Entrez genes.
* Phenotypes-Diseases: The HP provides HP-DOID-Gene annotations that were used to map 5,438 HP concepts to 43,817 DOID concepts.
* Biological Processes, Molecular Functions, and Cellular Locations-Genes: The Gene Ontology (GO) provides GO-Gene annotations that were used to map 17,505 GO concepts to 265,002 Entrez genes.
* Biological Processes, Molecular Functions, and Cellular Locations-Pathways: Reactome provides GO-Gene links that were used to map 17,906 pathways to 1,910 biological processes, molecular functions, and cellular locations.
* Chemicals-Pathways: The Comparative Toxicogenomics Database (CTD) provides Chemical-Pathway links that were used to map 8,886 MESH concepts to 711,043 Reactome pathways.
* Chemicals-Genes: The CTD provides Chemical-Gene links that were used to map 8,881 MESH concepts to 410,379 Entrez genes.
* Chemicals-Diseases: The CTD provides Chemical-Disease links that were used to map 14,238 MESH concepts to 1,216,900 DOID concepts.
* Genes-Genes: The STRING Database provides Gene-Gene links that were used to create 594,100 gene-gene interactions. When generating these mappings, only the inferred protein-protein relationships considered to be high confidence were used (score of 700 or better).
* Genes-Diseases: Mappings between genes and diseases were retrieved from DisGeNet via its SPARQL endpoint and used to map 6,051 Entrez genes to 20,452 DOID concepts.
* Genes-Pathways: The CTD provides Gene-Pathway links that were used to map 110,370 Entrez genes to 107,029 Reactome pathways.
* Pathways-Diseases: The CTD provides Pathway-Disease links that were used to map 1,818 Reactome pathways to 106,727 DOID concepts.
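The high-confidence cutoff described for the Genes-Genes edges can be illustrated with a minimal sketch. The record layout (a tuple of two gene identifiers and a score) and the function name are assumptions made for illustration; they are not taken from the build scripts, and only the threshold of 700 comes from the text.

```python
from typing import Iterable, List, Tuple

# Hypothetical edge record: (gene_a, gene_b, score). The layout is an
# assumption for this sketch, not the format used by the actual build.
Edge = Tuple[str, str, int]

HIGH_CONFIDENCE = 700  # cutoff stated in the text ("score of 700 or better")


def filter_high_confidence(edges: Iterable[Edge],
                           threshold: int = HIGH_CONFIDENCE) -> List[Edge]:
    """Keep only edges whose score is at or above the threshold."""
    return [(a, b, score) for a, b, score in edges if score >= threshold]


# Made-up example records, for illustration only.
example_edges = [
    ("gene_1", "gene_2", 950),
    ("gene_1", "gene_3", 699),
    ("gene_2", "gene_3", 700),
]
kept = filter_high_confidence(example_edges)
```

The comparison is inclusive, so an edge scoring exactly 700 is retained, which matches the wording "700 or better".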
Knowledge Graph
The knowledge graph represented above was built using the following steps:
* Merge Ontologies: Merge ontologies using the OWL Tools API.
* Express New Ontology Concept Annotations: Create new ontology annotations by asserting a relation between the instance and an instance of the ontology class. For example, to assert the following relation:
Morphine --> is substance that treats --> Migraine
we would need to create two axioms:
isSubstanceThatTreats(Morphine, x1)
instanceOf(x1, Migraine)
While the instance of the HP class hemiplegic migraine can be treated as an anonymous node in the knowledge graph, we generate a new internationalized resource identifier (IRI) for each newly generated instance.
* Deductively Close Knowledge Graph: The knowledge graph is deductively closed by using the OWL 2 EL reasoner ELK via Protégé v5.1.1. ELK is able to classify instances and supports inference over class hierarchies and object properties. It also supports inference over disjointness, intersection, and existential quantification (ontology class hierarchies).
* Generate Edge List: The final step before exporting the edge list is to remove any nodes that are not biologically meaningful or would otherwise reduce the performance of machine learning algorithms and the algorithm used to generate embeddings.
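The "Express New Ontology Concept Annotations" step above can be sketched as a small function that emits the two axioms as plain triples. This is only an illustration: the namespace `https://example.org/kg/`, the use of random UUIDs for new identifiers, and the function name are assumptions, not the scheme used in the actual build.

```python
import uuid
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace for newly minted instance identifiers.
BASE_IRI = "https://example.org/kg/"


def assert_relation(subject: str, relation: str,
                    ontology_class: str) -> List[Triple]:
    """Return the two axioms linking a subject to a new class instance.

    A fresh identifier is minted for the instance (x1 in the text) rather
    than leaving it as an anonymous node.
    """
    instance_iri = BASE_IRI + str(uuid.uuid4())
    return [
        (subject, relation, instance_iri),             # relation(subject, x1)
        (instance_iri, "instanceOf", ontology_class),  # instanceOf(x1, class)
    ]


axioms = assert_relation("Morphine", "isSubstanceThatTreats", "Migraine")
```

Both triples share the same newly generated identifier, which is what ties the asserted relation to the ontology class.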
AVAILABLE FILES
Available KG benchmark files are zipped and listed below. For additional details on what each file contains, please see the associated Wiki page.
Database about gene regulation and gene expression in prokaryotes. It includes a manually curated and unique collection of transcription factor binding sites. A variety of bioinformatics tools for the prediction, analysis and visualization of regulons and gene regulatory networks are included. The integrated approach provides information about molecular networks in prokaryotes with a focus on pathogenic organisms. In detail this concerns:
* transcriptional regulation (transcription factors and their DNA binding sites)
* signal transduction (two-component systems, phosphorylation cascades)
* protein interactions (complex formation, oligomerization)
* biochemical pathways (chemical reactions)
* other regulation events (e.g. codon usage)
It aims to be a resource to model protein-host interactions and to be a suitable platform to analyze high-throughput data from proteomics and transcriptomics experiments (systems biology). Currently it mainly contains detailed information about operon and promoter structures, including large collections of transcription factor binding sites. If an appropriate number of regulatory binding sites is available, a position weight matrix (PWM) and a sequence logo are provided, which can be used to predict new binding sites. This data is collected manually by screening the original scientific literature. PRODORIC also handles protein-protein interactions and signal-transduction cascades, which commonly occur in the form of two-component systems in prokaryotes. Furthermore, it contains metabolic network data imported from the KEGG database.
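How a position weight matrix built from known binding sites can be used to predict new sites is easiest to see in a small example. The sketch below is a generic log-odds PWM with a uniform background and a pseudocount of 1; these choices, the function names, and the example sequences are assumptions for illustration and do not describe how PRODORIC itself computes or applies its matrices.

```python
import math
from typing import Dict, List, Tuple

BASES = "ACGT"


def build_pwm(sites: List[str], pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Build a log2-odds PWM from aligned, equal-length binding sites.

    Assumes a uniform background (0.25 per base); this is an illustrative
    choice, not a documented setting of any particular database.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all binding sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / 0.25)
            for base in BASES
        })
    return pwm


def best_hit(pwm: List[Dict[str, float]], sequence: str) -> Tuple[int, float]:
    """Slide the PWM along the sequence and return (start, score) of the best window."""
    width = len(pwm)
    scored = [
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    score, start = max(scored)
    return start, score


# Made-up aligned sites and target sequence, for illustration only.
known_sites = ["TTGACA", "TTGACT", "TTGACA", "TAGACA"]
pwm = build_pwm(known_sites)
start, score = best_hit(pwm, "GGCCTTGACAGGCC")
```

In practice a score threshold would be applied to every window rather than reporting only the single best one; the best-hit form is used here to keep the sketch short.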
Centralized platform to depict and integrate information pertaining to protein-protein interaction networks, domain architecture, ortholog information and GO annotation in the Arabidopsis thaliana proteome. The protein-protein interaction pairs are predicted by integrating several methods with a naive Bayesian classifier. All other related information is manually extracted from published literature and other resources by expert biologists. You are welcome to upload your PPI or subcellular localization information or to report data errors. Arabidopsis proteins are annotated with information (e.g. functional annotation, subcellular localization, tissue-specific expression, phosphorylation information, SNP phenotype and mutant phenotype) and interaction qualifications (e.g. transcriptional regulation, complex assembly, functional collaboration) via further literature text mining and integration of other resources. The related information is displayed to users through comprehensive, newly developed display and analytical tools. The system allows the construction of tissue-specific interaction networks with display of canonical pathways.
Database of images of putative biological pathways, macromolecular structures, gene families, and cellular relationships. It is of use to those who are working with large sets of genes or proteins using cDNA arrays, functional genomics, or proteomics. The rationale for this collection is that:
1.) Except in a few cases, information on most biological pathways in higher eukaryotes is non-existent, incomplete, or conflicting.
2.) Similar biological pathways differ by tissue context, developmental stages, stimulatory events, or for other complex reasons. This database allows comparisons of different variations of pathways that can be tested empirically.
3.) The goal of this database is to use images created directly by biomedical scientists who are specialists in a particular biological system. It is specifically designed to NOT use average, idealized or redrawn pathways. It does NOT use pathways defined by computer algorithm or information search approaches.
4.) Information on biological pathways in higher eukaryotes generally resides in the images and text of review papers. Much of this information is not easily accessible by current medical reference search engines.
5.) All images are attributable to the original authors.
All pathways or other biological systems described are graphic representations of natural systems. Each pathway is to be considered a work in progress. Each carries some degree of error or incompleteness. The end user has the ultimate responsibility to determine the scientific correctness and validity in their particular biological system. Image/pathway submissions are welcome.
A database of mammalian miRNAs and their known or predicted regulatory targets. It provides information on the origin of miRNAs, the tissue specificity of their expression and their known or proposed functions, their potential target genes, as well as data on miRNA families based on their co-expression and on proteins known to be involved in miRNA processing. This database also contains three other navigation tools that can be used to find information relating to miRNA:
1.) Gene Annotations is an information retrieval system for miRNA target genes. It provides comprehensive information from sequence databases and allows PubMed to be searched simultaneously with all synonyms of a given gene.
2.) miRNA Motif Finder - Argonaute predicts miRNA motifs binding to the gene sequence of the user. The mature miRNA sequences are taken from the Argonaute 2 database. miRNA Motif Finder - Custom predicts miRNA motifs binding to the gene sequence, with both the gene sequence and the mature miRNA sequences provided by the user.
3.) miRNA Statistics provides statistics for the mature miRNA sequences from Argonaute 2 as well as for the miRNA sequences uploaded by the user. It provides statistics on the individual nucleotides as well as on patterns of nucleotides appearing in the sequence.
An interactive, visual database containing more than 350 small molecule pathways found in humans. More than 2/3 of these pathways (>280) are not found in any other pathway database. SMPDB is designed specifically to support pathway elucidation and pathway discovery in metabolomics, transcriptomics, proteomics and systems biology. It is able to do so, in part, by providing exquisitely detailed, fully searchable, hyperlinked diagrams of human metabolic pathways, metabolic disease pathways, metabolite signaling pathways and drug-action pathways. All SMPDB pathways include information on the relevant organs, subcellular compartments, protein cofactors, protein locations, metabolite locations, chemical structures and protein quaternary structures. Each small molecule is hyperlinked to detailed descriptions contained in the HMDB or DrugBank and each protein or enzyme complex is hyperlinked to UniProt. All SMPDB pathways are accompanied by detailed descriptions and references, providing an overview of the pathway, condition or processes depicted in each diagram. The database is easily browsed and supports full text, sequence and chemical structure searching. Users may query SMPDB with lists of metabolite names, drug names, gene / protein names, SwissProt IDs, GenBank IDs, Affymetrix IDs or Agilent microarray IDs. These queries will produce lists of matching pathways and highlight the matching molecules on each of the pathway diagrams. Gene, metabolite and protein concentration data can also be visualized through SMPDB's mapping interface. All of SMPDB's images, image maps, descriptions and tables are downloadable.