100+ datasets found

Methods applied for identification of trials in addition to searching in...
plos.figshare.com
xls
Updated Jun 2, 2023
Cite
Wynanda A. van Enst; Rob J. P. M. Scholten; Lotty Hooft (2023). Methods applied for identification of trials in addition to searching in biomedical databases in 210 Cochrane reviews. [Dataset]. http://doi.org/10.1371/journal.pone.0042812.t001
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0042812.t001
Dataset updated
Jun 2, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Wynanda A. van Enst; Rob J. P. M. Scholten; Lotty Hooft
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
*Most review authors applied multiple strategies to identify additional trials. Therefore, the summation of percentages exceeds 100%.
Characteristics of Included Clinical Studies.
plos.figshare.com
xls
Updated Dec 17, 2025
+ more versions
Cite
Ariel Bardach; Mabel Berrueta; Agustín Ciapponi; Juan M. Sambade; Noelia Castellana; Jamile Ballivian; Martín Brizuela; Julieta Caravario; Daniel Comande; Esteban Couto; Agustina Mazzoni; Vanesa Ortega; Edward P. K. Parker; Florencia Salva; Katharina Stegelmann; John S. Schieffelin; Xu Xiong; Andy Stergachis; Flor M. Munoz; Pierre Buekens (2025). Characteristics of Included Clinical Studies. [Dataset]. http://doi.org/10.1371/journal.pone.0338128.t001
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0338128.t001
Dataset updated
Dec 17, 2025
Dataset provided by
PLOS (http://plos.org/)
Authors
Ariel Bardach; Mabel Berrueta; Agustín Ciapponi; Juan M. Sambade; Noelia Castellana; Jamile Ballivian; Martín Brizuela; Julieta Caravario; Daniel Comande; Esteban Couto; Agustina Mazzoni; Vanesa Ortega; Edward P. K. Parker; Florencia Salva; Katharina Stegelmann; John S. Schieffelin; Xu Xiong; Andy Stergachis; Flor M. Munoz; Pierre Buekens
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background: Lassa fever (LF) is an acute viral hemorrhagic illness endemic in West Africa, representing a significant public health challenge, particularly for pregnant persons and children, who experience higher morbidity and mortality. Although several vaccine candidates are being developed, no LF vaccine has been licensed yet.
Methods: We conducted a living systematic review (LSR) of the literature to evaluate the safety, efficacy, effectiveness, and immunogenicity of LF vaccines. We performed biweekly searches in major biomedical databases, trial registries, preprint servers, and other sources. Eligible studies included preclinical studies, clinical trials, and observational studies published from January 2014 to April 2025. Reviewer pairs screened studies, extracted data (REDCap), and assessed risk of bias independently. Data synthesis involved random-effects pairwise and proportion meta-analyses (R software), with GRADE assessment of evidence certainty. PROSPERO registries: CRD42024514513; CRD42024516754.
Results: Searches retrieved 1423 records, from which 51 studies were included: 2 clinical trials in adults involving 88 vaccinated persons, and 49 preclinical studies of 30 vaccine candidates. Trials evaluated Recombinant Measles-Vectored (MV-LASV) and Recombinant Vesicular Stomatitis Virus-based (rVSVΔG-LASV-GPC) LF vaccine candidates. No published clinical trials were found to evaluate LF vaccines in special populations such as pregnant persons, infants, children, or adolescents. Although injection site reactogenicity was reported, no vaccine-related serious adverse events (SAEs) were reported in study participants. Immunogenicity was robust in adults, with vaccines achieving around 95% seroconversion at 30 days. Preclinical data evaluated nine different platforms. Findings are disseminated via an interactive online dashboard (https://safeinpregnancy.org/living-systematic-review-lassa/).
Conclusion: Currently, two LF vaccine candidates that have advanced to clinical trials exhibit high immunogenicity, but the safety profile in healthy adults is still limited. Clinical evidence in pregnant persons, infants, children, and adolescents is absent. Vaccine platforms of interest have been identified in preclinical studies, providing information on those that could advance to clinical studies.
Bibliography on Alternatives to the Use of Live Vertebrates in Biomedical...
neuinfo.org
scicrunch.org
+2more
Updated Feb 1, 2001
Cite
(2001). Bibliography on Alternatives to the Use of Live Vertebrates in Biomedical Research and Testing [Dataset]. http://identifiers.org/RRID:SCR_008160
Unique identifier
https://identifiers.org/RRID:SCR_008160
Dataset updated
Feb 1, 2001
Description
Bibliography to assist in identifying methods and procedures helpful in supporting the development, testing, application, and validation of alternatives to the use of vertebrates in biomedical research and toxicology testing. This bibliography is produced from MEDLINE database searches, performed and analyzed by subject experts from the Toxicology and Environmental Health Information Program (TEHIP) of the Specialized Information Services Division (SIS) of the National Library of Medicine (NLM). The purpose of these bibliographies on animal alternatives is to provide a survey of the literature in a format which facilitates easy scanning. This bibliography includes citations from published articles, books, book chapters, and technical reports. Citations to items in non-English languages are indicated with brackets around the title. The language is also indicated. Citations with abstracts or annotations relating to the method are organized under subject categories. This publication features citations which deal with methods, tests, assays or procedures which may prove useful in establishing alternatives to the use of intact vertebrates. Citations are selected and compiled through searching various computerized on-line bibliographic databases of the National Library of Medicine, National Institutes of Health. Toxicology Databases
Table_2_Discovery of Selenocysteine as a Potential Nanomedicine Promotes...
datasetcatalog.nlm.nih.gov
frontiersin.figshare.com
Updated Jul 24, 2020
Cite
Zhang, Jiying; Song, Yifan; Song, Shitang; Yuan, Fuzhen; Chen, Yourong; Yang, Meng; Fan, Baoshi; Yu, Jia-Kuo; Xu, Bingbing; Sun, Zewen; Ye, Jing; Yan, Xin (2020). Table_2_Discovery of Selenocysteine as a Potential Nanomedicine Promotes Cartilage Regeneration With Enhanced Immune Response by Text Mining and Biomedical Databases.xlsx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000558818
Dataset updated
Jul 24, 2020
Authors
Zhang, Jiying; Song, Yifan; Song, Shitang; Yuan, Fuzhen; Chen, Yourong; Yang, Meng; Fan, Baoshi; Yu, Jia-Kuo; Xu, Bingbing; Sun, Zewen; Ye, Jing; Yan, Xin
Description
Background: Unlike bone tissue, little progress has been made regarding cartilage regeneration, and many challenges remain. Furthermore, the key roles of cartilage lesion caused by traumas, focal lesion, or articular overstress remain unclear. Traumatic injuries to the meniscus as well as its degeneration are important risk factors for long-term joint dysfunction, degenerative joint lesions, and knee osteoarthritis (OA), a chronic joint disease characterized by degeneration of articular cartilage and hyperosteogeny. Nearly 50% of the individuals with meniscus injuries develop OA over time. Due to the limited inherent self-repair capacity of cartilage lesion, the biomaterial drug-nanomedicine is considered to be a promising alternative. Therefore, it is important to elucidate the potential gene regeneration mechanisms and discover novel precise medication, which are identified through this study to investigate their function and role in pathogenesis.
Methods: We downloaded the mRNA microarray dataset GSE117999, involving paired cartilage lesion tissue samples from 12 OA patients and 12 patients from a control group. First, we analyzed these data to identify the differentially expressed genes (DEGs). We then performed gene ontology (GO) annotation and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway enrichment analyses for these DEGs. Protein-protein interaction (PPI) networks were then constructed, from which we obtained eight significant genes after a functional interaction analysis. Finally, we identified a potential nanomedicine from this assay set, using a wide range of inhibitor information archived in the Search Tool for the Retrieval of Interacting Genes (STRING) database.
Results: Sixty-six DEGs were identified with our significance criteria (adjusted P-value < 0.01, |log2 FC| ≥ 1.2). Furthermore, we identified eight hub genes and one potential nanomedicine, Selenocysteine, based on these integrative data.
Conclusion: We identified eight hub genes that could work as prospective biomarkers for the diagnostic and biomaterial drug treatment of cartilage lesion, involving the novel genes CAMP, DEFA3, TOLLIP, HLA-DQA2, SLC38A6, SLC3A1, FAM20A, and ANO8. Meanwhile, these genes were mainly associated with immune response, immune mediator induction, and cell chemotaxis. Significant support is provided for obtaining a series of novel gene targets, and we identify potential mechanisms for cartilage regeneration and final nanomedicine immunotherapy in regenerative medicine.
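The DEG selection thresholds quoted in the abstract (adjusted P-value < 0.01 and |log2 FC| ≥ 1.2) can be illustrated with a minimal Python sketch. The `GeneResult` type, the `select_degs` function, and the example rows are hypothetical and made up for illustration; they are not taken from the dataset or from the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    """One row of a differential-expression table (hypothetical layout)."""
    symbol: str
    log2_fc: float
    adj_p: float


def select_degs(results, max_adj_p=0.01, min_abs_log2_fc=1.2):
    """Keep rows with adjusted P-value < max_adj_p and |log2 FC| >= min_abs_log2_fc."""
    return [
        g for g in results
        if g.adj_p < max_adj_p and abs(g.log2_fc) >= min_abs_log2_fc
    ]


# Made-up example rows, not real results from GSE117999.
example = [
    GeneResult("GENE_A", 1.5, 0.001),   # passes both thresholds
    GeneResult("GENE_B", -1.3, 0.005),  # passes: strong down-regulation
    GeneResult("GENE_C", 0.8, 0.0001),  # fails: fold change too small
    GeneResult("GENE_D", 2.0, 0.02),    # fails: adjusted P-value too large
]
selected = [g.symbol for g in select_degs(example)]
print(selected)
```

Taking the absolute value of the fold change keeps both up-regulated and down-regulated genes, which is what the |log2 FC| notation in the abstract implies.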
Data from the paper "The landscape of biomedical research"
datasetcatalog.nlm.nih.gov
zenodo.org
Updated Mar 3, 2023
Cite
Berens, Philipp; Schmidt, Luca; Kobak, Dmitry; González-Márquez, Rita; Schmidt, Benjamin M. (2023). Data from the paper "The landscape of biomedical research" [Dataset]. http://doi.org/10.5281/zenodo.7695390
Unique identifier
https://doi.org/10.5281/zenodo.7695390
Dataset updated
Mar 3, 2023
Authors
Berens, Philipp; Schmidt, Luca; Kobak, Dmitry; González-Márquez, Rita; Schmidt, Benjamin M.
Description
Data from the paper "The landscape of biomedical research" (https://www.biorxiv.org/content/10.1101/2023.04.10.536208v1). The paper used the PubMed 2020 baseline (download date: 26.01.2021, not available anymore) supplemented with additional files from the 2021 baseline (download date: 27.04.2022, not available anymore), both originally obtained from https://www.nlm.nih.gov/databases/download/pubmed_medline.html, courtesy of the U.S. National Library of Medicine. The data provided here includes:
- from the PubMed database: article title, journal, PMID, and publication year.
- produced by us: t-SNE embedding X and Y coordinates, label, and color.
Data for "Sub-Saharan Africa's Biomedical Journal Coverage in Scholarly...
data.mendeley.com
Updated Nov 24, 2021
Cite
Toluwase Asubiaro (2021). Data for "Sub-Saharan Africa's Biomedical Journal Coverage in Scholarly Databases: A comparison of Web of Science, Scopus, EMBASE, PubMed, African Index Medicus and African Journals Online" [Dataset]. http://doi.org/10.17632/52pncd8zmy.1
Unique identifier
https://doi.org/10.17632/52pncd8zmy.1
Dataset updated
Nov 24, 2021
Authors
Toluwase Asubiaro
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Africa, Sub-Saharan Africa
Description
Journal lists of all 46 Sub-Saharan African countries were retrieved manually from the Ulrich periodical database using the "country of publication" field in the advanced search interface. Delimiters were used to limit the retrieved results to periodicals in the journal categories and with active status. Ulrich's database usually has multiple records for the different formats (e.g. online and print) or languages in which a single journal is published. Duplicates were removed from the retrieved results.

Master journal lists for the Web of Science indexes (the Science Citation Index Expanded (SCIE), the Social Science Citation Index (SSCI), the Arts and Humanities Citation Index (A&HCI) and the Emerging Sources Citation Index (ESCI)) and for the Scopus, EMBASE and MEDLINE databases were downloaded from their respective publishers' websites. The master journal list for AJOL was not available on the publisher's website. Therefore, the master journal list for AJOL was created manually by extracting journal information from the publisher's website. Only active journals were included in the study, where active journals were defined as journals that published at least one issue in 2021 or 2020. The master journal list for AIM was not available either. The whole database, comprising 18,949 articles, was downloaded with the source (journal names). Journals were sorted to identify unique journal names; only 15,279 articles had identifiable journal names. Five hundred twenty-four unique journals were identified, with only 74 active journals. Journals that were not indexed in the AIM database in 2020 or 2021 were deemed inactive and were not included in the study. This study was not considered for ethics review because the data used were collected from publicly available records.
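The two rules described above (collapsing duplicate records for the same journal, then keeping only journals with an issue in 2020 or 2021) can be sketched as follows. This is an illustrative sketch only: the `(title, issue_year)` record layout, the title normalization, and the function names are assumptions, not the author's actual procedure, which was partly manual.

```python
from collections import defaultdict


def normalize_title(title):
    """Lower-case and collapse whitespace so format variants share one key (assumed rule)."""
    return " ".join(title.lower().split())


def active_unique_journals(records, active_years=frozenset({2020, 2021})):
    """Return sorted unique journal titles that have at least one issue in active_years.

    records: iterable of (title, issue_year) pairs; the same journal may appear
    several times, for example once per format or language.
    """
    years_by_title = defaultdict(set)
    for title, issue_year in records:
        years_by_title[normalize_title(title)].add(issue_year)
    return sorted(t for t, years in years_by_title.items() if years & active_years)


# Made-up records for illustration.
records = [
    ("Journal of Examples", 2021),   # print record
    ("journal  of examples", 2019),  # online record of the same journal
    ("Inactive Review", 2018),
    ("Another Journal", 2020),
]
active = active_unique_journals(records)
print(active)
```

Grouping years per normalized title before filtering means a journal counts as active if any of its duplicate records shows a 2020 or 2021 issue.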
RESNET
scicrunch.org
neuinfo.org
+2more
Updated Jan 4, 2026
Cite
(2026). RESNET [Dataset]. http://identifiers.org/RRID:SCR_002121
Unique identifier
https://identifiers.org/RRID:SCR_002121
Dataset updated
Jan 4, 2026
Description
Databases that represent sets of pre-compiled information on biological relationships and associations, interactions and facts which have been extracted from the biomedical literature using Ariadne's MedScan technology. ResNet databases store information harvested from the entire PubMed in a formal structure that allows searching, retrieval and updating by Pathway Studio user. ResNet is seamlessly installed when Pathway Studio is installed. There are several available ResNet databases:
* ResNet Mammalian Database includes data for Human, Rat, and Mouse
* ResNet Plant Database has data on Arabidopsis, Rice and several other plants.
Features of ResNet:
* All extracted relations have linked access to the original article or abstract
* Synonyms and homologs are included to maintain gene identity and to obviate redundancy in search results
* Users can update ResNet as often as required using the MedScan technology built into all Ariadne products
* Updates are made available by Ariadne every quarter
To purchase Pathway Studio software with ResNet database, for information, or to schedule a web demonstration, call our sales department at (240) 453-6272, or (866) 340-5040 (toll free).
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
RIKEN integrated database of mammals
neuinfo.org
Updated Oct 17, 2024
Cite
(2024). RIKEN integrated database of mammals [Dataset]. http://identifiers.org/RRID:SCR_006890
Unique identifier
https://identifiers.org/RRID:SCR_006890
Dataset updated
Oct 17, 2024
Description
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on August 16, 2019. A database that integrates not only RIKEN's original large-scale mammalian databases, such as FANTOM, the ENU mutagenesis program, the RIKEN Cerebellar Development Transcriptome Database and the Bioresource Database, but also imported data from public databases, such as Ensembl, MGI and biomedical ontologies. Our integrated database has been implemented on the infrastructure of publication medium for databases, termed SciNetS/SciNeS, or the Scientists' Networking System, where the data and metadata are structured as a semantic web and are downloadable in various standardized formats. The top-level ontology-based implementation of mammal-related data directly integrates the representative knowledge and individual data records in existing databases to ensure advanced cross-database searches and reduced unevenness of the data management operations. Through the development of this database, we propose a novel methodology for the development of standardized comprehensive management of heterogeneous data sets in multiple databases to improve the sustainability, accessibility, utility and publicity of the data of biomedical information.
Data from: EMBASE
neuinfo.org
scicrunch.org
+2more
Updated Sep 3, 2008
Cite
(2008). EMBASE [Dataset]. http://identifiers.org/RRID:SCR_001650
Unique identifier
https://identifiers.org/RRID:SCR_001650
Dataset updated
Sep 3, 2008
Description
Comprehensive international bibliographic biomedical database that enables users to track and retrieve precise information on drugs and diseases from pre-clinical studies to searches on critical toxicological information. It contains bibliographic records with citations, abstracts and indexing derived from biomedical articles in peer reviewed journals, and is especially strong in its coverage of drug and pharmaceutical research. Embase can help with everything from clinical trials research to pharmacovigilance and is updated online daily and weekly. Its broad biomedical scope covers the following areas:
* Drug therapy and research, including pharmaceutics, pharmacology and toxicology
* Clinical and experimental (human) medicine
* Basic biological science relevant to human medicine
* Biotechnology and biomedical engineering, including medical devices
* Health policy and management, including pharmacoeconomics
* Public, occupational and environmental health, including pollution control
* Veterinary science, dentistry, and nursing
The Embase Application Programming Interface supports export, RSS feeds, and integration services, making it possible to share data with a wide range of systems.
Training/evaluation data sets and databases for the operation of...
zenodo.org
zip
Updated Jan 24, 2020
Cite
Ibrahim Burak Ozyurt (2020). Training/evaluation data sets and databases for the operation of bio-answerfinder biomedical question answering system [Dataset]. http://doi.org/10.5281/zenodo.2597595
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.2597595
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Ibrahim Burak Ozyurt
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A zip file containing training/evaluation data sets for the bio-answerfinder biomedical question answering system training and evaluation. The zip file also contains SQLite databases for named entity lookups, morphology, nominalizations, acronyms, PubMED trained GLoVe word/phrase embeddings and vocabulary with document frequencies and SciCrunch ontology data for named entities such as proteins, anatomical structures,
Database of Parenthetic Biomedical Abbreviations
zenodo.org
csv
Updated Nov 24, 2020
Cite
Houcemeddine Turki; Mohamed Ali Hadj Taieb; Mohamed Ben Aouicha (2020). Database of Parenthetic Biomedical Abbreviations [Dataset]. http://doi.org/10.5281/zenodo.4282483
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.4282483
Dataset updated
Nov 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Houcemeddine Turki; Mohamed Ali Hadj Taieb; Mohamed Ben Aouicha
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset includes the biomedical abbreviations stated between parentheses in the titles of the scholarly publications indexed by PubMed between 1947 and 2019. Each abbreviation is extracted thanks to the parenthetic level count algorithm and is assigned to the title, PMID and year of publication of each corresponding research paper. Then, every acronym is allocated its length and the number of upper and lower case letters it involves. Finally, the entities including one or no upper case letter, less than three characters, eight characters or more, or a high rate of non-alphanumeric characters are semi-automatically eliminated to ensure the consistency of the research database.
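The extraction and filtering steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: "parenthetic level count" is interpreted here as tracking the nesting depth of parentheses and capturing top-level spans, the function names are hypothetical, and the 0.5 cutoff for a "high rate" of non-alphanumeric characters is an assumed value because the description gives no number.

```python
def parenthetic_spans(title):
    """Return the text inside each top-level pair of parentheses in a title.

    A depth counter tracks the nesting level; a span is captured when the
    depth returns to zero (assumed reading of "parenthetic level count").
    """
    spans, depth, start = [], 0, None
    for i, ch in enumerate(title):
        if ch == "(":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                spans.append(title[start:i])
    return spans


def keep_abbreviation(text, max_non_alnum_rate=0.5):
    """Apply the elimination rules from the description (the cutoff is an assumption)."""
    upper = sum(c.isupper() for c in text)
    if upper <= 1:                        # one or no upper-case letter
        return False
    if len(text) < 3 or len(text) >= 8:   # fewer than three, or eight or more, characters
        return False
    non_alnum = sum(not c.isalnum() for c in text)
    if non_alnum / len(text) > max_non_alnum_rate:
        return False
    return True


# Made-up title for illustration.
title = ("Magnetic resonance imaging (MRI) of chronic obstructive "
         "pulmonary disease (COPD) in adults (a review)")
abbreviations = [s for s in parenthetic_spans(title) if keep_abbreviation(s)]
print(abbreviations)
```

The description states that elimination was semi-automatic, so a rule-based filter like this would only cover the automatic part.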
eBIRT
neuinfo.org
Updated Feb 7, 2026
Cite
(2026). eBIRT [Dataset]. http://identifiers.org/RRID:SCR_004172
Unique identifier
https://identifiers.org/RRID:SCR_004172 https://identifiers.org/RRID:SCR_004172/resolver?q=&i=rrid
Dataset updated
Feb 7, 2026
Description
Venue for research resource discovery offering resource providers a platform to advertise their services and products, as well as investigators a means to locate services for their use. Search results may be refined by resource type, research area or institution.
Human Disease Ontology 2018 update: classification, content and workflow...
zenodo.org
data.niaid.nih.gov
csv
Updated Jun 29, 2023
Cite
Quentin St.Charles (2023). Human Disease Ontology 2018 update: classification, content and workflow expansion [Dataset]. http://doi.org/10.1093/nar/gky1032
Available download formats: csv
Unique identifier
https://doi.org/10.1093/nar/gky1032
Dataset updated
Jun 29, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Quentin St.Charles
License
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Description
ABSTRACT:

The Human Disease Ontology (DO) (http://www.disease-ontology.org), database has undergone significant expansion in the past three years. The DO disease classification includes specific formal semantic rules to express meaningful disease models and has expanded from a single asserted classification to include multiple-inferred mechanistic disease classifications, thus providing novel perspectives on related diseases. Expansion of disease terms, alternative anatomy, cell type and genetic disease classifications and workflow automation highlight the updates for the DO since 2015. The enhanced breadth and depth of the DO's knowledgebase has expanded the DO's utility for exploring the multi-etiology of human disease, thus improving the capture and communication of health-related data across biomedical databases, bioinformatics tools, genomic and cancer resources and demonstrated by a 6.6× growth in DO's user community since 2015. The DO's continual integration of human disease knowledge, evidenced by the more than 200 SVN/GitHub releases/revisions, since previously reported in our DO 2015 NAR paper, includes the addition of 2650 new disease terms, a 30% increase of textual definitions, and an expanding suite of disease classification hierarchies constructed through defined logical axioms.

Instructions:

The data were cleaned. Duplicates and unnecessary columns were removed. Column titles were changed.

Inspiration:

This dataset was uploaded to U-BRITE for the "DRG_DEPOT" summer 2023 team project.

Acknowledgements:

Schriml, L. M., Mitraka, E., Munro, J., Tauber, B., Schor, M., Nickle, L., Felix, V., Jeng, L., Bearer, C., Lichenstein, R., Bisordi, K., Campion, N., Hyman, B., Kurland, D., Oates, C. P., Kibbey, S., Sreekumar, P., Le, C., Giglio, M., & Greene, C.

Human Disease Ontology 2018 update: classification, content and workflow expansion

Nucleic Acids Research 2019; 47(D1), D955–D962;PMID:30407550;DOI:https://doi.org/10.1093/nar/gky1032

U-BRITE last update date: 06/28/2023
AbreMES-DB
zenodo.org
zip
Updated Nov 4, 2022
+ more versions
Cite
Ander Intxaurrondo (2022). AbreMES-DB [Dataset]. http://doi.org/10.5281/zenodo.1492192
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.1492192
Dataset updated
Nov 4, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Ander Intxaurrondo
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
[Lexical/terminological resource] First release of the Spanish Medical Abbreviation DataBase (AbreMES-DB).

The database is created automatically by detecting abbreviations and their potential definitions explicitly mentioned in the same sentence. These abbreviations are extracted from the metadata of different biomedical publications written in Spanish, which contain the titles and abstracts. The sources of these publications are SciELO, IBECS and Pubmed.
Mouse Genome Database
neuinfo.org
dknet.org
+2more
Updated Jan 29, 2022
Cite
(2022). Mouse Genome Database [Dataset]. http://identifiers.org/RRID:SCR_012953
Unique identifier
https://identifiers.org/RRID:SCR_012953
Dataset updated
Jan 29, 2022
Description
Community model organism database for laboratory mouse and authoritative source for phenotype and functional annotations of mouse genes. MGD includes complete catalog of mouse genes and genome features with integrated access to genetic, genomic and phenotypic information, all serving to further the use of the mouse as a model system for studying human biology and disease. MGD is a major component of the Mouse Genome Informatics. Contains standardized descriptions of mouse phenotypes, associations between mouse models and human genetic diseases, extensive integration of DNA and protein sequence data, normalized representation of genome and genome variant information. Data are obtained and integrated via manual curation of the biomedical literature, direct contributions from individual investigators and downloads from major informatics resource centers. MGD collaborates with the bioinformatics community on the development and use of biomedical ontologies such as the Gene Ontology (GO) and the Mammalian Phenotype (MP) Ontology.
PICO-Based Biomedical Knowledge Graph (Neo4j Dump)
kaggle.com
zip
Updated Aug 11, 2025
Cite
Rumjot kaur (2025). PICO-Based Biomedical Knowledge Graph (Neo4j Dump) [Dataset]. https://www.kaggle.com/datasets/rumjotkaur/pico-based-biomedical-knowledge-graph-neo4j-dump
Available download formats: zip (176414126 bytes)
Dataset updated
Aug 11, 2025
Authors
Rumjot kaur
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset contains a Neo4j .dump file for the constructed PICO-based Biomedical Knowledge Graph (EBM-KG). The graph is built from the EBM-NLP dataset and represents key PICO (Population, Intervention, Comparator, Outcome) elements and their relationships.

The knowledge graph can be restored in Neo4j to support biomedical text mining, literature-based discovery, and advanced retrieval-augmented generation (RAG) pipelines.

The Neo4j database dump file contains the following:
- Document, keyword, and author nodes for each PubMed article in the EBM-NLP dataset.
- PICO nodes with their sub-labels as defined in the EBM-NLP dataset.

In total, 23 entity types and 22 relation types are present in the knowledge graph.

How to Restore the Database
1. Install Neo4j (compatible with version used: Neo4j 5.24.0).
2. Stop the Neo4j server.
3. Run: neo4j-admin database load --from-path=[xxx/neo4j.dump]/backups --overwrite-destination=true
4. Start the Neo4j server.
EMBRYS
dknet.org
rrid.site
+2more
Updated Jan 29, 2022
+ more versions
Cite
(2022). EMBRYS [Dataset]. http://identifiers.org/RRID:SCR_006689/resolver
Unique identifier
https://identifiers.org/RRID:SCR_006689 https://identifiers.org/RRID:SCR_006689/resolver
Dataset updated
Jan 29, 2022
Description
Data collection of gene expression patterns mapped in whole-mount mouse embryo (ICR strain) of mid-gestational stages (Embryonic Day 9.5, 10.5, 11.5), in which most striking dynamics in pattern formation and organogenesis is observed. Collection of gene expression patterns of transcription factors (TFs) and TF-related factors such as transcription cofactors. Genes were extracted from databases including RIKEN Transcription Factor Database and Panther Classification System.
Relevant Datasets and Software Used for Paper "KGML-xDTD: A Knowledge...
zenodo.org
application/gzip, bin
Updated Jun 6, 2023
Cite
Chunyu Ma; Zhihan Zhou; Han Liu; David Koslicki (2023). Relevant Datasets and Software Used for Paper "KGML-xDTD: A Knowledge Graph-based Machine Learning Framework for Drug Treatment Prediction and Mechanism Description" [Dataset]. http://doi.org/10.5281/zenodo.7582233
Available download formats: bin, application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.7582233
Dataset updated
Jun 6, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Chunyu Ma; Zhihan Zhou; Han Liu; David Koslicki
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This repository contains relevant datasets and software used in a paper "KGML-xDTD: A Knowledge Graph-based Machine Learning Framework for Drug Treatment Prediction and Mechanism Description". They are used to run the code of KGML-xDTD stored on Github and support the results of this paper.

About the datasets

1. bkg_rtxkg2c_v2.7.3.tar.gz

This tar.gz file contains three sub-folders: tsv_files, scripts, and relevant_dbs. The "tsv_files" sub-folder has the input files that the neo4j software uses. The "scripts" sub-folder contains a shell script with a relevant python script to construct the biomedical knowledge graph. The "relevant_dbs" sub-folder stores two auxiliary databases that KGML-xDTD needs to use.

2. indication_paths.yaml

This file contains the DrugMechDB MOA paths that we used to evaluate the predicted MOA paths by KGML-xDTD. It is downloaded from the official GitHub repository of DrugMechDB.

3. training_data.tar.gz

This tar.gz file contains the processed training data of four data sources (e.g., MyChem, SemMedDB, NDF-RT, RepoDB) mentioned in the paper. These processed drug-disease pairs have been matched to the identifiers of biological entities used in our biomedical knowledge graph and respectively split into true positive (tp) sets and true negative (tn) sets. We also provide the names of these drug identifiers and disease identifiers under a sub-folder "translated _to_name".

About the software

neo4j-community-3.5.26.tar.gz

This tar.gz is the Neo4j community version 3.5.26 downloaded from the Neo4j Download Center. Newer versions are available, but they introduce major changes to the Neo4j settings that are not compatible with our scripts on GitHub, so we provide the version that we used in our research. If you would like to use a newer version, you will need to modify our scripts to import the biomedical knowledge graph into your local Neo4j database under the new settings.
Integrated Manually Extracted Annotation
neuinfo.org
scicrunch.org
+2 more
Updated Apr 15, 2014
+ more versions
Cite
(2014). Integrated Manually Extracted Annotation [Dataset]. http://identifiers.org/RRID:SCR_008876
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_008876
Dataset updated
Apr 15, 2014
Description
A virtual database of annotations made by 50 database providers (April 2014) - and growing (see below), that map data to publication information. All NIF Data Federation sources can be part of this virtual database as long as they indicate the publications that correspond to data records. The format that NIF accepts is the PubMed Identifier, category or type of data that is being linked to, and a data record identifier. A subset of this data is passed to NCBI, as LinkOuts (links at the bottom of PubMed abstracts), however due to NCBI policies the full data records are not currently associated with PubMed records. Database providers can use this mechanism to link to other NCBI databases including gene and protein, however these are not included in the current data set at this time. (To view databases available for linking see, http://www.ncbi.nlm.nih.gov/books/NBK3807/#files.Databases_Available_for_Linking )

The categories that NIF uses have been standardized to the following types:
* Resource: Registry
* Resource: Software
* Reagent: Plasmid
* Reagent: Antibodies
* Data: Clinical Trials
* Data: Gene Expression
* Data: Drugs
* Data: Taxonomy
* Data: Images
* Data: Animal Model
* Data: Microarray
* Data: Brain connectivity
* Data: Volumetric observation
* Data: Value observation
* Data: Activation Foci
* Data: Neuronal properties
* Data: Neuronal reconstruction
* Data: Chemosensory receptor
* Data: Electrophysiology
* Data: Computational model
* Data: Brain anatomy
* Data: Gene annotation
* Data: Disease annotation
* Data: Cell Model
* Data: Chemical
* Data: Pathways

For more information refer to Create a LinkOut file, http://neuinfo.org/nif_components/disco/interoperation.shtm

Participating resources ( http://disco.neuinfo.org/webportal/discoLinkoutServiceSummary.do?id=4 ):
* Addgene http://www.addgene.org/pgvec1
* Animal Imaging Database http://aidb.crbs.ucsd.edu
* Antibody Registry http://www.neuinfo.org/products/antibodyregistry/
* Avian Brain Circuitry Database http://www.behav.org/abcd/abcd.php
* BAMS Connectivity http://brancusi.usc.edu/
* Beta Cell Biology Consortium http://www.betacell.org/
* bioDBcore http://biodbcore.org/
* BioGRID http://thebiogrid.org/
* BioNumbers http://bionumbers.hms.harvard.edu/
* Brain Architecture Management System http://brancusi.usc.edu/bkms/
* Brede Database http://hendrix.imm.dtu.dk/services/jerne/brede/
* Cell Centered Database http://ccdb.ucsd.edu
* CellML Model Repository http://www.cellml.org/models
* CHEBI http://www.ebi.ac.uk/chebi/
* Clinical Trials Network (CTN) Data Share http://www.ctndatashare.org/
* Comparative Toxicogenomics Database http://ctdbase.org/
* Coriell Cell Repositories http://ccr.coriell.org/
* CRCNS - Collaborative Research in Computational Neuroscience - Data sharing http://crcns.org
* Drug Related Gene Database https://confluence.crbs.ucsd.edu/display/NIF/DRG
* DrugBank http://www.drugbank.ca/
* FLYBASE http://flybase.org/
* Gene Expression Omnibus http://www.ncbi.nlm.nih.gov/geo/
* Gene Ontology Tools http://www.geneontology.org/GO.tools.shtml
* Gene Weaver http://www.GeneWeaver.org
* GeneDB http://www.genedb.org/Homepage
* Glomerular Activity Response Archive http://gara.bio.uci.edu
* GO http://www.geneontology.org/
* Internet Brain Volume Database http://www.cma.mgh.harvard.edu/ibvd/
* ModelDB http://senselab.med.yale.edu/modeldb/
* Mouse Genome Informatics Transgenes ftp://ftp.informatics.jax.org/pub/reports/MGI_PhenotypicAllele.rpt
* NCBI Taxonomy Browser http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html
* NeuroMorpho.Org http://neuromorpho.org/neuroMorpho
* NeuronDB http://senselab.med.yale.edu/neurondb
* SciCrunch Registry http://neuinfo.org/nif/nifgwt.html?tab=registry
* NIF Registry Automated Crawl Data http://lucene1.neuinfo.org/nif_resource/current/
* NITRC http://www.nitrc.org/
* Nuclear Receptor Signaling Atlas http://www.nursa.org
* Olfactory Receptor DataBase http://senselab.med.yale.edu/ordb/
* OMIM http://omim.org
* OpenfMRI http://openfmri.org
* PeptideAtlas http://www.peptideatlas.org
* RGD http://rgd.mcw.edu
* SFARI Gene: AutDB https://gene.sfari.org/autdb/Welcome.do
* SumsDB http://sumsdb.wustl.edu/sums/
* Temporal-Lobe: Hippocampal - Parahippocampal Neuroanatomy of the Rat http://www.temporal-lobe.com/
* The Cell: An Image Library http://www.cellimagelibrary.org/
* Visiome Platform http://platform.visiome.neuroinf.jp/
* WormBase http://www.wormbase.org
* YPED http://medicine.yale.edu/keck/nida/yped.aspx
* ZFIN http://zfin.org
Data from: Comprehensive dataset of heterogeneous network structures in...
data.niaid.nih.gov
dataone.org
+1 more
zip
Updated Sep 13, 2024
Cite
Parastoo Fathi; Nasrollah Moghaddam Charkari (2024). Comprehensive dataset of heterogeneous network structures in traditional Chinese medicine research [Dataset]. http://doi.org/10.5061/dryad.wh70rxwx9
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.wh70rxwx9
Dataset updated
Sep 13, 2024
Dataset provided by
Tarbiat Modares University
University of Kurdistan
Authors
Parastoo Fathi; Nasrollah Moghaddam Charkari
License
https://spdx.org/licenses/CC0-1.0.html
Description
This dataset presents a comprehensive collection of heterogeneous network structures commonly encountered in biomedical research. It encompasses diverse types of networks, including protein-protein interaction networks, herb-target interaction networks, herb-symptom interaction networks, herb-herb association networks, and symptom-symptom association networks. Each network is represented as a graph, with nodes representing entities such as proteins, herbs, symptoms, and efficacies, and edges representing the relationships between these entities. The dataset provides detailed descriptions of the nodes and edges in each network, along with metadata such as node attributes where applicable. It serves as a resource for researchers studying complex biological systems and exploring relationships between biomedical entities.

Methods

The dataset is composed of seven files covering protein-protein interactions, herb-target interactions, herb-symptom interactions, herb-herb associations, and symptom-symptom associations. It features three types of nodes: 1,254 herbs, 1,027 symptoms, and 2,208 protein targets. It also includes five types of associations: 1,912 herb-herb associations, 7,220 symptom-symptom associations, 25,133 protein-protein interactions, 168,797 herb-target associations, and 16,739 herb-symptom associations. The contents of the seven files are described below.

The first file is a compressed CSV containing data on proteins and herbs. It begins with a header row naming the dataset's features, followed by rows detailing specific protein-herb interactions. The columns are divided into two segments: the first includes protein attributes (Target ID, Protein Name, Gene Symbol, and Uniprot ID), while the second contains herb-related information (Herb ID, Chinese Pin Yin, English Name, and Latin Name). Table 1 presents some of its associated targets. This compilation serves as an extensive repository of knowledge about herbs and their molecular targets, enriched with supplementary information. In this file, the designation "NA" marks herb-target interactions for which no protein name has been identified; the corresponding gene symbol and UniProt ID are likewise unavailable or undocumented.

The herb-compound CSV file contains 14,985 compound-herb pairs and includes quality indicators for 1,254 herbs and 1,237 ingredients. The interactions between these herbs and compounds are organized into the following fields (Table 2 shows various compounds associated with the herb Coptis deltoidea):

* CAS: The CAS number uniquely identifies each chemical compound and is essential for scientific communication.
* Preferred name: Based on the IUPAC nomenclature, providing a standardized way to refer to the compound. These compounds contribute to the medicinal benefits of their respective herbs, making them valuable in traditional and modern herbal medicine.
* PubChem ID: A unique identifier assigned to chemical substances in the PubChem database, which provides information on the chemical's properties, biological activities, and safety data.
* ChEMBL ID: A unique identifier for bioactive compounds in the ChEMBL database, which focuses on the activities of these compounds relevant to drug discovery and medicinal chemistry.
* Formula: The chemical formula of a compound, indicating the types and numbers of atoms present in the molecule.
* Smiles: The SMILES (Simplified Molecular Input Line Entry System) representation of the compound. SMILES is a compact, human-readable string notation that describes a molecule's structure, including its atoms, bonds, and functional groups. It can be used to identify and compare the chemical structures of the compounds, as well as to predict their properties and activities.
* Herb ID: A unique identifier designated for each herb.

The occurrence of "None" in the CAS, PubChem ID, or ChEMBL ID columns of the herb-compound file indicates a missing or unavailable identifier.

Another CSV file contains detailed interactions between herbs and symptoms, organized into the following fields:

* Symptom: Describes the symptom associated with the herb.
* Herb_id: Provides a unique identifier for each herb.
* Pinyin name: Represents the Romanized version of the herb's Chinese name.
* Latin name: Lists the herb's Latin botanical name.
* English name: Identifies the herb by its common English name.
* Class in Chinese: Indicates the herb's classification in Chinese.
* Class in English: Describes the herb's classification in English.
* P-value: Represents the statistical significance value.
* FDR (BH): Denotes the False Discovery Rate adjusted using the Benjamini-Hochberg method.
* FDR (Bonferroni): Shows the value adjusted by the Bonferroni method (the column is labeled FDR, although the Bonferroni correction controls the family-wise error rate).
* Relationship: Explains the association between the symptom and the herb.

The presence of "None" for P_value, FDR(BH), and FDR(Bonferroni) in the herb-symptom file indicates that no statistical evaluation is available for that interaction. This file is sourced from the Chinese Pharmacopoeia (CHPA). It includes 1,254 unique herb nodes and 1,027 symptom nodes, for a total of 2,281 nodes, connected by 16,739 interactions as edges.

There are two supplementary files related to herbs. One likely details the interactions among various herbs, while the other appears to outline the therapeutic properties associated with each specific herb. These files can offer insight into the relationships between herbs and their effectiveness in treating different ailments or health conditions.

Herb-herb associations indicate relationships among herbs based on shared properties. They are provided in a CSV file that contains 1,912 herb-herb pairs and includes quality indicators for 1,254 herbs.

Protein-protein interactions are also provided in CSV format. This data, sourced from the STRING database, consists of 2,208 protein targets connected by 25,133 interactions.

Symptom-symptom associations show relationships between symptoms and are available in CSV format as well. The data is extracted from the Semantic MEDLINE Database (SemMedDB) and encompasses 1,027 symptoms linked by 7,220 associations.

EXPERIMENTAL DESIGN, MATERIALS AND METHODS

This study compiles data on herbs, symptoms, targets, and their interactions from various public databases and publications, as detailed in Table 3. Herb-target and herb-compound associations are obtained from the HIT2 database, while relationships between herbs and their indications are derived from the 2015 edition of the Chinese Pharmacopoeia (CHPA). Herb-herb associations are based on research described in the reference. Associations related to herb efficacy were also gathered from the CHPA. Using these efficacy-based herb relationships, herb vectors are constructed and cosine similarities between pairs of herbs are calculated, which yields the herb-herb connections. These similarity scores are then applied as edge weights in the network analysis.

Table 3 summarizes information collected from various public databases and publications regarding herbs, symptoms, targets, and their interactions.

| Name | Composition | Source |
| --- | --- | --- |
| Herb-target associations | 1254 herbs, 2208 targets and 168797 herb-target links | HIT2 |
| Herb-compound associations | 1254 herbs, 1237 compounds and 14985 herb-compound links | HIT2 |
| Herb-efficacy associations | 829 herbs, 373 efficacies and 3830 herb-efficacy links | CHPA |
| Herb-symptom associations | 465 herbs, 1027 symptoms and 16739 herb-symptom links | CHPA |
| Herb-herb associations | 809 herbs and 1912 links | Herbs linked with similar efficacy |
| Protein-protein interactions | 10622 proteins and 25133 interactions | String10 |
| Symptom-symptom associations | 1027 symptoms and 7220 links | SemMedDB |

Assume there are m types of herbs and n types of efficacies. Each herb is represented by an efficacy vector Va = (w1, w2, …, wj, …, wn), where wj = 1 indicates that herb Va has a relationship with efficacy j and wj = 0 indicates no relationship. The efficacy-based cosine similarity of herb Va and herb Vb can be calculated with the standard cosine formula, $\mathrm{sim}(V_a, V_b) = \frac{V_a \cdot V_b}{\lVert V_a \rVert \, \lVert V_b \rVert}$.

Relationships between symptoms were analyzed using text mining methodologies. Initially, these connections were drawn from the Semantic MEDLINE Database (SemMedDB), which includes ternary semantic relationships extracted from the MEDLINE database by the biomedical semantic relation extraction tool SemRep. The significance of each relationship was assessed using Fisher's exact test, with relationships at a significance level of P<0.05 deemed reliable. Additionally, protein-protein interactions were obtained from a well-known gene-gene interaction network database. For relationships with weights exceeding 700, a filtering process was applied, followed by linear
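The efficacy-based herb similarity described above can be illustrated with a minimal Python sketch. It assumes binary efficacy vectors and the standard cosine formula; the function names, the efficacy labels, and the two toy herbs are hypothetical and are not taken from the dataset files.

```python
import math


def efficacy_vector(herb_efficacies, all_efficacies):
    """Build a binary vector: 1 if the herb has the efficacy, else 0."""
    return [1 if e in herb_efficacies else 0 for e in all_efficacies]


def cosine_similarity(va, vb):
    """Cosine similarity of two equal-length efficacy vectors."""
    if len(va) != len(vb):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(va, vb))
    norm_a = math.sqrt(sum(x * x for x in va))
    norm_b = math.sqrt(sum(y * y for y in vb))
    # A herb with no recorded efficacy has a zero vector; treat it as
    # having no similarity rather than dividing by zero (an assumption).
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy data: four efficacies, two herbs sharing one of them.
efficacies = ["clears heat", "dries dampness", "relieves pain", "stops cough"]
herb_a = efficacy_vector({"clears heat", "dries dampness"}, efficacies)
herb_b = efficacy_vector({"clears heat", "relieves pain"}, efficacies)

similarity = cosine_similarity(herb_a, herb_b)
print(f"similarity = {similarity:.3f}")
```

In a network built this way, the resulting score would be attached to the herb-herb edge as its weight. How the dataset authors thresholded these scores is not stated in the description, so no cutoff is applied here.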

