52 datasets found

Semantic Similarity Score Calculation and Reproducibility
figshare.com
txt
Updated May 14, 2021
Gaston Mazandu; Kenneth B. Opap; Funmilayo Makinde; Victoria Nembaware; Francis Agamah; Christian Bope; Emile R. Chimusa; Ambroise Wonkam; Nicola Mulder (2021). Semantic Similarity Score Calculation and Reproducibility [Dataset]. http://doi.org/10.6084/m9.figshare.14599992.v2
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.14599992.v2
Dataset updated
May 14, 2021
Dataset provided by
Figshare (http://figshare.com/)
Authors
Gaston Mazandu; Kenneth B. Opap; Funmilayo Makinde; Victoria Nembaware; Francis Agamah; Christian Bope; Emile R. Chimusa; Ambroise Wonkam; Nicola Mulder
License
https://www.gnu.org/licenses/gpl-3.0.html
Description
The annotation file consists of a protein (entity) to Gene Ontology process map extracted from the GOA UniProt dataset at ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/UNIPROT/goa_uniprot_all.gaf.gz. This protein-process map file is used to generate the protein pairs for testing the PySML library. The semantic similarity scores produced are also included.
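A minimal sketch of how such a protein-process map could be built from GAF lines, assuming the standard tab-separated GAF layout (protein ID in column 2, GO ID in column 5, aspect in column 9, where "P" marks biological process). The function names and the sample rows are hypothetical and are not taken from the dataset or from PySML.

```python
from collections import defaultdict
from itertools import combinations


def build_protein_process_map(gaf_lines):
    """Map each protein ID to the set of GO biological-process terms annotated to it."""
    protein_to_terms = defaultdict(set)
    for line in gaf_lines:
        # Header and comment lines in GAF files start with '!'.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # skip malformed rows
        protein_id, go_id, aspect = cols[1], cols[4], cols[8]
        # Assumption: no filtering on qualifiers (e.g. NOT) or evidence codes.
        if aspect == "P":
            protein_to_terms[protein_id].add(go_id)
    return dict(protein_to_terms)


def protein_pairs(protein_to_terms):
    """Return all unordered pairs of annotated proteins, in sorted order."""
    return list(combinations(sorted(protein_to_terms), 2))


def _row(protein_id, go_id, aspect):
    # Hypothetical helper that builds one GAF-like row for the demo below.
    return "\t".join(
        ["UniProtKB", protein_id, "SYMBOL", "", go_id, "PMID:0", "IDA", "", aspect]
    )


sample_lines = [
    "!gaf-version: 2.2",
    _row("P00001", "GO:0006915", "P"),
    _row("P00001", "GO:0008150", "P"),
    _row("P00002", "GO:0005515", "F"),  # molecular function, ignored
    _row("P00002", "GO:0006915", "P"),
    _row("P00003", "GO:0005737", "C"),  # cellular component, ignored
]

process_map = build_protein_process_map(sample_lines)
pairs = protein_pairs(process_map)
```

For the real file, the same function could be fed from `gzip.open(path, "rt")`, which yields text lines from the compressed GAF archive. The full UniProt file is very large, so restricting it to a subset of proteins first is probably advisable.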
Data from: The new bioinformatics: integrating ecological data from the gene...
search.dataone.org
data.niaid.nih.gov
Updated Jul 3, 2025
Matthew B. Jones; Mark P. Schildhauer; O. J. Reichman; Shawn Bowers (2025). The new bioinformatics: integrating ecological data from the gene to the biosphere [Dataset]. http://doi.org/10.5061/dryad.qb0d6
Unique identifier
https://doi.org/10.5061/dryad.qb0d6
Dataset updated
Jul 3, 2025
Dataset provided by
Dryad Digital Repository
Authors
Matthew B. Jones; Mark P. Schildhauer; O. J. Reichman; Shawn Bowers
Time period covered
Jan 1, 2012
Description
Bioinformatics, the application of computational tools to the management and analysis of biological data, has stimulated rapid research advances in genomics through the development of data archives such as GenBank, and similar progress is just beginning within ecology. One reason for the belated adoption of informatics approaches in ecology is the breadth of ecologically pertinent data (from genes to the biosphere) and its highly heterogeneous nature. The variety of formats, logical structures, and sampling methods in ecology create significant challenges. Cultural barriers further impede progress, especially for the creation and adoption of data standards. Here we describe informatics frameworks for ecology, from subject-specific data warehouses, to generic data collections that use detailed metadata descriptions and formal ontologies to catalog and cross-reference information. Combining these approaches with automated data integration techniques and scientific workflow systems will ma...
Extracted Schemas from the Life Sciences Linked Open Data Cloud
figshare.com
txt
Updated Jun 1, 2023
Maulik Kamdar (2023). Extracted Schemas from the Life Sciences Linked Open Data Cloud [Dataset]. http://doi.org/10.6084/m9.figshare.12402425.v2
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.12402425.v2
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Maulik Kamdar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is related to the manuscript "An empirical meta-analysis of the life sciences linked open data on the web", published in Nature Scientific Data. If you use the dataset, please cite the manuscript as follows:

Kamdar, M.R., Musen, M.A. An empirical meta-analysis of the life sciences linked open data on the web. Sci Data 8, 24 (2021). https://doi.org/10.1038/s41597-021-00797-y

We have extracted schemas from more than 80 publicly available biomedical linked data graphs in the Life Sciences Linked Open Data (LSLOD) cloud into an LSLOD schema graph and conducted an empirical meta-analysis to evaluate the extent of semantic heterogeneity across the LSLOD cloud. The dataset published here contains the following files:
- The set of Linked Data Graphs from the LSLOD cloud from which schemas are extracted.
- Refined sets of extracted classes, object properties, data properties, and datatypes, shared across the Linked Data Graphs on the LSLOD cloud. Where the schema element is reused from a Linked Open Vocabulary or an ontology, it is explicitly indicated.
- The LSLOD Schema Graph, which contains all the above extracted schema elements interlinked with each other based on the underlying content. Sample instances and sample assertions are also provided along with broad-level characteristics of the modeled content.

The LSLOD Schema Graph is saved as a JSON Pickle file. To read the JSON object in this Pickle file, use the following Python commands:

    import pickle
    with open('LSLOD-Schema-Graph.json.pickle', 'rb') as infile:
        x = pickle.load(infile, encoding='iso-8859-1')

Check the Referenced Link for more details on this research, raw data files, and code references.
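The loading command quoted in the description can be wrapped in a small function, sketched here under the assumption that the file sits in the working directory. The function name `load_schema_graph` is hypothetical, and the structure of the returned object is not documented in this listing, so nothing is assumed about it.

```python
import pickle


def load_schema_graph(path="LSLOD-Schema-Graph.json.pickle"):
    """Load the pickled schema graph and return whatever object it contains."""
    # Caution: unpickling can execute arbitrary code, so only load files
    # obtained from a source you trust.
    with open(path, "rb") as infile:
        # The 'iso-8859-1' encoding follows the command given in the dataset
        # description; it matters when the pickle was written by Python 2.
        return pickle.load(infile, encoding="iso-8859-1")
```

After loading, `type(graph)` and, for a dictionary, `list(graph)[:10]` are a quick way to inspect the top-level structure before relying on any particular key.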
The Molecular Entities in Linked Data Dataset
data.mendeley.com
Updated Apr 4, 2020
Dominik Tomaszuk (2020). The Molecular Entities in Linked Data Dataset [Dataset]. http://doi.org/10.17632/fp4phyrbkz.1
Unique identifier
https://doi.org/10.17632/fp4phyrbkz.1
Dataset updated
Apr 4, 2020
Authors
Dominik Tomaszuk
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Molecular Entities in Linked Data (MEiLD) dataset comprises data on distinct atoms, molecules, ions, ion pairs, radicals, radical ions, and other entities that are identifiable as separately distinguishable chemical entities. The dataset is provided in JSON-LD format and was generated by SDFEater, a tool that parses atoms, bonds, and other molecule data. MEiLD contains 349,960 ‘small’ chemical entities.
G6GFINDR
rrid.site
Updated Jan 29, 2022
(2022). G6GFINDR [Dataset]. http://identifiers.org/RRID:SCR_015821
Unique identifier
https://identifiers.org/RRID:SCR_015821 https://identifiers.org/RRID:SCR_015821/resolver?q=*&i=rrid
Dataset updated
Jan 29, 2022
Description
Query-based web application that helps users find bioinformatics and artificial intelligence (AI) software. G6GFINDR is powered by "semantic annotation" rather than keyword search, and so takes advantage of semantic web graph technology.
Data from: An Ontology-Based System for Querying Life in a Post-Taxonomic...
figshare.com
pdf
Updated Jan 19, 2016
Nico Cellinese; Hilmar Lapp (2016). An Ontology-Based System for Querying Life in a Post-Taxonomic Age [Dataset]. http://doi.org/10.6084/m9.figshare.1401984.v1
Available download formats: pdf
Unique identifier
https://doi.org/10.6084/m9.figshare.1401984.v1
Dataset updated
Jan 19, 2016
Dataset provided by
Figshare (http://figshare.com/)
Authors
Nico Cellinese; Hilmar Lapp
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Grant proposal (project description and references cited) to the US National Science Foundation, Advances in Biological Informatics (ABI) program as Collaborative Research. Funded in 2015. Files include public abstract as submitted to NSF.
RDF/Jena : an extension for XSLT/Xalan. Testing with NCBI gene and the...
figshare.com
application/gzip
Updated Jun 1, 2023
Pierre Lindenbaum (2023). RDF/Jena : an extension for XSLT/Xalan. Testing with NCBI gene and the disease ontology. [Dataset]. http://doi.org/10.6084/m9.figshare.105167.v3
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.105167.v3
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Pierre Lindenbaum
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The current code contains an extension for the XSLT processor Apache Xalan: it allows searching for and injecting RDF statements during an XSLT transformation. As an example, the makefile transforms an NCBI Gene record to HTML and annotates it with the Disease Ontology.
Additional file 2: of NeuroRDF: semantic integration of highly curated data...
datasetcatalog.nlm.nih.gov
Updated Dec 14, 2016
Kawalia, Shweta; Raschka, Tamara; Senger, Philipp; Iyappan, Anandhi; Hofmann-Apitius, Martin (2016). Additional file 2: of NeuroRDF: semantic integration of highly curated data to prioritize biomarker candidates in Alzheimer's disease [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001526461
Dataset updated
Dec 14, 2016
Authors
Kawalia, Shweta; Raschka, Tamara; Senger, Philipp; Iyappan, Anandhi; Hofmann-Apitius, Martin
Description
The developed RDF models and the SPARQL queries used are made available at: http://www.scai.fraunhofer.de/en/business-research-areas/bioinformatics/downloads/neurordf.html . (ZIP 178 kb)
umnsrs
huggingface.co
Updated Oct 20, 2023
BigScience Biomedical Datasets (2023). umnsrs [Dataset]. https://huggingface.co/datasets/bigbio/umnsrs
Available formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Oct 20, 2023
Dataset authored and provided by
BigScience Biomedical Datasets
License
https://choosealicense.com/licenses/cc0-1.0/
Description
UMNSRS, developed by Pakhomov et al., consists of 725 clinical term pairs whose semantic similarity and relatedness were manually rated. The similarity and relatedness of each term pair were annotated on a continuous scale by having the resident touch a bar on a touch-sensitive computer screen to indicate the degree of similarity or relatedness. The following subsets are available:
- similarity: A set of 566 UMLS concept pairs manually rated for semantic similarity (e.g. whale-dolphin) using a continuous response scale.
- relatedness: A set of 588 UMLS concept pairs manually rated for semantic relatedness (e.g. needle-thread) using a continuous response scale.
- similarity_mod: Modification of the UMNSRS-Similarity dataset to exclude control samples and those pairs that did not match text in clinical, biomedical and general English corpora. Exact modifications are detailed in the paper (Corpus Domain Effects on Distributional Semantic Modeling of Medical Terms. Serguei V.S. Pakhomov, Greg Finley, Reed McEwan, Yan Wang, and Genevieve B. Melton. Bioinformatics. 2016; 32(23):3635-3644). The resulting dataset contains 449 pairs.
- relatedness_mod: Modification of the UMNSRS-Relatedness dataset to exclude control samples and those pairs that did not match text in clinical, biomedical and general English corpora. Exact modifications are detailed in the same paper. The resulting dataset contains 458 pairs.
WormBase
bioregistry.io
integbio.jp
Updated Apr 27, 2021
(2021). WormBase [Dataset]. http://identifiers.org/re3data:r3d100010424
Unique identifier
https://identifiers.org/wikidata:P3860, https://identifiers.org/biolink:WBVocab https://identifiers.org/re3data:r3d100010424
Dataset updated
Apr 27, 2021
License
https://bioregistry.io/spdx:CC0-1.0
Description
WormBase is an online bioinformatics database of the biology and genome of the model organism Caenorhabditis elegans and other nematodes. It is used by the C. elegans research community both as an information resource and as a mode to publish and distribute their results. This collection references WormBase-accessioned entities.
GO term (Biological Process) similarity
figshare.com
datasetcatalog.nlm.nih.gov
txt
Updated Mar 9, 2020
Vafaee Lab (2020). GO term (Biological Process) similarity [Dataset]. http://doi.org/10.6084/m9.figshare.11955177.v2
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.11955177.v2
Dataset updated
Mar 9, 2020
Dataset provided by
Figshare (http://figshare.com/)
Authors
Vafaee Lab
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Pairwise similarities of GO term sets (Biological Process) induced by drug pairs and their PPI partners (degree = 2) among all small-molecule drugs, modeled by semantic similarity.
Human Disease Ontology 2018 update: classification, content and workflow...
zenodo.org
data.niaid.nih.gov
csv
Updated Jun 29, 2023
Quentin St.Charles (2023). Human Disease Ontology 2018 update: classification, content and workflow expansion [Dataset]. http://doi.org/10.1093/nar/gky1032
Available download formats: csv
Unique identifier
https://doi.org/10.1093/nar/gky1032
Dataset updated
Jun 29, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Quentin St.Charles
License
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Description
ABSTRACT:

The Human Disease Ontology (DO) (http://www.disease-ontology.org) database has undergone significant expansion in the past three years. The DO disease classification includes specific formal semantic rules to express meaningful disease models and has expanded from a single asserted classification to include multiple-inferred mechanistic disease classifications, thus providing novel perspectives on related diseases. Expansion of disease terms, alternative anatomy, cell type and genetic disease classifications and workflow automation highlight the updates for the DO since 2015. The enhanced breadth and depth of the DO's knowledgebase has expanded the DO's utility for exploring the multi-etiology of human disease, thus improving the capture and communication of health-related data across biomedical databases, bioinformatics tools, genomic and cancer resources and demonstrated by a 6.6× growth in DO's user community since 2015. The DO's continual integration of human disease knowledge, evidenced by the more than 200 SVN/GitHub releases/revisions, since previously reported in our DO 2015 NAR paper, includes the addition of 2650 new disease terms, a 30% increase of textual definitions, and an expanding suite of disease classification hierarchies constructed through defined logical axioms.

Instructions:

The data was cleaned. Duplicates and unnecessary columns were removed. Column titles were changed.

Inspiration:

This dataset was uploaded to U-BRITE for the "DRG_DEPOT" summer 2023 team project.

Acknowledgements:

Schriml, L. M., Mitraka, E., Munro, J., Tauber, B., Schor, M., Nickle, L., Felix, V., Jeng, L., Bearer, C., Lichenstein, R., Bisordi, K., Campion, N., Hyman, B., Kurland, D., Oates, C. P., Kibbey, S., Sreekumar, P., Le, C., Giglio, M., & Greene, C.

Human Disease Ontology 2018 update: classification, content and workflow expansion

Nucleic Acids Research 2019; 47(D1), D955–D962;PMID:30407550;DOI:https://doi.org/10.1093/nar/gky1032

U-BRITE last update date: 06/28/2023
EDAM Ontology
bioregistry.io
Updated Apr 24, 2021
(2021). EDAM Ontology [Dataset]. https://bioregistry.io/edam
Dataset updated
Apr 24, 2021
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
EDAM is an ontology of general bioinformatics concepts, including topics, data types, formats, identifiers and operations. EDAM provides a controlled vocabulary for the description, in semantic terms, of things such as: web services (e.g. WSDL files), applications, tool collections and packages, work-benches and workflow software, databases and ontologies, XSD data schema and data objects, data syntax and file formats, web portals and pages, resource catalogues and documents (such as scientific publications).
Open PHACTS
dknet.org
Updated Nov 9, 2024
(2024). Open PHACTS [Dataset]. http://identifiers.org/RRID:SCR_005050
Unique identifier
https://identifiers.org/RRID:SCR_005050 https://identifiers.org/RRID:SCR_005050/resolver/mentions?q=&i=rrid
Dataset updated
Nov 9, 2024
Description
Project that developed an open access discovery platform, called Open Pharmacological Space (OPS), via a semantic web approach, integrating pharmacological data from a variety of information resources and tools and services to question this integrated data to support pharmacological research. The project is based upon the assimilation of data already stored as triples, in the form subject-predicate-object. The software and data are available for download and local installation, under an open source and open access model. Tools and services are provided to query and visualize this data, and a sustainability plan will be in place, continuing the operation of the Open PHACTS Discovery Platform after the project funding ends. Throughout the project, a series of recommendations will be developed in conjunction with the community, building on open standards, to ensure wide applicability of the approaches used for integration of data.
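As a rough illustration of the subject-predicate-object form mentioned above, the following sketch stores a few triples in memory and filters them by pattern. It is not the Open PHACTS API: the `Triple` type, the `match` function, and all `ex:` identifiers are hypothetical placeholders.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def match(
    triples: Iterable[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching the given pattern; None acts as a wildcard."""
    for t in triples:
        if subject is not None and t.subject != subject:
            continue
        if predicate is not None and t.predicate != predicate:
            continue
        if obj is not None and t.obj != obj:
            continue
        yield t


# Placeholder statements, invented purely for the example.
triples = [
    Triple("ex:compoundA", "ex:inhibits", "ex:targetX"),
    Triple("ex:compoundA", "ex:inhibits", "ex:targetY"),
    Triple("ex:compoundB", "ex:inhibits", "ex:targetX"),
    Triple("ex:targetX", "ex:type", "ex:Enzyme"),
]

# "Which subjects inhibit ex:targetX?"
inhibitors_of_x = [
    t.subject for t in match(triples, predicate="ex:inhibits", obj="ex:targetX")
]
```

A production system would typically hold such data in a triple store and query it with SPARQL; the linear scan here only shows the shape of the data and of a pattern query.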
Data from: Getting the best of Linked Data and Property Graphs: rdf2neo and...
swat4hcls.figshare.com
png
Updated Dec 5, 2018
Marco Brandizi; Ajit Singh; Christopher Rawlings; Keywan Hassani-Pak (2018). Getting the best of Linked Data and Property Graphs: rdf2neo and the KnetMiner Use Case [Dataset]. http://doi.org/10.6084/m9.figshare.7314323.v1
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.7314323.v1
Dataset updated
Dec 5, 2018
Dataset provided by
Semantic Web Applications and Tools for Healthcare and Life Sciences
Authors
Marco Brandizi; Ajit Singh; Christopher Rawlings; Keywan Hassani-Pak
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Paper submitted to SWAT4LS 2018. We introduce rdf2neo, a tool to populate Neo4j databases starting from RDF data sets, based on a configurable mapping between the two. By employing agrigenomics-related real use cases, we show how such mapping can allow for a hybrid approach to the management of networked knowledge, based on taking advantage of the best of both RDF and property graphs.
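To make the idea of mapping RDF onto a property graph concrete, here is a deliberately simplified sketch. It is not rdf2neo's mapping mechanism, which the description says is configurable; it assumes one fixed rule instead: literal-valued statements become node properties and resource-valued statements become relationships. All names and sample values are hypothetical.

```python
def triples_to_property_graph(triples):
    """Convert (subject, predicate, object, object_is_literal) tuples
    into a (nodes, edges) pair resembling a property graph."""
    nodes = {}  # node ID -> dict of properties
    edges = []  # (source ID, relationship type, target ID)
    for subject, predicate, obj, object_is_literal in triples:
        properties = nodes.setdefault(subject, {})
        if object_is_literal:
            # Literal value: store it as a property on the subject node.
            properties[predicate] = obj
        else:
            # Resource value: ensure the target node exists, then link to it.
            nodes.setdefault(obj, {})
            edges.append((subject, predicate, obj))
    return nodes, edges


# Invented sample statements for illustration only.
sample_triples = [
    ("ex:gene1", "ex:label", "Gene one", True),
    ("ex:gene1", "ex:encodes", "ex:protein1", False),
    ("ex:protein1", "ex:label", "Protein one", True),
]

nodes, edges = triples_to_property_graph(sample_triples)
```

The resulting `nodes` and `edges` could then be written to a graph database with whatever client library is in use; that step is omitted because it depends on the deployment.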
LISC 2013 - Results: Discussion Groups on Semantic Web and Reproducibility
commons.datacite.org
figshare.com
Updated Jan 18, 2016
Paul Groth; Peter Ansell; Kjetil Kjernsmo; Jacco Van Ossenbruggen; Guillermo Palma; Carol Goble; Cameron McLean; Richard Hosking; Steve Cassidy; Jun Zhao; Prashant Gupta; Niels Ockeloen; Graham Klyne (2016). LISC 2013 - Results: Discussion Groups on Semantic Web and Reproducibility [Dataset]. http://doi.org/10.6084/m9.figshare.828798.v2
Unique identifier
https://doi.org/10.6084/m9.figshare.828798.v2
Dataset updated
Jan 18, 2016
Dataset provided by
DataCite (https://www.datacite.org/)
Figshare (http://figshare.com/)
Authors
Paul Groth; Peter Ansell; Kjetil Kjernsmo; Jacco Van Ossenbruggen; Guillermo Palma; Carol Goble; Cameron McLean; Richard Hosking; Steve Cassidy; Jun Zhao; Prashant Gupta; Niels Ockeloen; Graham Klyne
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Results of discussion groups at the Linked Science Workshop 2013, held at the International Semantic Web Conference (http://linkedscience.org/events/lisc2013/). Participants were asked to develop matrices showing how semantic web/linked data solutions can help address reproducibility/re* problems. The results are documented in the spreadsheets above and described in videos (to be posted). The participants also developed a set of challenges for the Linked Science and broader semantic web community to help address these re* problems. See below or lisc2013-challenges.txt.

Linked Science Community Challenges

The Linked Science 2013 workshop discussion participants identified several challenges to the Linked Data/Semantic Web community in order to help reproducibility (and other re* problems, i.e. repurposing, reuse, etc.) in science.

1) Promote the basics of linked data for reproducibility. Many basic linked data technologies (e.g. content negotiation or the use of dereferenceable URLs) could be usable for scientific reproducibility. The goal here would be to develop a set of how-to documents that guide e-scientists on how to use these technologies to support scientific re* problems. An important point would be to tie these solutions directly to domain scientist problems.
2) Integrate Semantic Web technologies and the publishing process. Publishing is central to the scientific process and the issues of reusing scientific work. Semantic Web technologies should be integrated into the publishing process to enable reuse.
3) Make it easier to publish data and then work with it than to work directly on your own data. Publishing data should enable a scientist to do more. Can we make it so that publishing data is so useful to the scientists themselves that it would be their first option?
4) Provide an integrated view of the how, what, when, where, and why of the scientific process. Linked data technologies are designed for integration and aggregation. Can we use these technologies to provide an integrated view over all the questions one might have with respect to a scientific experiment?
5) Provide mechanisms for dealing with copyright on data from both a technical and a social perspective. Dealing with copyright is not always straightforward. Can we eliminate the barriers to reuse by helping scientists with these copyright issues in an automatic fashion?
6) Get an altmetric-based award into one of our own venues. Part of supporting re* problems is promoting sharing. We should "eat our own dogfood" by promoting and rewarding sharing in the major semantic web venues. We suggest an award based on some sort of altmetric.
7) Make sure the EBI RDF platform does not get shut down in two years. The European Bioinformatics Institute has released RDF versions with SPARQL endpoints for many of their core data sets. They are making it available for two years and checking whether it is used to determine if it continues in the long term. This is a key data resource for using Linked Data for reproducibility, so let's make sure it keeps going.
Allen Institute Neurowiki
neuinfo.org
scicrunch.org
Updated Oct 26, 2019
(2019). Allen Institute Neurowiki [Dataset]. http://identifiers.org/RRID:SCR_005042
Unique identifier
https://identifiers.org/RRID:SCR_005042
Dataset updated
Oct 26, 2019
Description
THIS RESOURCE IS NO LONGER IN SERVICE, documented September 6, 2016. The Allen Institute Neurowiki is a joint project between Vulcan Inc. and the Allen Institute to build a Semantic Wiki mapping genetic instances. It is a finished prototype testing the import pipelines and display components for combining 5 major RDF datasets from 4 different sources. Current planning includes mapping complete datasets, curating a better ontology, and creating multiple ontology management for a user class.

Biological Linked Data Map:
* Open, public online access
* Data from multiple RDF data stores
* Complete import pipeline using LDIF framework
* Outlines of each imported instance embedding inline wiki properties and providing views of imported properties from original RDF datasets
* Charting tools that "pivot" SPARQL queries, providing several views of each query
* Navigation and composition tools for accessing and mining the data

Where did we get the data?
* KEGG: Kyoto Encyclopedia of Genes and Genomes: KEGG GENES is a collection of gene catalogs for all complete genomes generated from publicly available resources, mostly NCBI RefSeq
* Diseasome: The Diseasome website is a disease / disorder relationships explorer and a sample of an innovative map-oriented scientific work. Built by a team of researchers and engineers, it uses the Human Disease Network dataset.
* DrugBank: The DrugBank database is a unique bioinformatics and cheminformatics resource that combines detailed drug data with comprehensive drug target information.
* Sider: Sider contains information on marketed medicines and their recorded adverse drug reactions. The information is extracted from public documents and package inserts.

Every piece of content on every instance page is generated by Semantic Result Formatters interpreting SPARQL results.
Provenance RDF Models
figshare.com
zip
Updated Jan 19, 2016
Gang Fu (2016). Provenance RDF Models [Dataset]. http://doi.org/10.6084/m9.figshare.1399197.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.1399197.v1
Dataset updated
Jan 19, 2016
Dataset provided by
Figshare (http://figshare.com/)
Authors
Gang Fu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A reference data set crowdsourced from multiple data sources. Code to generate multiple provenance RDF models is available. Sample queries for comparative analysis are also included.
Data from: Whole Brain Catalog
scicrunch.org
Updated Oct 17, 2019
(2019). Whole Brain Catalog [Dataset]. http://identifiers.org/RRID:SCR_007011
Unique identifier
https://identifiers.org/RRID:SCR_007011
Dataset updated
Oct 17, 2019
Description
THIS RESOURCE IS NO LONGER IN SERVICE, documented May 26, 2016. An open source, downloadable, 3D atlas of the mouse brain and its cellular constituents that allows multi-scale data to be visualized in a seamless way, similar to Google Earth. Data within the Catalog is marked up with annotations and can link out to additional data sources via a semantic framework. This next generation open environment has been developed to connect members of the neuroscience community to facilitate solutions for today's intractable challenges in brain research through cooperation and crowd sourcing. The client-server platform provides rich 3-D views for researchers to zoom in, out, and around structures deep in a multi-scale spatial framework of the mouse brain. An open-source, 3-D graphics engine used in graphics-intensive computer gaming generates high-resolution visualizations that bring data to life through biological simulations and animations.

Within the Catalog, researchers can view and contribute a wide range of data including:
* 3D meshes of subcellular scenes or brain region territories
* Large 2D image datasets from both electron and light level microscopy
* NeuroML and Neurolucida neuronal reconstructions
* Protein Database molecular structures

Users of the Whole Brain Catalog can:
* Fit data of any scale into the international standard atlas coordinate system for spatial brain mapping, the Waxholm Space.
* View brain slices, neurons and their animation, neuropil reconstructions, and molecules in appropriate locations
* View data up close and at a high resolution
* View their own data in the Whole Brain Catalog environment
* View data within a semantic environment supported by vocabularies from the Neuroscience Information Framework (NIF) at http://www.neuinfo.org.
* Contribute code and connect personal tools to the environment
* Make new connections with related research and researchers

5 Easy Ways to Explore:
* Explore the datasets across multiple scales.
* View data closely at high resolution.
* Observe accurately simulated neurons.
* Readily search for content.
* Contribute your own research.
Crop trait regulating-genes knowledge graph dataset
scidb.cn
Updated Jan 3, 2025
zhang dan dan (2025). Crop trait regulating-genes knowledge graph dataset [Dataset]. http://doi.org/10.57760/sciencedb.agriculture.00175
Available formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Unique identifier
https://doi.org/10.57760/sciencedb.agriculture.00175
Dataset updated
Jan 3, 2025
Dataset provided by
Science Data Bank
Authors
zhang dan dan
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
In crop breeding research, breeding new crop varieties with various excellent traits has always been the goal of breeders. At present, with the accelerated application of information technology in the field of crop breeding, the multi-dimensional scientific data related to crop breeding has shown exponential growth. These semi-structured and structured scientific data are distributed across scientific databases in different fields and lack association and fusion across species. This hinders the transfer and reuse of existing crop breeding knowledge and the maximization of the value of crop breeding scientific data, which brings challenges to the knowledge discovery of crop trait regulation genes. Therefore, more and more crop breeding research is based on the reorganization, correlation, analysis and utilization of existing breeding scientific data, so as to discover knowledge about crop trait regulation genes.

The crop trait regulatory gene knowledge graph dataset uses the following data sources: the PubMed literature database, Phytozome (genomic information of 4 species), Ensembl Plants (European Molecular Biology Laboratory's European Bioinformatics Institute; genome information of 4 species), UniProt (Universal Protein; protein annotation information of 4 species), RGAP (Rice Genome Annotation Project), STRING (protein interaction information for 4 species), Pfam (protein family analysis and modeling; protein family information for 4 species), KEGG (Kyoto Encyclopedia of Genes and Genomes; pathway annotation information of the 4 species) and the GO (Gene Ontology) domain scientific database. The entities and relationships of the multi-source scientific data with different data formats were extracted as follows. For structured data, mapping-based knowledge extraction is mainly used. For XML semi-structured data, knowledge extraction based on Kettle data analysis is adopted. For FASTA semi-structured data, knowledge extraction based on the BLAST model is adopted. For unstructured text data, knowledge extraction based on a large language model is adopted. On the basis of the above entity and relationship extraction, the association and fusion of multi-source crop breeding knowledge was realized based on entity mapping and specific attribute association. Finally, the crop trait regulatory gene knowledge graph dataset was formed, consisting of 13 entity datasets and 16 entity relationship datasets.

The crop trait regulating-gene knowledge graph dataset provides a key semantic model and an important data basis for crop breeding knowledge discovery, such as discovery of excellent pleiotropic genes, cross-species gene function prediction and discovery of potential pathway gene networks.
