100+ datasets found

Bioinformatic databases survey
zenodo.org
csv
Updated Aug 17, 2024
Cite
Alise Ponsero; Bonnie Hurwitz; Kiran Smelser; Karen Valencia; Lucas Jimenez Miranda; Abby McDermott (2024). Bioinformatic databases survey [Dataset]. http://doi.org/10.5281/zenodo.12790448
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.12790448
Dataset updated
Aug 17, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Alise Ponsero; Bonnie Hurwitz; Kiran Smelser; Karen Valencia; Lucas Jimenez Miranda; Abby McDermott
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Bioinformatic databases survey

The dataset surveys bioinformatic databases published in the NAR database issue from 1995 to 2022. It evaluates the current number of citations and the availability of each resource.

Data content

The dataset is composed of two tables :

A. Databases table : Contains the information of each database published in the NAR database issue.

db_id : Database ID in the dataset

resource_name : Name(s) of the database

current_access : Latest known web address of the database

is_a_pun : The database name is a play on words

available_2022 : The database was accessible online during the 2022 survey

last_accessible_year : If not accessible, latest point in time at which the database was found online (using Internet Archive snapshots)

unavailable_message : If not accessible, the message/error returned when trying to access the resource

year_first_publication : Year of first publication of the database

year_last_publication : Year of latest publication of the database (including database update publications)

total_citations_2022 : Cumulative number of citations for all articles of the database

nb_authors_max : Maximum number of authors on any article published for that database

nb_articles_2022 : Number of articles published for that database in 2022

B. Articles table : Contains the information collected for the NAR articles

collector : Person who added this database to the dataset

article_global_id : DOI of the article surveyed

db_id : Database ID of the resource described in the article

article_id : Article unique ID

article_year : Article publication year

Authors : List of authors of the article, separated by ";"

Author.ID : List of ORCIDs of the authors of the article, separated by ";"

Title : Title of the article

Source.title : Journal name

Volume : Volume number

Issue : Issue number

Funding.Details : Funding information of the article

Funding.Text : Funding text provided by the authors

PubMed.ID : Pubmed ID of the article

citations_2016 : Number of citations of the article in 2016 (if published)

citations_2022 : Number of citations of the article in 2022

nb_authors : Number of authors in the article

Index.Keywords : Keywords associated with the publication
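
The two tables share the db_id column, so article rows can be rolled up to the database level. The following is a minimal sketch using only the Python standard library. It assumes both tables are available as CSV with the column names listed above; the sample rows, the variable names and the helper functions are invented for illustration and are not taken from the dataset or its authors' code.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows; real files would be opened with open(path, newline="").
DATABASES_CSV = """db_id,resource_name,available_2022
1,ExampleDB,TRUE
2,DemoBase,FALSE
"""

ARTICLES_CSV = """article_id,db_id,article_year,Authors,citations_2022
a1,1,2005,"Doe J.; Roe R.",40
a2,1,2015,"Doe J.; Poe E.; Roe R.",10
a3,2,1999,"Moe M.",7
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def summarize(databases, articles):
    """Roll article rows up to one summary per db_id."""
    by_db = defaultdict(list)
    for art in articles:
        by_db[art["db_id"]].append(art)

    summary = {}
    for db in databases:
        arts = by_db.get(db["db_id"], [])
        # Authors are ";"-separated within a single cell.
        author_counts = [
            len([a for a in art["Authors"].split(";") if a.strip()])
            for art in arts
        ]
        summary[db["db_id"]] = {
            "resource_name": db["resource_name"],
            "total_citations": sum(int(a["citations_2022"]) for a in arts),
            "nb_authors_max": max(author_counts, default=0),
            "year_first_publication": min(
                (int(a["article_year"]) for a in arts), default=None
            ),
        }
    return summary


summary = summarize(read_rows(DATABASES_CSV), read_rows(ARTICLES_CSV))
for db_id, row in summary.items():
    print(db_id, row)
```

The per-database values computed here mirror the descriptions of total_citations_2022, nb_authors_max and year_first_publication, but whether the published columns were derived exactly this way is an assumption.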

Data sources

Note that the presented dataset leverages and expands on the dataset gathered and published in Imker, H.J., 2020. Who Bears the Burden of Long-Lived Molecular Biology Databases? Data Science Journal, 19(1), p.8. The original dataset collected by Dr. Imker is available at: https://doi.org/10.13012/B2IDB-4311325_V1

The dataset was collected and is maintained by undergraduate students of a CURE class (Course-based Undergraduate Research Experience) held at the University of Arizona. All students of the class participated in the collection, update and curation of the dataset, which is available as a database and a web portal at https://hurwitzlab.shinyapps.io/DS_Heroes/. Students could elect whether or not to be added as authors to this Zenodo repository.

The CURE class BAT102 "Data Science Heroes: An undergraduate research experience in Open Data Science Practices" gives the students an opportunity to learn about open science and investigate open data practices in bioinformatics through a survey of the databases published in the NAR database issue.
PROSITE profiles
ebi.ac.uk
Updated Feb 5, 2025
Cite
(2025). PROSITE profiles [Dataset]. https://www.ebi.ac.uk/interpro/
Explore at:
Dataset updated
Feb 5, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
PROSITE is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family a new sequence belongs. PROSITE is based at the Swiss Institute of Bioinformatics (SIB), Geneva, Switzerland.
Bioinformatics Links Directory
neuinfo.org
scicrunch.org
Updated Jan 29, 2022
Cite
(2022). Bioinformatics Links Directory [Dataset]. http://identifiers.org/RRID:SCR_008018
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_008018
Dataset updated
Jan 29, 2022
Description
Database of curated links to molecular resources, tools and databases selected on the basis of recommendations from bioinformatics experts in the field. This resource relies on input from its community of bioinformatics users for suggestions. Since 2003, it has also listed all links contained in the NAR Webserver issue. The different types of information available in this portal:
* Computer Related: This category contains links to resources relating to programming languages often used in bioinformatics. Other tools of the trade, such as web development and database resources, are also included here.
* Sequence Comparison: Tools and resources for the comparison of sequences including sequence similarity searching, alignment tools, and general comparative genomics resources.
* DNA: This category contains links to useful resources for DNA sequence analyses such as tools for comparative sequence analysis and sequence assembly. Links to programs for sequence manipulation, primer design, and sequence retrieval and submission are also listed here.
* Education: Links to information about the techniques, materials, people, places, and events of the greater bioinformatics community. Included are current news headlines, literature sources, educational material and links to bioinformatics courses and workshops.
* Expression: Links to tools for predicting the expression, alternative splicing, and regulation of a gene sequence are found here. This section also contains links to databases, methods, and analysis tools for protein expression, SAGE, EST, and microarray data.
* Human Genome: This section contains links to draft annotations of the human genome in addition to resources for sequence polymorphisms and genomics. Also included are links related to ethical discussions surrounding the study of the human genome.
* Literature: Links to resources related to published literature, including tools to search for articles and through literature abstracts. Additional text mining resources, open access resources, and literature goldmines are also listed.
* Model Organisms: Included in this category are links to resources for various model organisms ranging from mammals to microbes. These include databases and tools for genome scale analyses.
* Other Molecules: Bioinformatics tools related to molecules other than DNA, RNA, and protein. This category will include resources for the bioinformatics of small molecules as well as for other biopolymers including carbohydrates and metabolites.
* Protein: This category contains links to useful resources for protein sequence and structure analyses. Resources for phylogenetic analyses, prediction of protein features, and analyses of interactions are also found here.
* RNA: Resources include links to sequence retrieval programs, structure prediction and visualization tools, motif search programs, and information on various functional RNAs.
University of Pittsburgh Bioinformatics Resources Collection
rrid.site
Updated Aug 9, 2025
Cite
(2025). University of Pittsburgh Bioinformatics Resources Collection [Dataset]. http://identifiers.org/RRID:SCR_005845
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_005845
Dataset updated
Aug 9, 2025
Description
THIS RESOURCE IS NO LONGER IN SERVICE, documented August 23, 2016. To bridge the gap between the rising information needs of biological and medical researchers and the rapidly growing number of online bioinformatics resources, we have created the Online Bioinformatics Resources Collection (OBRC) at the Health Sciences Library System at the University of Pittsburgh. The OBRC, containing 1542 major online bioinformatics databases and software tools, was constructed using the HSLS content management system built on the Zope Web application server. To enhance the output of search results, we further implemented the Vivisimo Clustering Engine, which automatically organizes the search results into categories created dynamically based on the textual information of the retrieved records. As the largest online collection of its kind and the only one with advanced search results clustering, OBRC is aimed at becoming a one-stop guided information gateway to the major bioinformatics databases and software tools on the Web. OBRC is available at the University of Pittsburgh's Health Sciences Library System.
Table_4_Comprehensive Review of Web Servers and Bioinformatics Tools for...
datasetcatalog.nlm.nih.gov
frontiersin.figshare.com
Updated Feb 5, 2020
Cite
Zheng, Hong; Xie, Longxiang; Zhu, Wan; Dong, Huan; Guo, Xiangqian; Zhang, Lu; Li, Yongqiang; Yan, Zhongyi; Li, Huimin; Zhang, Guosen; Han, Yali; An, Yang; Wang, Qiang (2020). Table_4_Comprehensive Review of Web Servers and Bioinformatics Tools for Cancer Prognosis Analysis.XLSX [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000449982
Explore at:
Dataset updated
Feb 5, 2020
Authors
Zheng, Hong; Xie, Longxiang; Zhu, Wan; Dong, Huan; Guo, Xiangqian; Zhang, Lu; Li, Yongqiang; Yan, Zhongyi; Li, Huimin; Zhang, Guosen; Han, Yali; An, Yang; Wang, Qiang
Description
Prognostic biomarkers are of great significance for predicting the outcome of patients with cancer, guiding clinical treatment, elucidating tumorigenesis mechanisms, and identifying therapeutic targets. To screen and develop prognostic biomarkers, high-throughput profiling methods including gene microarray and next-generation sequencing have been widely applied with great success. However, due to the lack of independent validation, only very few prognostic biomarkers have been applied in clinical practice. In order to cross-validate the reliability of potential prognostic biomarkers, some groups have collected omics datasets (i.e., epigenetics/transcriptome/proteome) with related follow-up data (such as OS/DSS/PFS) of clinical samples from different cohorts, and developed easy-to-use online bioinformatics tools and web servers to assist biomarker screening and validation. These tools and web servers provide great convenience for the development of prognostic biomarkers, for the study of molecular mechanisms of tumorigenesis and progression, and even for the discovery of important therapeutic targets. To help researchers quickly learn and understand the function of these tools, the current review introduces the usage, characteristics and algorithms of tools and web servers such as LOGpc, KM plotter, GEPIA, TCPA, OncoLnc, PrognoScan, MethSurv, SurvExpress, UALCAN, etc., and helps researchers select the most suitable tools for their own research. In addition, all the tools introduced in this review can be reached at http://bioinfo.henu.edu.cn/WebServiceList.html.
HAMAP
ebi.ac.uk
Updated Feb 5, 2025
Cite
(2025). HAMAP [Dataset]. https://www.ebi.ac.uk/interpro/
Explore at:
Dataset updated
Feb 5, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
HAMAP stands for High-quality Automated and Manual Annotation of Proteins. HAMAP profiles are manually created by expert curators. They identify proteins that are part of well-conserved protein families or subfamilies. HAMAP is based at the SIB Swiss Institute of Bioinformatics, Geneva, Switzerland.
Bioinformatics Cloud Platform Report
archivemarketresearch.com
doc, pdf, ppt
Updated Jan 6, 2026
Cite
Archive Market Research (2026). Bioinformatics Cloud Platform Report [Dataset]. https://www.archivemarketresearch.com/reports/bioinformatics-cloud-platform-58815
Explore at:
Available download formats: ppt, doc, pdf
Dataset updated
Jan 6, 2026
Dataset authored and provided by
Archive Market Research
License
https://www.archivemarketresearch.com/privacy-policy
Time period covered
2026 - 2034
Area covered
Global
Variables measured
Market Size
Description
The Bioinformatics Cloud Platform market is booming, projected to reach $10 billion by 2033 with a 20% CAGR. Discover key trends, drivers, restraints, and leading companies shaping this rapidly evolving sector in genomics, drug discovery, and academic research. Learn more about SaaS, PaaS, and IaaS solutions.
FAIRsharing record for: Poxvirus Bioinformatics Resource Center
fairsharing.org
search.datacite.org
Updated Jan 4, 2017
Cite
(2017). FAIRsharing record for: Poxvirus Bioinformatics Resource Center [Dataset]. http://doi.org/10.25504/FAIRsharing.bn6jba
Explore at:
Unique identifier
https://doi.org/10.25504/FAIRsharing.bn6jba
Dataset updated
Jan 4, 2017
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
This FAIRsharing record describes: Poxvirus Bioinformatics Resource Center has been established to provide specialized web-based resources to the scientific community studying poxviruses. This resource is no longer being maintained. For tools and data supporting virus genomics, especially related to poxviruses and other large DNA viruses, please visit the Viral Bioinformatics site maintained by our collaborator, Chris Upton: http://virology.ca. For information on virus taxonomy, please visit the ICTV web site at http://www.ictvonline.org/. For updated sequence data and analytical tools, please visit http://www.viprbrc.org.
Extracted Schemas from the Life Sciences Linked Open Data Cloud
figshare.com
txt
Updated Jun 1, 2023
Cite
Maulik Kamdar (2023). Extracted Schemas from the Life Sciences Linked Open Data Cloud [Dataset]. http://doi.org/10.6084/m9.figshare.12402425.v2
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.12402425.v2
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Maulik Kamdar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is related to the manuscript "An empirical meta-analysis of the life sciences linked open data on the web" published in Nature Scientific Data. If you use the dataset, please cite the manuscript as follows:

Kamdar, M.R., Musen, M.A. An empirical meta-analysis of the life sciences linked open data on the web. Sci Data 8, 24 (2021). https://doi.org/10.1038/s41597-021-00797-y

We have extracted schemas from more than 80 publicly available biomedical linked data graphs in the Life Sciences Linked Open Data (LSLOD) cloud into an LSLOD schema graph and conducted an empirical meta-analysis to evaluate the extent of semantic heterogeneity across the LSLOD cloud. The dataset published here contains the following files:

- The set of Linked Data Graphs from the LSLOD cloud from which schemas are extracted.
- Refined sets of extracted classes, object properties, data properties, and datatypes, shared across the Linked Data Graphs on the LSLOD cloud. Where the schema element is reused from a Linked Open Vocabulary or an ontology, it is explicitly indicated.
- The LSLOD Schema Graph, which contains all the above extracted schema elements interlinked with each other based on the underlying content. Sample instances and sample assertions are also provided along with broad level characteristics of the modeled content.

The LSLOD Schema Graph is saved as a JSON Pickle File. To read the JSON object in this Pickle file, use the Python commands as follows:

    import pickle

    with open('LSLOD-Schema-Graph.json.pickle', 'rb') as infile:
        x = pickle.load(infile, encoding='iso-8859-1')

Check the Referenced Link for more details on this research, raw data files, and code references.
Predefined workflows in the ZBIT Bioinformatics Toolbox.
plos.figshare.com
xls
Updated Jun 1, 2023
Cite
Michael Römer; Johannes Eichner; Andreas Dräger; Clemens Wrzodek; Finja Wrzodek; Andreas Zell (2023). Predefined workflows in the ZBIT Bioinformatics Toolbox. [Dataset]. http://doi.org/10.1371/journal.pone.0149263.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0149263.t001
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Michael Römer; Johannes Eichner; Andreas Dräger; Clemens Wrzodek; Finja Wrzodek; Andreas Zell
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Predefined workflows in the ZBIT Bioinformatics Toolbox.
PATRIC: Bacterial Bioinformatics Resource Center
datacatalog.mskcc.org
Updated Nov 13, 2019
Cite
(2019). PATRIC: Bacterial Bioinformatics Resource Center [Dataset]. https://datacatalog.mskcc.org/dataset/10392
Explore at:
Dataset updated
Nov 13, 2019
Description
PATRIC (Pathosystems Resource Integration Center) is the Bacterial Bioinformatics Resource Center, an information system designed to support the biomedical research community’s work on bacterial infectious diseases via integration of vital pathogen information with rich data and analysis tools. PATRIC sharpens and hones the scope of available bacterial phylogenomic data from numerous sources specifically for the bacterial research community, in order to save biologists time and effort when conducting comparative analyses. The freely available PATRIC platform provides an interface for biologists to discover data and information and conduct comprehensive comparative genomics and other analyses in a one-stop shop.
Funding and Operating Organizations for Long-Lived Molecular Biology...
databank.illinois.edu
aws-databank-alb.library.illinois.edu
Cite
Heidi Imker, Funding and Operating Organizations for Long-Lived Molecular Biology Databases [Dataset]. http://doi.org/10.13012/B2IDB-3993338_V1
Explore at:
Unique identifier
https://doi.org/10.13012/B2IDB-3993338_V1
Authors
Heidi Imker
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The organizations that contribute to the longevity of 67 long-lived molecular biology databases published in Nucleic Acids Research (NAR) between 1991 and 2016 were identified to address two research questions: 1) which organizations fund these databases? and 2) which organizations maintain these databases? Funders were determined by examining funding acknowledgements in each database's most recent NAR Database Issue update article (published prior to 2017), and organizations operating the databases were determined through review of database websites.
Program and web site of bioinformatics used in this study.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Jul 26, 2021
Cite
Sakae, Kotaro; Furuhashi, Miyuna; Nagano, Keiji; Hasegawa, Yoshiaki (2021). Program and web site of bioinformatics used in this study. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000736824
Explore at:
Dataset updated
Jul 26, 2021
Authors
Sakae, Kotaro; Furuhashi, Miyuna; Nagano, Keiji; Hasegawa, Yoshiaki
Description
Program and web site of bioinformatics used in this study.
PBMC training data
figshare.com
hdf
Updated May 31, 2023
Cite
Kevin Menden (2023). PBMC training data [Dataset]. http://doi.org/10.6084/m9.figshare.8052221.v1
Explore at:
Available download formats: hdf
Unique identifier
https://doi.org/10.6084/m9.figshare.8052221.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Kevin Menden
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the PBMC training dataset used for training Scaden models to perform deconvolution on PBMC RNA-seq datasets. It is compiled from four different PBMC scRNA-seq datasets downloaded from the 10X Genomics website (donorA, donorC, data6k, data8k). The datasets downloaded from 10X Genomics were processed and used to generate the artificial bulk RNA-seq samples that make up this dataset. A link to the 10X Genomics datasets site is provided.
IDs in Bioscience (バイオサイエンスにおけるID)
figshare.com
pdf
Updated Jun 2, 2023
Cite
Toshiaki Katayama (2023). バイオサイエンスにおけるID [Dataset]. http://doi.org/10.6084/m9.figshare.6597509.v1
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.6084/m9.figshare.6597509.v1
Dataset updated
Jun 2, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Toshiaki Katayama
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Presentation slides on the identifiers in biosciences at the Japan Open Science Summit (JOSS) 2018.
Ensembl TSS dataset for GRCh38
zenodo.org
investigacion.ubu.es
bin
Updated Aug 26, 2024
Cite
José A. Barbero-Aparicio; Alicia Olivares-Gil; José F. Díez-Pastor; César García-Osorio (2024). Ensembl TSS dataset for GRCh38 [Dataset]. http://doi.org/10.5281/zenodo.7147597
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.7147597
Dataset updated
Aug 26, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
José A. Barbero-Aparicio; Alicia Olivares-Gil; José F. Díez-Pastor; César García-Osorio
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We used the human genome reference sequence in its GRCh38.p13 version in order to have a reliable source of data on which to carry out our experiments. We chose this version because it is the most recent one available in Ensembl at the moment. However, the DNA sequence by itself is not enough; the specific TSS position of each transcript is needed. In this section, we explain the steps followed to generate the final dataset. These steps are: raw data gathering, positive instances processing, negative instances generation and data splitting by chromosomes.

First, we need an interface in order to download the raw data, which is composed of every transcript sequence in the human genome. We used Ensembl release 104 (Howe et al., 2020) and its utility BioMart (Smedley et al., 2009), which allows us to get large amounts of data easily. It also enables us to select a wide variety of interesting fields, including the transcription start and end sites. After filtering out instances that present null values in any relevant field, this combination of the sequence and its flanks forms our raw dataset. Once the sequences are available, we find the TSS position (given by Ensembl) and the 2 following bases to treat it as a codon. After that, the 700 bases before this codon and the 300 bases after it are concatenated, giving the final sequence of 1003 nucleotides that is going to be used in our models. These specific window values have been used in (Bhandari et al., 2021) and we have kept them as we find them useful for comparison purposes. One of the most sensitive parts of this dataset is the generation of negative instances. We cannot get this kind of data in a straightforward manner, so we need to generate it synthetically. In order to get examples of negative instances, i.e. sequences that do not represent a transcript start site, we select random DNA positions inside the transcripts that do not correspond to a TSS. Once we have selected the specific position, we take the 700 bases before and the 300 bases after it, as we did with the positive instances.
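
The window arithmetic described above (700 upstream bases, the 3-base TSS codon, 300 downstream bases, 1003 in total) can be sketched as follows. This is an illustration only: it assumes a 0-based TSS index on a plain forward-strand string, the function and constant names are hypothetical, and it is not the code from the authors' repository.

```python
UPSTREAM = 700    # bases kept before the TSS codon
DOWNSTREAM = 300  # bases kept after the TSS codon
CODON = 3         # the TSS base plus the 2 following bases


def extract_window(sequence, tss_index):
    """Return the 1003-base window around a 0-based TSS index, or None
    if the sequence is too short to supply both full flanks."""
    start = tss_index - UPSTREAM
    end = tss_index + CODON + DOWNSTREAM
    if start < 0 or end > len(sequence):
        return None
    return sequence[start:end]


# Synthetic sequence: lowercase flanks with an uppercase marker at the TSS.
toy = "a" * 800 + "ATG" + "c" * 400
window = extract_window(toy, tss_index=800)
print(len(window), window[UPSTREAM:UPSTREAM + CODON])  # prints: 1003 ATG
```

Returning None for windows that would run off the sequence is a design choice of this sketch; the description does not say how such cases were handled.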

Regarding the positive to negative ratio, in a similar problem, but studying TIS instead of TSS (Zhang et al., 2017), a ratio of 10 negative instances to each positive one was found optimal. Following this idea, we select 10 random positions from the transcript sequence of each positive codon and label them as negative instances. After this process, we end up with 1,122,113 instances: 102,488 positive and 1,019,625 negative sequences. In order to validate and test our models, we need to split this dataset into three parts: train, validation and test. We have decided to make this differentiation by chromosomes, as it is done in (Perez-Rodriguez et al., 2020). Thus, we use chromosome 16 as validation because it is a good example of a chromosome with average characteristics. Then we selected samples from chromosomes 1, 3, 13, 19 and 21 to be part of the test set and used the rest of them to train our models. Every step of this process can be replicated using the scripts available in https://github.com/JoseBarbero/EnsemblTSSPrediction.
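
As a rough sketch of the two rules just described (10 negative positions per positive, and a chromosome-based split), the snippet below uses only the Python standard library. The function names are hypothetical, drawing distinct positions is an assumption, and assigning every sample of chromosomes 1, 3, 13, 19 and 21 to the test set is a simplification of "selected samples from" those chromosomes; the authors' actual scripts are in the repository linked above.

```python
import random

VALIDATION_CHROMS = {"16"}
TEST_CHROMS = {"1", "3", "13", "19", "21"}
NEGATIVES_PER_POSITIVE = 10


def assign_split(chromosome):
    """Map a chromosome name to 'validation', 'test' or 'train'."""
    if chromosome in VALIDATION_CHROMS:
        return "validation"
    if chromosome in TEST_CHROMS:
        return "test"
    return "train"


def sample_negative_positions(transcript_length, tss_index, rng,
                              n=NEGATIVES_PER_POSITIVE):
    """Draw up to n distinct positions in the transcript, excluding the TSS."""
    candidates = [p for p in range(transcript_length) if p != tss_index]
    return rng.sample(candidates, min(n, len(candidates)))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
negatives = sample_negative_positions(transcript_length=5000, tss_index=0, rng=rng)
print(len(negatives), assign_split("16"), assign_split("19"), assign_split("X"))
# prints: 10 validation test train
```

Splitting by chromosome rather than by random row keeps overlapping windows from the same genomic region out of both the training and evaluation sets.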
NCBIFAM
ebi.ac.uk
Updated Aug 6, 2025
Cite
(2025). NCBIFAM [Dataset]. https://www.ebi.ac.uk/interpro/
Explore at:
Dataset updated
Aug 6, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
NCBIfam is a collection of protein families, featuring curated multiple sequence alignments, hidden Markov models (HMMs) and annotation, which provides a tool for identifying functionally related proteins based on sequence homology. NCBIfam is maintained at the National Center for Biotechnology Information (Bethesda, MD). NCBIfam includes models from TIGRFAMs, another database of protein families developed at The Institute for Genomic Research, then at the J. Craig Venter Institute (Rockville, MD, US).
Data from: Getting the best of Linked Data and Property Graphs: rdf2neo and...
swat4hcls.figshare.com
png
Updated Dec 5, 2018
Cite
Marco Brandizi; Ajit Singh; Christopher Rawlings; Keywan Hassani-Pak (2018). Getting the best of Linked Data and Property Graphs: rdf2neo and the KnetMiner Use Case [Dataset]. http://doi.org/10.6084/m9.figshare.7314323.v1
Explore at:
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.7314323.v1
Dataset updated
Dec 5, 2018
Dataset provided by
Semantic Web Applications and Tools for Healthcare and Life Sciences
Authors
Marco Brandizi; Ajit Singh; Christopher Rawlings; Keywan Hassani-Pak
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Paper submitted to SWAT4LS 2018. We introduce rdf2neo, a tool to populate Neo4j databases starting from RDF data sets, based on a configurable mapping between the two. By employing agrigenomics-related real use cases, we show how such mapping can allow for a hybrid approach to the management of networked knowledge, based on taking advantage of the best of both RDF and property graphs.
bioinformatics.com.cn Website Traffic, Ranking, Analytics [December 2025]
sr01.toolswala.net
Updated Jan 13, 2026
Cite
Semrush (2026). bioinformatics.com.cn Website Traffic, Ranking, Analytics [December 2025] [Dataset]. https://sr01.toolswala.net/_www/website/bioinformatics.com.cn/overview/
Explore at:
Dataset updated
Jan 13, 2026
Dataset authored and provided by
Semrush (https://fr.semrush.com/)
License
https://sr01.toolswala.net/_www/company/legal/terms-of-service/
Time period covered
Jan 13, 2026
Area covered
Worldwide
Variables measured
visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
Measurement technique
Semrush Traffic Analytics; Click-stream data
Description
bioinformatics.com.cn is ranked #4532 in CN with 113.62K Traffic. Learn more about website traffic, market share, and more!
Additional file 1 of INSaFLU-TELEVIR: an open web-based bioinformatics suite...
datasetcatalog.nlm.nih.gov
Updated Aug 15, 2024
Cite
Horton, Daniel L.; Santos, João Dourado; Pinheiro, Miguel; Santos, André; Pinto, Miguel; Bogaardt, Carlijn; Mamede, Rafael; Gomes, João Paulo; Borges, Vítor; Isidro, Joana; Sobral, Daniel; Eusébio, Rodrigo (2024). Additional file 1 of INSaFLU-TELEVIR: an open web-based bioinformatics suite for viral metagenomic detection and routine genomic surveillance [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001405514
Explore at:
Dataset updated
Aug 15, 2024
Authors
Horton, Daniel L.; Santos, João Dourado; Pinheiro, Miguel; Santos, André; Pinto, Miguel; Bogaardt, Carlijn; Mamede, Rafael; Gomes, João Paulo; Borges, Vítor; Isidro, Joana; Sobral, Daniel; Eusébio, Rodrigo
Description
Additional file 1. Benchmark of the INSaFLU-TELEVIR pipeline for virus detection (TELEVIR): Resources, Workflow details, Benchmark and Implementation. Additional file 2. Benchmarking of INSaFLU against commonly used command line bioinformatics workflows for SARS-CoV-2 reference-based consensus generation (amplicon-based Illumina and ONT data), and validation of the INSaFLU snakemake pipeline. Additional file 3: Supplementary figures 1-8. Additional file 4: Supplementary tables 1-8.

Facebook

Twitter

Click to copy link

Link copied

Cite

Alise Ponsero; Alise Ponsero; Bonnie Hurwitz; Bonnie Hurwitz; Kiran Smelser; Kiran Smelser; Karen Valencia; Lucas Jimenez Miranda; Lucas Jimenez Miranda; Abby McDermott; Karen Valencia; Abby McDermott (2024). Bioinformatic databases survey [Dataset]. http://doi.org/10.5281/zenodo.12790448

Bioinformatic databases survey

Explore at:

csvAvailable download formats

Unique identifier

https://doi.org/10.5281/zenodo.12790448

Dataset updated

Aug 17, 2024

Dataset provided by

Zenodohttp://zenodo.org/

Authors

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Bioinformatic databases survey

The dataset surveys bioinformatic databases published in the NAR database issue from 1995 to 2022. It evaluates the current number of citations and availability of each ressources.

Data content

The dataset is composed of two tables:

A. Databases table : Contains information on each database published in the NAR database issue.

db_id : Database ID in the dataset
resource_name : Name(s) of the database
current_access : Latest known web address of the database
is_a_pun : The database name is a play on words
available_2022 : The database was accessible online during the 2022 survey
last_accessible_year : If not accessible, latest point in time where the database was found online (using the Internet web archive snapshots)
unavailable_message : If not accessible, the message/error returned when trying to access the resource
year_first_publication : Year of first publication of the database
year_last_publication : Year of latest publication of the database (including database update publications)
total_citations_2022 : Cumulative number of citations for all articles of the database
nb_authors_max : Maximum number of authors associated with any article published for that database
nb_articles_2022 : Number of articles published for that database in 2022

B. Articles table : Contains the information collected for the NAR articles

collector : Person who added this database to the dataset
article_global_id : DOI of the article surveyed
db_id : Database ID of the resource described in the article
article_id : Article unique ID
article_year : Article publication year
Authors : List of authors of the article, separated by ";"
Author.ID : List of ORCID iDs of the authors of the article, separated by ";"
Title : Title of the article
Source.title : Journal name
Volume : Volume number
Issue : Issue number
Funding.Details : Funding information of the article
Funding.Text : Funding text provided by the authors
PubMed.ID : PubMed ID of the article
citations_2016 : Number of citations of the article in 2016 (if published)
citations_2022 : Number of citations of the article in 2022
nb_authors : Number of authors of the article
Index.Keywords : Keywords associated with the publication
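
The two tables share the db_id column, so article-level fields can be rolled up per database. The following is a minimal sketch using only the Python standard library, assuming each table is exported as a CSV file with the column names listed above. The inline rows are made-up placeholders, not survey data, and the handling of empty cells is an assumption about the file format. It sums citations_2022 per db_id, which may or may not match total_citations_2022 exactly, and splits the ";"-separated Authors field.

```python
import csv
import io
from collections import defaultdict


def read_table(handle):
    """Read one CSV table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def citations_per_database(articles):
    """Sum citations_2022 over all articles that share a db_id."""
    totals = defaultdict(int)
    for row in articles:
        value = (row.get("citations_2022") or "").strip()
        if value:  # assumption: missing counts are empty cells; skip them
            totals[row["db_id"]] += int(float(value))
    return dict(totals)


def split_authors(row):
    """Split the ';'-separated Authors field into a list of names."""
    return [name.strip() for name in row["Authors"].split(";") if name.strip()]


# Made-up placeholder rows (NOT real survey data), using documented column names.
DATABASES_CSV = """db_id,resource_name
1,ExampleDB
2,OtherDB
"""
ARTICLES_CSV = """article_id,db_id,Authors,citations_2022
a1,1,"Doe, J.; Roe, R.",10
a2,1,"Doe, J.",5
a3,2,"Poe, E.",
"""

databases = read_table(io.StringIO(DATABASES_CSV))
articles = read_table(io.StringIO(ARTICLES_CSV))
totals = citations_per_database(articles)

# Join the per-database totals back onto the Databases table via db_id.
for db in databases:
    print(db["resource_name"], totals.get(db["db_id"], 0))
```

With the real files, the io.StringIO objects would be replaced by open(...) calls on the downloaded CSV files, whose names are not given here.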

Data sources

Note that the presented dataset leverages and expands on the dataset gathered and published in Imker, H.J., 2020. Who Bears the Burden of Long-Lived Molecular Biology Databases? Data Science Journal, 19(1), p.8. The original dataset collected by Dr. Imker is available at: https://doi.org/10.13012/B2IDB-4311325_V1

The dataset was collected and is maintained by undergraduate students of a CURE class (Course-based Undergraduate Research Experience) held at the University of Arizona. All students of the class participated in the collection, update, and curation of the dataset, which is available as a database and a web portal at https://hurwitzlab.shinyapps.io/DS_Heroes/. Students could choose whether or not to be listed as authors of this Zenodo repository.

The CURE class BAT102 "Data Science Heroes: An undergraduate research experience in Open Data Science Practices" gives the students an opportunity to learn about open science and investigate open data practices in bioinformatics through a survey of the databases published in the NAR database issue.

PROSITE profiles

Bioinformatics Links Directory

University of Pittsburgh Bioinformatics Resources Collection

Table_4_Comprehensive Review of Web Servers and Bioinformatics Tools for...

HAMAP

Bioinformatics Cloud Platform Report

FAIRsharing record for: Poxvirus Bioinformatics Resource Center

Extracted Schemas from the Life Sciences Linked Open Data Cloud

Predefined workflows in the ZBIT Bioinformatics Toolbox.

PATRIC: Bacterial Bioinformatics Resource Center

Funding and Operating Organizations for Long-Lived Molecular Biology...

Program and web site of bioinformatics used in this study.

PBMC training data

バイオサイエンスにおけるID (IDs in Bioscience)

Ensembl TSS dataset for GRCh38

NCBIFAM

Data from: Getting the best of Linked Data and Property Graphs: rdf2neo and...

bioinformatics.com.cn Website Traffic, Ranking, Analytics [December 2025]

Additional file 1 of INSaFLU-TELEVIR: an open web-based bioinformatics suite...
