100+ datasets found

Molecular Biology Databases Published in Nucleic Acids Research between...
databank.illinois.edu
Updated Feb 1, 2024
Cite
Heidi Imker (2024). Molecular Biology Databases Published in Nucleic Acids Research between 1991-2016 [Dataset]. http://doi.org/10.13012/B2IDB-4311325_V1
Explore at:
Unique identifier
https://doi.org/10.13012/B2IDB-4311325_V1
Dataset updated
Feb 1, 2024
Authors
Heidi Imker
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This dataset was developed to create a census of sufficiently documented molecular biology databases to answer several preliminary research questions. Articles published in the annual Nucleic Acids Research (NAR) “Database Issues” were used to identify a population of databases for study. Namely, the questions addressed herein include: 1) what is the historical rate of database proliferation versus rate of database attrition?, 2) to what extent do citations indicate persistence?, and 3) are databases under active maintenance and does evidence of maintenance likewise correlate to citation? An overarching goal of this study is to provide the ability to identify subsets of databases for further analysis, both as presented within this study and through subsequent use of this openly released dataset.
Fantastic databases and where to find them: Web applications for researchers...
scielo.figshare.com
jpeg
Updated Jun 3, 2023
Cite
Gerda Cristal Villalba; Ursula Matte (2023). Fantastic databases and where to find them: Web applications for researchers in a rush [Dataset]. http://doi.org/10.6084/m9.figshare.20018091.v1
Explore at:
Available download formats: jpeg
Unique identifier
https://doi.org/10.6084/m9.figshare.20018091.v1
Dataset updated
Jun 3, 2023
Dataset provided by
SciELO http://www.scielo.org/
Authors
Gerda Cristal Villalba; Ursula Matte
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract: Public databases are essential to the development of multi-omics resources. The amount of data created by biological technologies needs a systematic and organized form of storage that can be quickly accessed and managed. This is the objective of a biological database. Here, we present an overview of human databases with web applications. The databases and tools allow the search of biological sequences, genes and genomes, gene expression patterns, epigenetic variation, protein-protein interactions, variant frequency, regulatory elements, and comparative analysis between human and model organisms. Our goal is to provide an opportunity for exploring large datasets and analyzing the data for users with little or no programming skills. Public user-friendly web-based databases facilitate data mining and the search for information applicable to healthcare professionals. In addition, biological databases are essential to improve biomedical search sensitivity and efficiency and to merge the multiple datasets needed to share data and build global initiatives for the diagnosis, prognosis, and discovery of new treatments for genetic diseases. To show the databases at work, we present a case study using ACE2 as an example of a gene to be investigated. The analysis and the complete list of databases are available on the following website .
List of bioinformatics tools and databases students used.
plos.figshare.com
xls
Updated Jun 1, 2023
Cite
João Carlos Sousa; Manuel João Costa; Joana Almeida Palha (2023). List of bioinformatics tools and databases students used. [Dataset]. http://doi.org/10.1371/journal.pone.0000481.t002
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0000481.t002
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS http://plos.org/
Authors
João Carlos Sousa; Manuel João Costa; Joana Almeida Palha
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
List of bioinformatics tools and databases students used.
Alternative Splicing Annotation Project II Database
dknet.org
scicrunch.org
+2more
Updated Jan 29, 2022
Cite
(2022). Alternative Splicing Annotation Project II Database [Dataset]. http://identifiers.org/RRID:SCR_000322
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_000322
Dataset updated
Jan 29, 2022
Description
THIS RESOURCE IS NO LONGER IN SERVICE, documented on 8/12/13. An expanded version of the Alternative Splicing Annotation Project (ASAP) database with a new interface and integration of comparative features using UCSC BLASTZ multiple alignments. It supports 9 vertebrate species, 4 insects, and nematodes, and provides extensive alternative splicing analysis and splicing variants. For human alternative splicing data, newly added EST libraries were classified and included in the previous tissue and cancer classification, and the lists of tissue- and cancer- (vs. normal) specific alternatively spliced genes were recalculated and updated. The authors created novel orthologous exon and intron databases, with their splice variants, based on multiple alignment among several species. These orthologous exon and intron databases can give more comprehensive homologous gene information than protein-similarity-based methods. Furthermore, splice junction and exon identity among species can be valuable resources for elucidating species-specific genes. The ASAP II database can be easily integrated with pygr (unpublished, the Python Graph Database Framework for Bioinformatics) and its powerful features, such as graph query and multi-genome alignment query. ASAP II can be searched by several different criteria, such as gene symbol, gene name, and ID (UniGene, GenBank, etc.). The web interface provides 7 different kinds of views: (I) user query, UniGene annotation, orthologous genes and genome browsers; (II) genome alignment; (III) exons and orthologous exons; (IV) introns and orthologous introns; (V) alternative splicing; (VI) isoform and protein sequences; (VII) tissue and cancer vs. normal specificity. ASAP II shows genome alignments of isoforms, exons, and introns in a UCSC-like genome browser. All alternative splicing relationships with supporting evidence information, types of alternative splicing patterns, and inclusion rates for skipped exons are listed in separate tables.
Users can also search human data for tissue- and cancer-specific splice forms at the bottom of the gene summary page. The p-values for tissue specificity are reported as log-odds (LOD) scores, and results with LOD >= 3 and at least 3 EST sequences are highlighted.
Funding and Operating Organizations for Long-Lived Molecular Biology...
databank.illinois.edu
aws-databank-alb.library.illinois.edu
Cite
Heidi Imker, Funding and Operating Organizations for Long-Lived Molecular Biology Databases [Dataset]. http://doi.org/10.13012/B2IDB-3993338_V1
Explore at:
Unique identifier
https://doi.org/10.13012/B2IDB-3993338_V1
Authors
Heidi Imker
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The organizations that contribute to the longevity of 67 long-lived molecular biology databases published in Nucleic Acids Research (NAR) between 1991-2016 were identified to address two research questions: 1) which organizations fund these databases? and 2) which organizations maintain these databases? Funders were determined by examining funding acknowledgements in each database's most recent NAR Database Issue update article (published prior to 2017), and the organizations operating the databases were determined through review of database websites.
3D-Genomics Database
dknet.org
scicrunch.org
+2more
Updated Jan 29, 2022
Cite
(2022). 3D-Genomics Database [Dataset]. http://identifiers.org/RRID:SCR_007430
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_007430
Dataset updated
Jan 29, 2022
Description
THIS RESOURCE IS NO LONGER IN SERVICE, documented August 29, 2016. Database containing structural annotations for the proteomes of just under 100 organisms. Using data derived from public databases of translated genomic sequences, representatives from the major branches of Life are included: Prokaryota, Eukaryota and Archaea. The annotations stored in the database may be accessed in a number of ways. The help page provides information on how to access the database. 3D-GENOMICS is now part of a larger project, called e-Protein. The project brings together similar databases at three sites: Imperial College London, University College London and the European Bioinformatics Institute. e-Protein's mission statement is: "To provide a fully automated distributed pipeline for large-scale structural and functional annotation of all major proteomes via the use of cutting-edge computer GRID technologies." The following databases are incorporated: NRprot, SCOP, ASTRAL, PFAM, Prosite, taxonomy, COG. The following eukaryotic genomes are incorporated: Anopheles gambiae, protein sequences from the mosquito genome; Arabidopsis thaliana, protein sequences from the Arabidopsis genome; Caenorhabditis briggsae, protein sequences from the C.briggsae genome; Caenorhabditis elegans, protein sequences from the worm genome; Ciona intestinalis, protein sequences from the sea squirt genome; Danio rerio, protein sequences from the zebrafish genome; Drosophila melanogaster, protein sequences from the fruitfly genome; Encephalitozoon cuniculi, protein sequences from the E.cuniculi genome; Fugu rubripes, protein sequences from the pufferfish genome; Guillardia theta, protein sequences from the G.theta genome; Homo sapiens, protein sequences from the human genome; Mus musculus, protein sequences from the mouse genome; Neurospora crassa, protein sequences from the N.crassa genome; Oryza sativa, protein sequences from the rice genome; Plasmodium falciparum, protein sequences from the P.falciparum genome; Rattus norvegicus, protein sequences from the rat genome; Saccharomyces cerevisiae, protein sequences from the yeast genome; Schizosaccharomyces pombe, protein sequences from the yeast genome.
Bioinformatics Links Directory
neuinfo.org
scicrunch.org
+3more
Updated Jan 29, 2022
+ more versions
Cite
(2022). Bioinformatics Links Directory [Dataset]. http://identifiers.org/RRID:SCR_008018
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_008018
Dataset updated
Jan 29, 2022
Description
Database of curated links to molecular resources, tools and databases selected on the basis of recommendations from bioinformatics experts in the field. This resource relies on input from its community of bioinformatics users for suggestions. Since 2003, it has also listed all links contained in the NAR Webserver issue. The different types of information available in this portal:
* Computer Related: This category contains links to resources relating to programming languages often used in bioinformatics. Other tools of the trade, such as web development and database resources, are also included here.
* Sequence Comparison: Tools and resources for the comparison of sequences including sequence similarity searching, alignment tools, and general comparative genomics resources.
* DNA: This category contains links to useful resources for DNA sequence analyses such as tools for comparative sequence analysis and sequence assembly. Links to programs for sequence manipulation, primer design, and sequence retrieval and submission are also listed here.
* Education: Links to information about the techniques, materials, people, places, and events of the greater bioinformatics community. Included are current news headlines, literature sources, educational material and links to bioinformatics courses and workshops.
* Expression: Links to tools for predicting the expression, alternative splicing, and regulation of a gene sequence are found here. This section also contains links to databases, methods, and analysis tools for protein expression, SAGE, EST, and microarray data.
* Human Genome: This section contains links to draft annotations of the human genome in addition to resources for sequence polymorphisms and genomics. Also included are links related to ethical discussions surrounding the study of the human genome.
* Literature: Links to resources related to published literature, including tools to search for articles and through literature abstracts. Additional text mining resources, open access resources, and literature goldmines are also listed.
* Model Organisms: Included in this category are links to resources for various model organisms ranging from mammals to microbes. These include databases and tools for genome scale analyses.
* Other Molecules: Bioinformatics tools related to molecules other than DNA, RNA, and protein. This category will include resources for the bioinformatics of small molecules as well as for other biopolymers including carbohydrates and metabolites.
* Protein: This category contains links to useful resources for protein sequence and structure analyses. Resources for phylogenetic analyses, prediction of protein features, and analyses of interactions are also found here.
* RNA: Resources include links to sequence retrieval programs, structure prediction and visualization tools, motif search programs, and information on various functional RNAs.
NCBI Nt (Nucleotide) database FASTA file from 2017-10-26
zenodo.org
application/gzip
Updated Dec 23, 2020
Cite
James Fellows Yates (2020). NCBI Nt (Nucleotide) database FASTA file from 2017-10-26 [Dataset]. http://doi.org/10.5281/zenodo.4382154
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.4382154
Dataset updated
Dec 23, 2020
Dataset provided by
Zenodo http://zenodo.org/
Authors
James Fellows Yates
License
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Description
This FASTA file is the NCBI Nt (Nucleotide) database (public domain) used for holistic metagenomic screening of ancient DNA data at the Department of Archaeogenetics at the Max Planck Institute for the Science of Human History. We offer here the FASTA file used to construct MALT databases (https://uni-tuebingen.de/fakultaeten/mathematisch-naturwissenschaftliche-fakultaet/fachbereiche/informatik/lehrstuehle/algorithms-in-bioinformatics/software/malt/), which are generally too large for uploading. Please see the relevant publications that use the database for the MALT database construction commands.

NCBI does not retain older versions of this database which is why this has been uploaded here. It was downloaded on 2017-10-26 12:39 from: ftp://ftp-trace.ncbi.nih.gov/blast/db/FASTA/nt.gz. The NCBI Nt database is released into the public domain as per https://www.ncbi.nlm.nih.gov/home/about/policies/.
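Because the archive is a single gzipped FASTA file, it can be inspected without decompressing it to disk. The following is a minimal sketch, assuming a local copy named `nt.gz` (the file name from the download URL above); the function name `iter_fasta` is hypothetical and is not part of MALT or any NCBI tooling.

```python
import gzip
from typing import Iterator, Tuple


def iter_fasta(path: str) -> Iterator[Tuple[str, int]]:
    """Yield (header, sequence_length) for each record in a gzipped FASTA file.

    The file is streamed line by line, so memory use stays small even for
    very large archives. `path` is an assumed local file name.
    """
    header = None
    length = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield header, length
                header, length = line[1:], 0
            elif header is not None:
                length += len(line)
    # Emit the final record after the file ends.
    if header is not None:
        yield header, length
```

For example, `next(iter_fasta("nt.gz"))` would return the first header and its sequence length, provided the file exists locally under that assumed name.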
Bioinformatic Harvester IV (beta) at Karlsruhe Institute of Technology
neuinfo.org
Updated Jan 29, 2022
Cite
(2022). Bioinformatic Harvester IV (beta) at Karlsruhe Institute of Technology [Dataset]. http://identifiers.org/RRID:SCR_008017
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_008017
Dataset updated
Jan 29, 2022
Description
Harvester is a Web-based tool that bulk-collects bioinformatic data on human proteins from various databases and prediction servers. It is a meta search engine for gene and protein information. It searches 16 major databases and prediction servers and combines the results on pregenerated HTML pages. In this way Harvester can provide comprehensive gene-protein information from different servers in a convenient and fast manner. As a full-text meta search engine, similar to Google™, Harvester allows screening of the whole genome proteome for current protein functions and predictions in a few seconds. With Harvester it is now possible to compare and check the quality of different database entries and prediction algorithms on a single page. Sponsors: This work has been supported by the BMBF with grants 01GR0101 and 01KW0013.
ASURAT knowledge-based databases
figshare.com
application/gzip
Updated May 9, 2022
Cite
Keita Iida (2022). ASURAT knowledge-based databases [Dataset]. http://doi.org/10.6084/m9.figshare.19102598.v5
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.19102598.v5
Dataset updated
May 9, 2022
Dataset provided by
Figshare http://figshare.com/
Authors
Keita Iida
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Knowledge-based databases and the codes for collecting these databases are stored.
DAVID
neuinfo.org
dknet.org
+1more
Updated Aug 17, 2024
Cite
(2024). DAVID [Dataset]. http://identifiers.org/RRID:SCR_001881
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_001881
Dataset updated
Aug 17, 2024
Description
Bioinformatics resource system including a web server and web service for functional annotation and enrichment analyses of gene lists. Consists of a comprehensive knowledgebase and a set of functional analysis tools. Includes a gene-centered database integrating heterogeneous gene annotation resources to facilitate high-throughput gene functional analysis. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
uniprot-database_(type_ko).27.09.2019.tab.rar
figshare.com
application/x-rar
Updated Jun 24, 2020
+ more versions
Cite
Daniel Kumazawa Morais (2020). uniprot-database_(type_ko).27.09.2019.tab.rar [Dataset]. http://doi.org/10.6084/m9.figshare.12555422.v1
Explore at:
Available download formats: application/x-rar
Unique identifier
https://doi.org/10.6084/m9.figshare.12555422.v1
Dataset updated
Jun 24, 2020
Dataset provided by
Figshare http://figshare.com/
Authors
Daniel Kumazawa Morais
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The current database was downloaded on 27.09.2019 and has the data fields (columns) described below:
# 1 Entry
# 2 Entry name
# 3 Status
# 4 Protein names
# 5 Gene names
# 6 Organism
# 7 Length
# 8 Cross-reference (KO)
# 9 Taxonomic lineage (PHYLUM)
# 10 Taxonomic lineage (SPECIES) # This field carries current and old* taxonomic classifications.
# 11 Taxonomic lineage (GENUS)
# 12 Taxonomic lineage (KINGDOM)
# 13 Taxonomic lineage (SUPERKINGDOM)
# 14 Cross-reference (OrthoDB)
# 15 Cross-reference (eggNOG)
*Details about the classification used in UNIPROT can be found at the link: https://www.uniprot.org/help/taxonomy
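A table with these columns can be read with Python's standard `csv` module once the `.rar` archive has been unpacked. The sketch below rests on several assumptions that the description does not confirm: that the file is tab-separated, that its first row holds the column names exactly as listed, and that multiple KO identifiers in one cell are separated by semicolons. The function names are hypothetical.

```python
import csv
from collections import Counter
from typing import Dict, Iterable, Iterator

# Column name taken from the field list above; assumed to match the header row.
KO_COLUMN = "Cross-reference (KO)"


def read_rows(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Parse tab-separated lines into dictionaries keyed by column name."""
    # QUOTE_NONE keeps stray quote characters in protein names from
    # being interpreted as field delimiters.
    return csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)


def count_ko(rows: Iterable[Dict[str, str]]) -> Counter:
    """Count how many entries carry each KO identifier."""
    counts: Counter = Counter()
    for row in rows:
        # Assumption: several KO ids in one cell are separated by ';'.
        for ko in (row.get(KO_COLUMN) or "").split(";"):
            ko = ko.strip()
            if ko:
                counts[ko] += 1
    return counts
```

With an open file handle `fh` on the unpacked table, `count_ko(read_rows(fh)).most_common(10)` would list the ten most frequent KO identifiers under these assumptions.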

Databases for MyCodentifier: A tool for routine identification of...
zenodo.org
data.niaid.nih.gov
application/gzip
Updated Dec 9, 2022
Cite
Jodie A. Schildkraut; Jordy P.M. Coolen; Heleen Severin; Ellen Koenraad; Nicole Aalders; Willem J.G. Melchers; Wouter Hoefsloot; Heiman F.L. Wertheim; Jakko van Ingen (2022). Databases for MyCodentifier: A tool for routine identification of nontuberculous mycobacteria using MGIT enriched shotgun metagenomics. [Dataset]. http://doi.org/10.5281/zenodo.7396289
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.7396289
Dataset updated
Dec 9, 2022
Dataset provided by
Zenodo http://zenodo.org/
Authors
Jodie A. Schildkraut; Jordy P.M. Coolen; Heleen Severin; Ellen Koenraad; Nicole Aalders; Willem J.G. Melchers; Wouter Hoefsloot; Heiman F.L. Wertheim; Jakko van Ingen
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description

Databases used for MyCodentifier, a Nextflow pipeline to identify Mycobacterium tuberculosis complex (MTBC) and nontuberculous mycobacteria (NTM) species from next-generation sequencing (NGS) data.

Short description:
The pipeline is built with Nextflow as the workflow manager and runs in a Docker container. It can identify MTBC/NTM species from positive Mycobacterial Growth Indicator Tube (MGIT) cultures. To do so, it uses an hsp65 database for fast identification, coupled with a metagenomic method using centrifuge for identification at the genome level. For TB it can also identify subspecies. Results are presented in automated PDF and HTML reports.

**Databases**
Name	Short Description
20220726_ref.tar.gz	7 major mycobacterial genomes as centrifuge classification database, used for reference-based mapping and genotype resistance prediction
20220726_wgs_centrifuge_db_Radboudumc_MB.tar.gz	centrifuge classification database using Tortoli et al 2017 Mycobacterium strains + additional strains
genomes.tar.gz	7 major mycobacterial genomes, annotation and Genbank files. Files are paired with 20220726_ref.tar.gz
snpEff.tar.gz	7 major mycobacterial genomes annotation models for snpEff.
Tortoli_etal_hsp65.tar.gz	KMA database of hsp65 gene extractions of the Tortoli et al 2017 Mycobacterium strains.
Used in the study: p_compressed+h+v.tar.gz (12/06/2016)	Databases available via ftp://ftp.ccb.jhu.edu/pub/infphilo/centrifuge/data or https://ccb.jhu.edu/software/centrifuge/manual.shtml#custom-database

MyCodentifier Github:

https://jordycoolen.github.io/MyCodentifier/

Bioinformatics Goes to School—New Avenues for Teaching Contemporary Biology
plos.figshare.com
doc
Updated Jun 4, 2023
Cite
Louisa Wood; Philipp Gebhardt (2023). Bioinformatics Goes to School—New Avenues for Teaching Contemporary Biology [Dataset]. http://doi.org/10.1371/journal.pcbi.1003089
Explore at:
Available download formats: doc
Unique identifier
https://doi.org/10.1371/journal.pcbi.1003089
Dataset updated
Jun 4, 2023
Dataset provided by
PLOS http://plos.org/
Authors
Louisa Wood; Philipp Gebhardt
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Since 2010, the European Molecular Biology Laboratory's (EMBL) Heidelberg laboratory and the European Bioinformatics Institute (EMBL-EBI) have jointly run bioinformatics training courses developed specifically for secondary school science teachers within Europe and EMBL member states. These courses focus on introducing bioinformatics, databases, and data-intensive biology, allowing participants to explore resources and providing classroom-ready materials to support them in sharing this new knowledge with their students. In this article, we chart our progress made in creating and running three bioinformatics training courses, including how the course resources are received by participants and how these, and bioinformatics in general, are subsequently used in the classroom. We assess the strengths and challenges of our approach, and share what we have learned through our interactions with European science teachers.
Bioinformatics Market size was USD 12.76 Billion in 2022!
cognitivemarketresearch.com
pdf,excel,csv,ppt
Cite
Cognitive Market Research, Bioinformatics Market size was USD 12.76 Billion in 2022! [Dataset]. https://www.cognitivemarketresearch.com/bioinformatics-market-report
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
The global bioinformatics market size was USD 12.76 billion in 2022 and is forecast to reach USD 29.32 billion by 2030. The industry's compound annual growth rate (CAGR) is projected at 10.4% from 2023 to 2030.

What are the driving factors for the Bioinformatics market?

The primary factors propelling the global bioinformatics industry are advances in genomics, rising demand for protein sequencing, and rising public-private sector investment in bioinformatics. Large volumes of data are being produced by the expanding use of next-generation sequencing (NGS) and other genomic technologies; these data must be analyzed using advanced bioinformatics tools. Furthermore, the global bioinformatics industry may benefit from the development of emerging advanced technologies. However, the bioinformatics discipline contains intricate algorithms and massive amounts of data, which can be difficult for researchers and demand a lot of processing power.

What is Bioinformatics?

Bioinformatics is related to genetics and genomics, which involves the use of computer technology to store, collect, analyze, and disseminate biological information, and data, such as DNA and amino acid sequences or annotations about these sequences. Researchers and medical professionals use databases that organize and index this biological data to better understand health and disease, and in some circumstances, as a component of patient care. Through the creation of software and algorithms, bioinformatics is primarily used to extract knowledge from biological data. Bioinformatics is frequently used in the analysis of genomics, proteomics, 3D protein structure modeling, image analysis, drug creation, and many other fields.
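
The market figures quoted in this description can be cross-checked with the standard compound-growth formula. The sketch below assumes eight annual compounding periods between the 2022 base value and the 2030 forecast; the report itself does not state its base year or compounding convention, so this is an illustration and not a reconstruction of the publisher's method. The function names are hypothetical.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the constant annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value: float, rate: float, years: int) -> float:
    """Return start_value compounded at `rate` for `years` periods."""
    return start_value * (1.0 + rate) ** years


# Figures quoted in the description (USD billion).
SIZE_2022 = 12.76
FORECAST_2030 = 29.32
QUOTED_CAGR = 0.104
YEARS = 8  # assumption: 2022 -> 2030 treated as eight compounding periods

if __name__ == "__main__":
    rate = implied_cagr(SIZE_2022, FORECAST_2030, YEARS)
    print(f"Implied CAGR over {YEARS} years: {rate:.2%}")
    print(f"2022 value grown at the quoted rate: {project(SIZE_2022, QUOTED_CAGR, YEARS):.2f}")
```

Under the eight-period assumption, the implied rate comes out somewhat above the quoted 10.4%, which suggests the report uses a different base value or period for its CAGR figure.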
Data from: Prophage-DB: A comprehensive database to explore diversity,...
search.dataone.org
data.niaid.nih.gov
+1more
Updated Jul 19, 2024
Cite
Etan Dieppa-Colón; Cody Martin; Karthik Anantharaman (2024). Prophage-DB: A comprehensive database to explore diversity, distribution, and ecology of prophages [Dataset]. http://doi.org/10.5061/dryad.3n5tb2rs5
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.3n5tb2rs5
Dataset updated
Jul 19, 2024
Dataset provided by
Dryad Digital Repository
Authors
Etan Dieppa-Colón; Cody Martin; Karthik Anantharaman
Time period covered
Jun 27, 2024
Description
Background: Viruses that infect prokaryotes (phages) constitute the most abundant group of biological agents, playing pivotal roles in microbial systems. They are known to impact microbial community dynamics, microbial ecology, and evolution. Efforts to document the diversity, host range, infection dynamics, and effects of bacteriophage infection on host cell metabolism are still at the surface level. Among phages, some adopt the lysogenic mode of infection, where the genome integrates into the host cell genome, forming a prophage. Prophages enable viral genome replication without host cell lysis and often contribute novel and beneficial traits to the host genome. Despite their importance, research on prophages is limited. Current phage research predominantly focuses on lytic phages, leaving a significant gap in knowledge regarding prophages, including their biology, diversity, and ecological roles. Results: To bridge this gap, the creation of Prophage-DB, a prophage database, aims to a...

# Prophage-DB: A comprehensive database to explore diversity, distribution, and ecology of prophages

https://doi.org/10.5061/dryad.3n5tb2rs5

This dataset contains prophage sequences (available as .fna files) identified from prokaryotic genomes from three public databases: the Genome Taxonomy Database (GTDB) (release 207), the National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database (accessed March 2023), and the Searchable Planetary-scale mIcrobiome REsource (SPIRE). The downloaded prokaryotic genomes from these databases contained both archaeal and bacterial representative genomes (SPIRE also included data from unknown hosts).

Methods

Prophage identification from downloaded representative genomes was carried out using VIBRANT (v1.2.1). We used the default arguments when using VIBRANT (minimum scaffold length requirement = 1000 base pairs, minimum number of open reading frames (ORFs, or proteins) per scaffold requi...
Data from: PseudoResistance DB: A new Database of antibiotics related to...
data.mendeley.com
Updated Nov 8, 2024
Cite
Caio Cheohen (2024). PseudoResistance DB: A new Database of antibiotics related to Pseudomonas aeruginosa antibiotic resistance [Dataset]. http://doi.org/10.17632/bxdn3p33z2.1
Explore at:
Unique identifier
https://doi.org/10.17632/bxdn3p33z2.1
Dataset updated
Nov 8, 2024
Authors
Caio Cheohen
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This research addresses the pressing issue of antibiotic resistance, a global health challenge that undermines the efficacy of treatments against infectious diseases. Focusing on Pseudomonas aeruginosa—a Gram-negative bacterium known for causing opportunistic infections—this study emphasizes its prioritization by the World Health Organization (WHO) as a critical-level pathogen requiring new therapeutic approaches.

To identify antibiotics associated with P. aeruginosa, the study employed text mining techniques on the Scielo database. The resulting dataset comprises 98 antibiotics, each documented with detailed textual information and referencing data. Additionally, the dataset includes structural files of the antibiotics in several formats suitable for computational modeling and simulations. These formats encompass Protein Data Bank, Partial Charge & Atom Type (PDBQT), Simplified Molecular Input Line Entry System (SMI), IUPAC International Chemical Identifier (INCHI), Molecular Design Limited Molfile (MOL2), Structure-Data File (SDF), Chemical Markup Language (CML), Cartesian Coordinates File (XYZ), Scalable Vector Graphics (SVG), Molecular File (MOL) and Protein Data Bank (PDB) files, with molecular models generated via OpenBabel to facilitate advanced studies in drug development and resistance mechanisms.
Software and database resource mentions across the whole of PubMed Central...
figshare.com
application/gzip
Updated Jan 19, 2016
Cite
Geraint Duck (2016). Software and database resource mentions across the whole of PubMed Central full-text articles [Dataset]. http://doi.org/10.6084/m9.figshare.1281371.v1
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.1281371.v1
Dataset updated
Jan 19, 2016
Dataset provided by
Figshare http://figshare.com/
Authors
Geraint Duck
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is a compressed .sql.gz file of a MySQL database dump. The table contains the automatically extracted mentions of database and software resource names as extracted by bioNerDS across the full sub-set of open-access full-text PubMed Central articles. Each matched resource is identified by name, text offsets and "normalised" name, and also includes details of the rules from which the name was matched. This dataset is one of the primary research contributions of my PhD work, and a paper currently being finalised for submission to PLoS Computational Biology.
High Quality SNP Database
dknet.org
scicrunch.org
+2 more
Updated May 11, 2024
(2024). High Quality SNP Database [Dataset]. http://identifiers.org/RRID:SCR_007230
Unique identifier
https://identifiers.org/RRID:SCR_007230
Dataset updated
May 11, 2024
Description
This is the HQSNP DB (high-quality SNP database) developed by the CHG bioinformatics group. A high-quality SNP is defined as a SNP having allele frequency or genotyping data. The majority of the HQSNPs come from HapMap; others come from JSNP (Japanese SNP database), TSC (The SNP Consortium), Affymetrix 120K SNP, and Perlegen SNP. There are four kinds of SNP search you can do:
* Get SNPs by dbSNP rs#: Choose this search if you have already selected a list of SNPs and you just want to get the SNP information. The program will generate an Excel file containing the SNP flanking sequence, variation, quality, function, etc. In the Excel file, there are 10 highlighted fields. You can send only the highlighted information to Illumina to get a SNP pre-score. (The same fields are presented in other types of searches as well.)
* Get gene SNPs by gene names: Choose this search if you have a list of gene names and you want to get the SNP information in these genes. The gene name can be an official gene symbol, Ensembl gene ID, RefSeq accession ID, LocusLink number, etc.
* Get gene SNPs by genome regions: Choose this search if you have a list of genome regions and you want to get all gene SNP information in these regions. The software will find all the Ensembl genes in the regions and find the SNPs associated with each Ensembl gene.
* Get genome scan SNPs by genome regions: Choose this search if you have a list of genome regions and you want to get evenly spaced SNPs in these regions.
A SNP selection tool (SNPselector) was built upon HQSNP. It took a SNP ID list, gene name list, or genome region list as input and searched SNPs for a genome scan or gene association study. It could take an optional ABI SNP file (exported from the ABI SNP search web page) as input for checking whether a candidate SNP is available from ABI. It could also take an optional Illumina SNP pre-score file as input to select SNPs for the Illumina SNP assay. It generated results sorted by tag SNP in LD block, SNP quality, SNP function, SNP regulatory potential, and SNP mutation risk. SNPselector is now retired from public use (as of September 30, 2010).
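The "evenly spaced SNPs" search can be pictured as laying equally spaced target positions across a region and taking the nearest available SNP to each target. The sketch below only illustrates that idea and is not SNPselector's actual method, which the description does not specify. The function name, its inputs, and the nearest-position rule are all assumptions.

```python
from bisect import bisect_left


def pick_evenly_spaced(positions, start, end, n):
    """Pick up to n SNP positions in [start, end] nearest to n evenly spaced targets.

    Hypothetical helper: the nearest-to-target rule is an assumption.
    """
    candidates = sorted(p for p in set(positions) if start <= p <= end)
    if not candidates or n <= 0:
        return []
    step = (end - start) / n
    # Each target sits at the midpoint of one of n equal-width windows.
    targets = [start + step * (i + 0.5) for i in range(n)]
    chosen = []
    for target in targets:
        i = bisect_left(candidates, target)
        # Only the neighbours on either side of the insertion point can be nearest.
        neighbours = candidates[max(i - 1, 0):i + 1]
        best = min(neighbours, key=lambda p: abs(p - target))
        if best not in chosen:  # avoid returning the same SNP twice
            chosen.append(best)
    return chosen
```

When a region holds fewer distinct SNPs than requested, the function returns fewer than `n` positions rather than repeating one.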
Bio Resource for Array Genes Database
neuinfo.org
rrid.site
+2 more
Updated Oct 28, 2017
(2017). Bio Resource for Array Genes Database [Dataset]. http://identifiers.org/RRID:SCR_000748
Unique identifier
https://identifiers.org/RRID:SCR_000748
Dataset updated
Oct 28, 2017
Description
Bio Resource for Array Genes is a free online resource for easy access to collective and integrated information from various public biological resources for human, mouse, rat, fly and C. elegans genes. The resource includes information about the genes that are represented in UniGene clusters. It provides interactive tools to selectively view, analyze and interpret gene expression patterns against the background of gene and protein functional information. Different query options are provided to mine the biological relationships represented in the underlying database. The Search button takes you to the list of available query tools.
The resource is designed as an online platform to assist researchers in analyzing the results of microarray experiments and developing a biological interpretation of those results. Its main purpose is to help interpret unique gene expression patterns as biological changes that can lead to new diagnostic procedures and drug targets. Users can selectively view a variety of information about gene functions stored in the underlying database. Although other online resources provide comprehensive annotation and summaries of genes, this resource differs by further enabling researchers to mine biological relationships among the genes captured in the database using new query tools, thus providing a unique way of interpreting microarray results based on knowledge of the cellular roles of genes and proteins. Six query tools are provided, each offering different search features, analysis options, and forms of data display and visualization.
The data is collected in a relational database from public resources: UniGene, LocusLink, OMIM, NCBI dbEST, protein domains from NCBI CDD, Gene Ontology, pathways (KEGG, GenMAPP and BioCarta) and BIND (protein interactions). Data is dynamically collected and compiled twice a week from these public databases. Search options make it possible to organize and cluster genes based on their interactions in biological pathways, their association with Gene Ontology terms, tissue- or organ-specific expression, or any other user-chosen functional grouping of genes. A color coding scheme highlights differential gene expression patterns against a background of gene functional information. Concept hierarchies (Anatomy and Diseases) of MeSH (Medical Subject Headings) terms are used to organize and display the data related to tissue-specific expression and diseases.
Sponsors: The BioRag database is maintained by the Bioinformatics group at the Arizona Cancer Center. The material presented here is compiled from different public databases. BioRag is hosted by the Biotechnology Computing Facility of the University of Arizona. 2002, 2003 University of Arizona.

Molecular Biology Databases Published in Nucleic Acids Research between 1991-2016

2 scholarly articles cite this dataset (View in Google Scholar)
