100+ datasets found

NCBI Structure
rrid.site
scicrunch.org
+2more
Updated Jul 6, 2025
Cite
(2025). NCBI Structure [Dataset]. http://identifiers.org/RRID:SCR_004218
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_004218 https://identifiers.org/RRID:SCR_004218/resolver?q=&i=rrid
Dataset updated
Jul 6, 2025
Description
Database of three-dimensional structures of macromolecules that allows the user to retrieve structures for specific molecule types as well as structures for genes and proteins of interest. Structure comprises three main databases: the Molecular Modeling Database; Conserved Domains and Protein Classification; and the BioSystems Database. Structure also links to the PubChem databases to connect biological activity data to the macromolecular structures. Users can locate structural templates for proteins and interactively view structures and sequence data to closely examine sequence-structure relationships.
* Macromolecular structures: The three-dimensional structures of biomolecules provide a wealth of information on their biological function and evolutionary relationships. The Molecular Modeling Database (MMDB), as part of the Entrez system, facilitates access to structure data by connecting them with associated literature, protein and nucleic acid sequences, chemicals, biomolecular interactions, and more. It is possible, for example, to find 3D structures for homologs of a protein of interest by following the Related Structure link in an Entrez Protein sequence record.
* Conserved domains and protein classification: Conserved domains are functional units within a protein that act as building blocks in molecular evolution and recombine in various arrangements to make proteins with different functions. The Conserved Domain Database (CDD) brings together several collections of multiple sequence alignments representing conserved domains, in addition to NCBI-curated domains that use 3D-structure information explicitly to define domain boundaries and provide insights into sequence/structure/function relationships.
* Small molecules and their biological activity: The PubChem project provides information on the biological activities of small molecules and is a component of NIH's Molecular Libraries Roadmap Initiative. PubChem includes three databases: PCSubstance, PCBioAssay, and PCCompound. The PubChem data are linked to other data types in the Entrez system, making it possible, for example, to retrieve information about a compound and then link to its biological activity data, retrieve 3D protein structures bound to the compound and interactively view their active sites, and find biosystems that include the compound as a component.
* Biological systems: A biosystem, or biological system, is a group of molecules that interact directly or indirectly, where the grouping is relevant to the characterization of living matter. The NCBI BioSystems Database provides centralized access to biological pathways from several source databases and connects the biosystem records with associated literature, molecular, and chemical data throughout the Entrez system. BioSystem records list and categorize components, such as the genes, proteins, and small molecules involved in a biological system. The companion FLink tool, in turn, allows you to input a list of proteins, genes, or small molecules and retrieve a ranked list of biosystems.
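The cross-database linking described above (for example, going from a protein record to related structures) can also be requested programmatically. The sketch below only builds a request URL and makes no network call. It assumes the NCBI E-utilities ELink endpoint and its `dbfrom`, `db`, and `id` parameters, none of which are mentioned in this listing, and the identifier `P12345` is a placeholder.

```python
from urllib.parse import urlencode

# Assumed endpoint: the NCBI E-utilities ELink service (not named in the listing).
ELINK_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def build_elink_url(protein_id: str, target_db: str = "structure") -> str:
    """Build an ELink URL asking for target_db records linked to a protein record."""
    params = {"dbfrom": "protein", "db": target_db, "id": protein_id}
    return f"{ELINK_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    # "P12345" is a hypothetical identifier used only for illustration.
    print(build_elink_url("P12345"))
```

Swapping `target_db` for another Entrez database name would request links into that database instead; which link types exist for a given record is something to check against the E-utilities documentation.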
NCBI BioSystems Database
rrid.site
scicrunch.org
+2more
Updated Jul 27, 2025
Cite
(2025). NCBI BioSystems Database [Dataset]. http://identifiers.org/RRID:SCR_004690
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_004690
Dataset updated
Jul 27, 2025
Description
Database that provides access to biological systems and their component genes, proteins, and small molecules, as well as literature describing those biosystems and other related data throughout Entrez. A biosystem, or biological system, is a group of molecules that interact directly or indirectly, where the grouping is relevant to the characterization of living matter. BioSystem records list and categorize components, such as the genes, proteins, and small molecules involved in a biological system. The companion FLink tool, in turn, allows you to input a list of proteins, genes, or small molecules and retrieve a ranked list of biosystems. A number of databases provide diagrams showing the components and products of biological pathways along with corresponding annotations and links to literature. This database was developed as a complementary project to (1) serve as a centralized repository of data; (2) connect the biosystem records with associated literature, molecular, and chemical data throughout the Entrez system; and (3) facilitate computation on biosystems data. The NCBI BioSystems Database currently contains records from several source databases: KEGG, BioCyc (including its Tier 1 EcoCyc and MetaCyc databases, and its Tier 2 databases), Reactome, the National Cancer Institute's Pathway Interaction Database, WikiPathways, and Gene Ontology (GO). It includes several types of records such as pathways, structural complexes, and functional sets, and is designed to accommodate other record types, such as diseases, as data become available. Through these collaborations, the BioSystems database facilitates access to, and provides the ability to compute on, a wide range of biosystems data. If you are interested in depositing data into the BioSystems database, please contact them.
Search NCBI databases
integbio.jp
Updated May 25, 2017
Cite
NCBI (National Center for Biotechnology Information) (2017). Search NCBI databases [Dataset]. https://integbio.jp/dbcatalog/en/record/nbdc00055?jtpl=56
Explore at:
Dataset updated
May 25, 2017
Dataset provided by
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/)
Description
This search engine combs for information from over 30 major databases at NCBI, including PubMed, nucleic acids, amino acid sequences, expression data, PubChem (small molecules with biochemical functions), protein structure, sequenced genomes, and taxonomy. The search engine provides links to the search results, as well as to other related databases.
NCBI Protein Database
neuinfo.org
dknet.org
+2more
Updated Aug 31, 2024
Cite
(2024). NCBI Protein Database [Dataset]. http://identifiers.org/RRID:SCR_003257
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_003257
Dataset updated
Aug 31, 2024
Description
Databases of protein sequences and 3D structures of proteins. Collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB.
Data from: NCBI Taxonomy
demo.gbif-test.org
demo.gbif.org
+1more
Updated Feb 19, 2015
Cite
National Center for Biotechnology Information (NCBI) (2015). NCBI Taxonomy [Dataset]. http://doi.org/10.15468/rhydar
Explore at:
Unique identifier
https://doi.org/10.15468/rhydar
Dataset updated
Feb 19, 2015
Dataset provided by
Global Biodiversity Information Facility (https://www.gbif.org/)
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/)
Description
The NCBI taxonomy database is not a primary source for taxonomic or phylogenetic information. Furthermore, the database does not follow a single taxonomic treatise but rather attempts to incorporate phylogenetic and taxonomic knowledge from a variety of sources, including the published literature, web-based databases, and the advice of sequence submitters and outside taxonomy experts. Consequently, the NCBI taxonomy database is not a phylogenetic or taxonomic authority and should not be cited as such.
Data_Sheet_1_Contamination in Reference Sequence Databases: Time for...
figshare.com
datasetcatalog.nlm.nih.gov
+1more
pdf
Updated Jun 9, 2023
Cite
Valérian Lupo; Mick Van Vlierberghe; Hervé Vanderschuren; Frédéric Kerff; Denis Baurain; Luc Cornet (2023). Data_Sheet_1_Contamination in Reference Sequence Databases: Time for Divide-and-Rule Tactics.pdf [Dataset]. http://doi.org/10.3389/fmicb.2021.755101.s001
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/fmicb.2021.755101.s001
Dataset updated
Jun 9, 2023
Dataset provided by
Frontiers
Authors
Valérian Lupo; Mick Van Vlierberghe; Hervé Vanderschuren; Frédéric Kerff; Denis Baurain; Luc Cornet
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Contaminating sequences in public genome databases are a pervasive issue with potentially far-reaching consequences. This problem has attracted much attention in the recent literature and many different tools are now available to detect contaminants. Although these methods are based on diverse algorithms that can sometimes produce widely different estimates of the contamination level, the majority of genomic studies rely on a single method of detection, which represents a risk of systematic error. In this work, we used two orthogonal methods to assess the level of contamination among National Center for Biotechnology Information Reference Sequence Database (RefSeq) bacterial genomes. First, we applied the most popular solution, CheckM, which is based on gene markers. We then complemented this approach with a genome-wide method, termed Physeter, which now implements a k-folds algorithm to avoid inaccurate detection due to potential contamination of the reference database. We demonstrate that CheckM cannot currently be applied to all available genomes and bacterial groups. While it performed well on the majority of RefSeq genomes, it produced dubious results for 12,326 organisms. Among those, Physeter identified 239 contaminated genomes that had been missed by CheckM. In conclusion, we emphasize the importance of using multiple methods of detection while providing an upgrade of our own detection tool, Physeter, which minimizes incorrect contamination estimates in the context of unavoidably contaminated reference databases.
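The idea of cross-checking two detection methods can be illustrated with a small set comparison. This is a sketch with made-up numbers, not the authors' workflow: it does not call CheckM or Physeter, and the genome names, contamination percentages, and 5% threshold are all hypothetical.

```python
def flag_contaminated(estimates: dict[str, float], threshold: float) -> set[str]:
    """Return the genomes whose estimated contamination (%) exceeds the threshold."""
    return {genome for genome, level in estimates.items() if level > threshold}


def missed_by_first(first: dict[str, float], second: dict[str, float],
                    threshold: float = 5.0) -> set[str]:
    """Genomes flagged by the second method but not by the first."""
    return flag_contaminated(second, threshold) - flag_contaminated(first, threshold)


if __name__ == "__main__":
    # Hypothetical per-genome contamination estimates from two independent methods.
    marker_based = {"genome_a": 0.5, "genome_b": 7.0, "genome_c": 1.0}
    genome_wide = {"genome_a": 6.2, "genome_b": 8.0, "genome_c": 0.3}
    print(sorted(missed_by_first(marker_based, genome_wide)))
```

Keeping each method's flags as a set makes the disagreement between methods explicit, which is the quantity the abstract reports (genomes found by one tool and missed by the other).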
Molecular Modeling DataBase
scicrunch.org
rrid.site
+2more
Updated Dec 4, 2023
Cite
(2023). Molecular Modeling DataBase [Dataset]. http://identifiers.org/RRID:SCR_010623
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_010623
Dataset updated
Dec 4, 2023
Description
The Molecular Modeling DataBase (MMDB), also known as Entrez Structure, is a database of experimentally determined structures obtained from the RCSB Protein Data Bank (PDB). MMDB is developed by the Structure Group of the NCBI Computational Biology Branch. The data processing procedure at NCBI results in the addition of a number of useful features that facilitate computation on the data and link them to many other data types in the Entrez system. The structure database is considerably smaller than Entrez's Protein or Nucleotide databases, but a large fraction of all known protein sequences have homologs in this set, and one may often learn more about a protein by examining 3-D structures of its homologs. These are accessible as Related Structures in the Links menu of Entrez Protein sequence records. It is then possible to align the query protein to the structure-based sequence. Additional resources can be used along with MMDB to interactively view the structures, find similar 3D structures, learn about the types of interactions and bound chemicals that have been found to exist among the similar 3D structures, and more.
NCBI Virus
catalog.data.gov
datadiscovery.nlm.nih.gov
+4more
Updated Jun 19, 2025
Cite
National Library of Medicine (2025). NCBI Virus [Dataset]. https://catalog.data.gov/dataset/ncbi-virus
Explore at:
Dataset updated
Jun 19, 2025
Dataset provided by
National Library of Medicine
Description
NCBI Virus is an integrative, value-added resource designed to support retrieval, display and analysis of a curated collection of virus sequences and large sequence datasets. Its goal is to increase the usability of viral sequence data archived in GenBank and other NCBI repositories. It incorporates content previously provided by HIV-1, Human Protein Interaction Database, Influenza Virus Resource, and Virus Variation.
NCBIFAM
ebi.ac.uk
Updated Dec 16, 2024
Cite
(2024). NCBIFAM [Dataset]. https://www.ebi.ac.uk/interpro/
Explore at:
Dataset updated
Dec 16, 2024
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
NCBIfam is a collection of protein families, featuring curated multiple sequence alignments, hidden Markov models (HMMs) and annotation, which provides a tool for identifying functionally related proteins based on sequence homology. NCBIfam is maintained at the National Center for Biotechnology Information (Bethesda, MD). NCBIfam includes models from TIGRFAMs, another database of protein families developed at The Institute for Genomic Research, then at the J. Craig Venter Institute (Rockville, MD, US).
Data from: Genetic diversity and spread dynamics of SARS-CoV-2 variants...
data.niaid.nih.gov
datadryad.org
zip
Updated May 31, 2024
Cite
Desire Mtetwa (2024). Genetic diversity and spread dynamics of SARS-CoV-2 variants present in African populations [Dataset]. http://doi.org/10.5061/dryad.1c59zw42d
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.1c59zw42d
Dataset updated
May 31, 2024
Dataset provided by
Chinhoyi University of Technology
Authors
Desire Mtetwa
License
https://spdx.org/licenses/CC0-1.0.html
Description
The dynamics of coronavirus disease-19 (COVID-19) have been extensively researched in many settings around the world, but little is known about these patterns in Africa. 7540 complete nucleotide genomes from 51 African nations were obtained from the National Center for Biotechnology Information (NCBI) and Global Initiative on Sharing Influenza Data (GISAID) databases and analysed to examine genetic diversity and spread dynamics of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) lineages circulating in Africa. Utilising a variety of clade and lineage nomenclature schemes, we looked at their diversity, and used maximum parsimony inference methods to recreate their evolutionary divergence and history. According to this study, only 465 of the 2610 Pango lineages found to have existed in the world circulated in Africa after three years of the COVID-19 pandemic outbreak, with five different lineages dominating at various points during the outbreak. We identified South Africa, Kenya, and Nigeria as key sources of viral transmissions between Sub-Saharan African nations. These findings provide insight into the viral strains that are circulating in Africa and their evolutionary patterns.
Methods
Dataset mining and workflow
SARS-CoV-2 genome sequences collected from Africa were obtained from the NCBI and GISAID databases on February 26, 2023. 24415 African sequences were retrieved from both databases to examine the number of lineages circulating within Africa. The two databases together held only 8044 complete genome sequences from Africa; these sequences, excluding those flagged as low coverage by NextClade, were retrieved to determine spread dynamics. 5908 sequences from 23 African countries were available in the NCBI database and 2137 sequences from 41 African countries in the GISAID database. The sequences were aligned using the online version of the MAFFT multiple sequence alignment tool, with Wuhan-Hu-1 (MN908947.3) as the reference sequence, and sequences with more than 5.0% ambiguous letters were removed. Duplicates were removed using goalign dedup, and only high-quality complete African sequences remained (n=7540).
Phylogenetic reconstruction
Phylogeny reconstruction on the dataset was performed numerous times using IQ-TREE multicore software version v1.6.12 and NextClade.
Lineage classification
PANGOLin, a web application, was used to classify sequences into their lineages. The objective was to determine the SARS-CoV-2 lineages circulating in Africa that are most important from an epidemiological perspective, as well as the lineage dynamics within and across the African continent, because this naming system integrates genetic and geographic data concerning SARS-CoV-2 dynamics.
Phylogeographic reconstruction
VOC, VOI, and VUM were designated based on the WHO framework as of 20 January 2022. We included one lineage, namely A.23.1, and labelled it as VOI for the purposes of this analysis. This lineage was included because it demonstrated the continued evolution of African lineages into potentially more transmissible variants. VOI, VOC, and VUM that emerged on the African continent were marked. These were A.23.1 (VOI), B.1.351 and B.1.1.529 (VOC), and B.1.640 and B.1.525 (VUM). Genome sequences of these five lineages were extracted from the NCBI database for phylogeographic reconstruction. A similar approach to that described above (including alignment using online MAFFT) was employed. Phylogeographic reconstruction for all variants circulating in Africa and all VOI, VOC, and VUM was conducted using PASTML.
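The two quality-control steps in the workflow (dropping sequences with more than 5.0% ambiguous letters, then removing duplicates) can be sketched in a few lines. This is an illustrative stand-in, not the goalign or NextClade tooling the authors used; it assumes that anything other than A, C, G, or T counts as ambiguous and that alignment gaps (`-`) are ignored when computing the fraction.

```python
UNAMBIGUOUS = set("ACGT")


def ambiguous_fraction(seq: str) -> float:
    """Fraction of non-gap characters that are not A, C, G, or T."""
    residues = [base for base in seq.upper() if base != "-"]
    if not residues:
        return 1.0  # treat an empty sequence as fully ambiguous
    ambiguous = sum(1 for base in residues if base not in UNAMBIGUOUS)
    return ambiguous / len(residues)


def filter_and_dedup(records: dict[str, str], max_ambiguous: float = 0.05) -> dict[str, str]:
    """Drop sequences above the ambiguity cutoff, then keep the first copy of each sequence."""
    seen: set[str] = set()
    kept: dict[str, str] = {}
    for name, seq in records.items():
        if ambiguous_fraction(seq) > max_ambiguous:
            continue
        key = seq.upper()
        if key in seen:
            continue
        seen.add(key)
        kept[name] = seq
    return kept


if __name__ == "__main__":
    # Hypothetical toy records; real input would come from a FASTA file.
    toy = {"s1": "ACGT" * 25, "s2": "ACGT" * 25, "s3": "N" * 6 + "A" * 94}
    print(list(filter_and_dedup(toy)))
```

The cutoff is applied with a strict "greater than", matching the wording "more than 5.0%", so a sequence with exactly 5% ambiguous letters is retained.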
Table S1 - Construction of Customized Sub-Databases from NCBI-nr Database...
plos.figshare.com
docx
Updated May 31, 2023
Cite
Ke Yu; Tong Zhang (2023). Table S1 - Construction of Customized Sub-Databases from NCBI-nr Database for Rapid Annotation of Huge Metagenomic Datasets Using a Combined BLAST and MEGAN Approach [Dataset]. http://doi.org/10.1371/journal.pone.0059831.s001
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.1371/journal.pone.0059831.s001
Dataset updated
May 31, 2023
Dataset provided by
PLOS ONE
Authors
Ke Yu; Tong Zhang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Number of sequences derived from NCBI-nr database, which were annotated to the fatty acid metabolism pathway and bisphenol A degradation metabolism pathway. (DOCX)
Academic Research Databases Report
archivemarketresearch.com
doc, pdf, ppt
Updated Mar 15, 2025
Cite
Archive Market Research (2025). Academic Research Databases Report [Dataset]. https://www.archivemarketresearch.com/reports/academic-research-databases-58991
Explore at:
Available download formats: pdf, doc, ppt
Dataset updated
Mar 15, 2025
Dataset authored and provided by
Archive Market Research
License
https://www.archivemarketresearch.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global market for academic research databases is experiencing robust growth, projected to reach $388.2 million in 2025. While the exact Compound Annual Growth Rate (CAGR) is not provided, considering the ongoing digitalization of research and education, a conservative estimate would place the CAGR in the range of 7-9% for the forecast period (2025-2033). This growth is fueled by several key drivers. The increasing reliance on digital resources by students, teachers, and researchers across all academic disciplines is a significant factor. Furthermore, the expanding volume of scholarly publications and the need for efficient access and management of research data are propelling market expansion. The rising adoption of cloud-based solutions and the development of sophisticated search and analytical tools within these databases are also contributing to this growth trajectory. The market segmentation highlights the diverse user base, with students, teachers, and experts representing major segments, each with varying needs and subscription models (charge-based or free access). The competitive landscape is characterized by established players like Scopus, Web of Science, and PubMed, alongside other significant contributors like ERIC, ProQuest, and IEEE Xplore, indicating a market with both established dominance and emerging players vying for market share. Geographic distribution shows a strong presence across North America and Europe, but with significant growth potential in Asia-Pacific regions. The market's future trajectory will likely be shaped by several trends. The increasing integration of artificial intelligence (AI) for enhanced search and data analysis capabilities will be a major factor. The ongoing development of open-access initiatives and the expansion of free databases will influence market dynamics, potentially impacting the revenue streams of subscription-based services. 
However, challenges such as data security concerns, the need for continuous content updates, and the varying levels of digital literacy across different user groups may act as restraints on market growth. Nevertheless, the overall outlook for the academic research database market remains positive, driven by the continued expansion of scholarly research and the growing demand for efficient and reliable access to research information globally.
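To see what the quoted 7-9% CAGR range would imply, the standard compound-growth formula, value = base × (1 + rate)^years, can be applied to the $388.2 million figure for 2025. The sketch below assumes eight compounding periods between 2025 and 2033; the report states neither a period count nor a 2033 value, so the result is an illustration only.

```python
def project_market_size(base_value: float, cagr: float, years: int) -> float:
    """Compound a starting value at a constant annual growth rate."""
    return base_value * (1.0 + cagr) ** years


if __name__ == "__main__":
    base_2025 = 388.2   # USD millions, as quoted in the description
    periods = 2033 - 2025  # assumption: eight annual compounding steps
    for rate in (0.07, 0.09):
        value = project_market_size(base_2025, rate, periods)
        print(f"CAGR {rate:.0%}: about ${value:.1f} million in 2033")
```

Under these assumptions the two ends of the range differ by more than $100 million by 2033, which shows how sensitive the projection is to the unstated growth rate.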
Bio Resource for Array Genes Database
dknet.org
scicrunch.org
+1more
Updated Jan 29, 2022
Cite
(2022). Bio Resource for Array Genes Database [Dataset]. http://identifiers.org/RRID:SCR_000748
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_000748
Dataset updated
Jan 29, 2022
Description
Bio Resource for Array Genes (BioRag) is a free online resource for easy access to collective and integrated information from various public biological resources for human, mouse, rat, fly and C. elegans genes. The resource includes information about the genes that are represented in UniGene clusters. It provides interactive tools to selectively view, analyze and interpret gene expression patterns against the background of gene and protein functional information. Different query options are provided to mine the biological relationships represented in the underlying database. The resource is designed as an online platform to assist researchers in analyzing the results of microarray experiments and developing a biological interpretation of those results. Its main purpose is to help interpret unique gene expression patterns as biological changes that can lead to new diagnostic procedures and drug targets. The interactive site allows users to selectively view a variety of information about gene functions that is stored in an underlying database. Although other online resources provide comprehensive annotation and summaries of genes, this resource differs from them by further enabling researchers to mine biological relationships amongst the genes captured in the database using new query tools, thus providing a unique way of interpreting microarray results based on knowledge of the cellular roles of genes and proteins. A total of six different query tools are provided, and each offers different search features, analysis options and forms of display and visualization of data. The data are collected in a relational database from public resources: UniGene, LocusLink, OMIM, NCBI dbEST, protein domains from NCBI CDD, Gene Ontology, pathways (KEGG, GenMAPP and BioCarta) and BIND (protein interactions). Data are dynamically collected and compiled twice a week from public databases. Search options make it possible to organize and cluster genes based on their interactions in biological pathways, their association with Gene Ontology terms, tissue/organ-specific expression or any other user-chosen functional grouping of genes. A color coding scheme is used to highlight differential gene expression patterns against a background of gene functional information. Concept hierarchies (Anatomy and Diseases) of MeSH (Medical Subject Headings) terms are used to organize and display the data related to tissue-specific expression and diseases. Sponsors: The BioRag database is maintained by the Bioinformatics group at Arizona Cancer Center. The material presented here is compiled from different public databases. BioRag is hosted by the Biotechnology Computing Facility of the University of Arizona.
Data from: NCBI Taxonomy
data.niaid.nih.gov
Updated Mar 29, 2021
Cite
TAXON (2021). NCBI Taxonomy [Dataset]. https://data.niaid.nih.gov/resources?id=ds_385ea4f5f9
Explore at:
Dataset updated
Mar 29, 2021
Dataset authored and provided by
TAXON
Description
The NCBI Taxonomy database is a curated set of names and classifications for all organisms that are represented in the Entrez databases. The Taxonomy database attempts to incorporate phylogenetic and taxonomic knowledge from a variety of sources, including the published literature, web-based databases, and the advice of sequence submitters and outside taxonomy experts.
RefSeq: NCBI Reference Sequence Database
catalog.data.gov
datadiscovery.nlm.nih.gov
+3more
Updated Jun 19, 2025
Cite
National Library of Medicine (2025). RefSeq: NCBI Reference Sequence Database [Dataset]. https://catalog.data.gov/dataset/refseq-ncbi-reference-sequence-database-f511f
Explore at:
Dataset updated
Jun 19, 2025
Dataset provided by
National Library of Medicine
Description
A comprehensive, integrated, non-redundant, well-annotated set of reference sequences including genomic, transcript, and protein.
NCBI Genome Survey Sequences Database
rrid.site
neuinfo.org
+2more
Updated Aug 10, 2025
Cite
(2025). NCBI Genome Survey Sequences Database [Dataset]. http://identifiers.org/RRID:SCR_002146
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_002146
Dataset updated
Aug 10, 2025
Description
Database of unannotated, short, single-read, primarily genomic sequences from GenBank, including random survey sequences, clone-end sequences, and exon-trapped sequences. The GSS division of GenBank is similar to the EST division, except that most of the sequences are genomic in origin rather than cDNA (mRNA). Note that two classes (exon-trapped products and gene-trapped products) may be derived via a cDNA intermediate. Care should be taken when analyzing sequences from either of these classes, as a splicing event could have occurred and the sequence represented in the record may be interrupted when compared to the genomic sequence. The GSS division contains (but is not limited to) the following types of data:
* random single-pass read genome survey sequences
* cosmid/BAC/YAC end sequences
* exon-trapped genomic sequences
* Alu PCR sequences
* transposon-tagged sequences
Although dbGSS sequences are incorporated into the GSS Division of GenBank, annotation in dbGSS is more comprehensive and includes detailed information about the contributors, experimental conditions, and genetic map locations.
NCBI Handbook
datasets.ai
21
Updated Sep 7, 2024
Cite
U.S. Department of Health & Human Services (2024). NCBI Handbook [Dataset]. https://datasets.ai/datasets/ncbi-handbook
Explore at:
21 available download formats
Dataset updated
Sep 7, 2024
Dataset provided by
United States Department of Health and Human Services (http://www.hhs.gov/)
Authors
U.S. Department of Health & Human Services
Description
An extensive collection of articles about NCBI databases, data models, and software.
GENSAT at NCBI - Gene Expression Nervous System Atlas
neuinfo.org
scicrunch.org
+2more
Updated Mar 23, 2012
Cite
(2012). GENSAT at NCBI - Gene Expression Nervous System Atlas [Dataset]. http://identifiers.org/RRID:SCR_003923
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_003923
Dataset updated
Mar 23, 2012
Description
THIS RESOURCE IS NO LONGER IN SERVICE, documented on March 19, 2012. Due to budgetary constraints, the National Center for Biotechnology Information (NCBI) has discontinued support for the NCBI GENSAT database, and it has been removed from the Entrez System. The Gene Expression Nervous System Atlas (GENSAT) project involves the large-scale creation of transgenic mouse lines expressing green fluorescent protein (GFP) reporter or Cre recombinase under control of the BAC promoter in specific neural and glial cell populations. BAC expression data for all the lines generated (over 1300 lines) are available in online, searchable databases (www.gensat.org and the Database of GENSAT BAC-Cre driver lines). If you have any specific questions, please feel free to contact us at info_at_ncbi.nlm.nih.gov. The GENSAT project aims to map the expression of genes in the central nervous system of the mouse, using both in situ hybridization and transgenic mouse techniques. Search criteria include gene names, gene symbols, gene aliases and synonyms, mouse ages, and imaging protocols. Mouse ages are restricted to E10.5 (embryonic day 10.5), E15.5 (embryonic day 15.5), P7 (postnatal day 7), and Adult. The project focuses on two techniques:
* Evaluation of unmodified mouse lines for expression of a given gene using radiolabelled riboprobes and in situ hybridization.
* Creation of transgenic mouse lines containing a BAC construct that expresses a marker gene in the same environment as the native gene.
Library LinkOut
res1catalogd-o-tdatad-o-tgov.vcapture.xyz
healthdata.gov
+6more
Updated Jun 19, 2025
Cite
National Library of Medicine (2025). Library LinkOut [Dataset]. https://res1catalogd-o-tdatad-o-tgov.vcapture.xyz/dataset/library-linkout
Explore at:
Dataset updated
Jun 19, 2025
Dataset provided by
National Library of Medicine
Description
LinkOut is a service that allows you to link directly from PubMed and other NCBI databases to a wide range of information and services beyond the NCBI systems. LinkOut aims to facilitate access to relevant online resources in order to extend, clarify, and supplement information found in NCBI databases. Third parties can link directly from PubMed and other Entrez database records to relevant Web-accessible resources beyond the Entrez system. Includes full-text publications, biological databases, consumer health information and research tools.
Data from: Indexed reference databases for KMA and CCMetagen
researchdata.edu.au
Updated Apr 30, 2019
Cite
Dr Vanessa Rossetto Marcelino; Dr Jan Buchmann; Clausen Philip (2019). Indexed reference databases for KMA and CCMetagen [Dataset]. http://doi.org/10.25910/5CC7CD40FCA8E
Explore at:
Unique identifier
https://doi.org/10.25910/5CC7CD40FCA8E
Dataset updated
Apr 30, 2019
Dataset provided by
The University of Sydney
Authors
Dr Vanessa Rossetto Marcelino; Dr Jan Buchmann; Clausen Philip
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Time period covered
Apr 9, 2019 - Apr 30, 2019
Description
This database was built to identify taxa in metagenome samples using the CCMetagen pipeline. Using the whole NCBI nt collection allows a complete taxonomic overview, including microbial eukaryotes that may be present in the dataset. The database is already indexed and ready to use with KMA and CCMetagen.

A manual describing how to use this dataset can be found at: https://github.com/vrmarcelino/CCMetagen

Additionally, a tutorial on the whole analysis of a set of metatranscriptome samples can be found at: https://github.com/vrmarcelino/CCMetagen/tree/master/tutorial

The database was built as follows:

The partially non-redundant nucleotide database was downloaded from the NCBI website (ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nt.gz) in January 2018. This database was formatted to include taxids in sequence headers.
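
As a rough illustration of the formatting step described above, the sketch below prepends a taxid to each FASTA header using an accession-to-taxid lookup. The header layout (`>taxid|original header`), the function name `add_taxids_to_headers`, the `unk_taxid` placeholder, and the in-memory mapping are all assumptions made for this example; the exact script and header format used to build this database are not given here.

```python
from typing import Dict, Iterable, Iterator


def add_taxids_to_headers(
    fasta_lines: Iterable[str],
    accession_to_taxid: Dict[str, str],
    unknown: str = "unk_taxid",
) -> Iterator[str]:
    """Yield FASTA lines with a taxid prepended to each header.

    The '>taxid|header' layout is a hypothetical choice for illustration.
    """
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Treat the first whitespace-delimited token after '>' as the accession.
            fields = line[1:].split(None, 1)
            accession = fields[0] if fields else ""
            # Fall back to a placeholder when the accession is not in the mapping.
            taxid = accession_to_taxid.get(accession, unknown)
            yield f">{taxid}|{line[1:]}"
        else:
            # Sequence lines pass through unchanged.
            yield line


if __name__ == "__main__":
    example = [">AB000001.1 Example sequence", "ACGT", ">XY999999.1 Unmapped", "GGCC"]
    mapping = {"AB000001.1": "562"}  # hypothetical accession-to-taxid pair
    for out in add_taxids_to_headers(example, mapping):
        print(out)
```

For a file the size of nt, the mapping would not fit comfortably in a plain dictionary built from the full NCBI accession-to-taxid tables, so a real run would likely need an on-disk lookup or a pre-filtered mapping; the generator form above at least keeps the FASTA side streaming.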

Indexing was then performed with KMA using the command:

kma_index -i nt_taxid.fas -o ncbi_nt -NI -Sparse TG

Three indexed databases are provided:

NCBI nucleotide collection

RefSeq database of bacterial and fungal genomes

Cite

(2025). NCBI Structure [Dataset]. http://identifiers.org/RRID:SCR_004218

NCBI Structure

RRID:SCR_004218, nlx_23947, NCBI Structure (RRID:SCR_004218), NCBI Structure

Explore at:

281 scholarly articles cite this dataset (View in Google Scholar)

Unique identifier

https://identifiers.org/RRID:SCR_004218 https://identifiers.org/RRID:SCR_004218/resolver?q=&i=rrid

Dataset updated

Jul 6, 2025

Description

Database of three-dimensional structures of macromolecules that allows the user to retrieve structures for specific molecule types as well as structures for genes and proteins of interest. Three main databases comprise Structure-The Molecular Modeling Database; Conserved Domains and Protein Classification; and the BioSystems Database. Structure also links to the PubChem databases to connect biological activity data to the macromolecular structures. Users can locate structural templates for proteins and interactively view structures and sequence data to closely examine sequence-structure relationships. * Macromolecular structures: The three-dimensional structures of biomolecules provide a wealth of information on their biological function and evolutionary relationships. The Molecular Modeling Database (MMDB), as part of the Entrez system, facilitates access to structure data by connecting them with associated literature, protein and nucleic acid sequences, chemicals, biomolecular interactions, and more. It is possible, for example, to find 3D structures for homologs of a protein of interest by following the Related Structure link in an Entrez Protein sequence record. * Conserved domains and protein classification: Conserved domains are functional units within a protein that act as building blocks in molecular evolution and recombine in various arrangements to make proteins with different functions. The Conserved Domain Database (CDD) brings together several collections of multiple sequence alignments representing conserved domains, in addition to NCBI-curated domains that use 3D-structure information explicitly to define domain boundaries and provide insights into sequence/structure/function relationships. * Small molecules and their biological activity: The PubChem project provides information on the biological activities of small molecules and is a component of NIH's Molecular Libraries Roadmap Initiative.
PubChem includes three databases: PCSubstance, PCBioAssay, and PCCompound. The PubChem data are linked to other data types in the Entrez system, making it possible, for example, to retrieve information about a compound and then link to its biological activity data, retrieve 3D protein structures bound to the compound and interactively view their active sites, and find biosystems that include the compound as a component. * Biological Systems: A biosystem, or biological system, is a group of molecules that interact directly or indirectly, where the grouping is relevant to the characterization of living matter. The NCBI BioSystems Database provides centralized access to biological pathways from several source databases and connects the biosystem records with associated literature, molecular, and chemical data throughout the Entrez system. BioSystem records list and categorize components, such as the genes, proteins, and small molecules involved in a biological system. The companion FLink tool, in turn, allows you to input a list of proteins, genes, or small molecules and retrieve a ranked list of biosystems.

NCBI Structure

NCBI BioSystems Database

Search NCBI databases

NCBI Protein Database

Data from: NCBI Taxonomy

Data_Sheet_1_Contamination in Reference Sequence Databases: Time for...

Molecular Modeling DataBase

NCBI Virus

NCBIFAM

Data from: Genetic diversity and spread dynamics of SARS-CoV-2 variants...

Table S1 - Construction of Customized Sub-Databases from NCBI-nr Database...

Academic Research Databases Report

Bio Resource for Array Genes Database

Data from: NCBI Taxonomy

RefSeq: NCBI Reference Sequence Database

NCBI Genome Survey Sequences Database

NCBI Handbook

GENSAT at NCBI - Gene Expression Nervous System Atlas

Library LinkOut

Data from: Indexed reference databases for KMA and CCMetagen
