100+ datasets found
  1. r

    NCBI Structure

    • rrid.site
    • scicrunch.org
    • +2more
    Updated Jul 6, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). NCBI Structure [Dataset]. http://identifiers.org/RRID:SCR_004218
    Explore at:
    Dataset updated
    Jul 6, 2025
    Description

    Database of three-dimensional structures of macromolecules that allows the user to retrieve structures for specific molecule types as well as structures for genes and proteins of interest. Three main databases comprise Structure-The Molecular Modeling Database; Conserved Domains and Protein Classification; and the BioSystems Database. Structure also links to the PubChem databases to connect biological activity data to the macromolecular structures. Users can locate structural templates for proteins and interactively view structures and sequence data to closely examine sequence-structure relationships. * Macromolecular structures: The three-dimensional structures of biomolecules provide a wealth of information on their biological function and evolutionary relationships. The Molecular Modeling Database (MMDB), as part of the Entrez system, facilitates access to structure data by connecting them with associated literature, protein and nucleic acid sequences, chemicals, biomolecular interactions, and more. It is possible, for example, to find 3D structures for homologs of a protein of interest by following the Related Structure link in an Entrez Protein sequence record. * Conserved domains and protein classification: Conserved domains are functional units within a protein that act as building blocks in molecular evolution and recombine in various arrangements to make proteins with different functions. The Conserved Domain Database (CDD) brings together several collections of multiple sequence alignments representing conserved domains, in addition to NCBI-curated domains that use 3D-structure information explicitly to define domain boundaries and provide insights into sequence/structure/function relationships. * Small molecules and their biological activity: The PubChem project provides information on the biological activities of small molecules and is a component of NIH''''s Molecular Libraries Roadmap Initiative. PubChem includes three databases: PCSubstance, PCBioAssay, and PCCompound. The PubChem data are linked to other data types (illustrated example) in the Entrez system, making it possible, for example, to retrieve information about a compound and then Link to its biological activity data, retrieve 3D protein structures bound to the compound and interactively view their active sites, and find biosystems that include the compound as a component. * Biological Systems: A biosystem, or biological system, is a group of molecules that interact directly or indirectly, where the grouping is relevant to the characterization of living matter. The NCBI BioSystems Database provides centralized access to biological pathways from several source databases and connects the biosystem records with associated literature, molecular, and chemical data throughout the Entrez system. BioSystem records list and categorize components (illustrated example), such as the genes, proteins, and small molecules involved in a biological system. The companion FLink icon FLink tool, in turn, allows you to input a list of proteins, genes, or small molecules and retrieve a ranked list of biosystems.

  2. r

    NCBI BioSystems Database

    • rrid.site
    • scicrunch.org
    • +2more
    Updated Jul 27, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). NCBI BioSystems Database [Dataset]. http://identifiers.org/RRID:SCR_004690
    Explore at:
    Dataset updated
    Jul 27, 2025
    Description

    Database that provides access to biological systems and their component genes, proteins, and small molecules, as well as literature describing those biosystems and other related data throughout Entrez. A biosystem, or biological system, is a group of molecules that interact directly or indirectly, where the grouping is relevant to the characterization of living matter. BioSystem records list and categorize components, such as the genes, proteins, and small molecules involved in a biological system. The companion FLink tool, in turn, allows you to input a list of proteins, genes, or small molecules and retrieve a ranked list of biosystems. A number of databases provide diagrams showing the components and products of biological pathways along with corresponding annotations and links to literature. This database was developed as a complementary project to (1) serve as a centralized repository of data; (2) connect the biosystem records with associated literature, molecular, and chemical data throughout the Entrez system; and (3) facilitate computation on biosystems data. The NCBI BioSystems Database currently contains records from several source databases: KEGG, BioCyc (including its Tier 1 EcoCyc and MetaCyc databases, and its Tier 2 databases), Reactome, the National Cancer Institute's Pathway Interaction Database, WikiPathways, and Gene Ontology (GO). It includes several types of records such as pathways, structural complexes, and functional sets, and is desiged to accomodate other record types, such as diseases, as data become available. Through these collaborations, the BioSystems database facilitates access to, and provides the ability to compute on, a wide range of biosystems data. If you are interested in depositing data into the BioSystems database, please contact them.

  3. Search NCBI databases

    • integbio.jp
    Updated May 25, 2017
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    NCBI (National Center for Biotechnology Information) (2017). Search NCBI databases [Dataset]. https://integbio.jp/dbcatalog/en/record/nbdc00055?jtpl=56
    Explore at:
    Dataset updated
    May 25, 2017
    Dataset provided by
    National Center for Biotechnology Informationhttp://www.ncbi.nlm.nih.gov/
    Description

    This search engine combs for information from over 30 major databases at NCBI, including PubMed, nucleic acids, amino acid sequences, expression data, PubChem (small molecules with biochemical functions), protein structure, sequenced genomes, and taxonomy. The search engine provides links to the search results, as well as to other related databases.

  4. n

    NCBI Protein Database

    • neuinfo.org
    • dknet.org
    • +2more
    Updated Aug 31, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). NCBI Protein Database [Dataset]. http://identifiers.org/RRID:SCR_003257
    Explore at:
    Dataset updated
    Aug 31, 2024
    Description

    Databases of protein sequences and 3D structures of proteins. Collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB.

  5. Data from: NCBI Taxonomy

    • demo.gbif-test.org
    • demo.gbif.org
    • +1more
    Updated Feb 19, 2015
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    National Center for Biotechnology Information (NCBI) (2015). NCBI Taxonomy [Dataset]. http://doi.org/10.15468/rhydar
    Explore at:
    Dataset updated
    Feb 19, 2015
    Dataset provided by
    Global Biodiversity Information Facilityhttps://www.gbif.org/
    National Center for Biotechnology Informationhttp://www.ncbi.nlm.nih.gov/
    Description

    The NCBI taxonomy database is not a primary source for taxonomic or phylogenetic information. Furthermore, the database does not follow a single taxonomic treatise but rather attempts to incorporate phylogenetic and taxonomic knowledge from a variety of sources, including the published literature, web-based databases, and the advice of sequence submitters and outside taxonomy experts. Consequently, the NCBI taxonomy database is not a phylogenetic or taxonomic authority and should not be cited as such.

  6. f

    Data_Sheet_1_Contamination in Reference Sequence Databases: Time for...

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    • +1more
    pdf
    Updated Jun 9, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Valérian Lupo; Mick Van Vlierberghe; Hervé Vanderschuren; Frédéric Kerff; Denis Baurain; Luc Cornet (2023). Data_Sheet_1_Contamination in Reference Sequence Databases: Time for Divide-and-Rule Tactics.pdf [Dataset]. http://doi.org/10.3389/fmicb.2021.755101.s001
    Explore at:
    pdfAvailable download formats
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    Frontiers
    Authors
    Valérian Lupo; Mick Van Vlierberghe; Hervé Vanderschuren; Frédéric Kerff; Denis Baurain; Luc Cornet
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Contaminating sequences in public genome databases is a pervasive issue with potentially far-reaching consequences. This problem has attracted much attention in the recent literature and many different tools are now available to detect contaminants. Although these methods are based on diverse algorithms that can sometimes produce widely different estimates of the contamination level, the majority of genomic studies rely on a single method of detection, which represents a risk of systematic error. In this work, we used two orthogonal methods to assess the level of contamination among National Center for Biotechnological Information Reference Sequence Database (RefSeq) bacterial genomes. First, we applied the most popular solution, CheckM, which is based on gene markers. We then complemented this approach by a genome-wide method, termed Physeter, which now implements a k-folds algorithm to avoid inaccurate detection due to potential contamination of the reference database. We demonstrate that CheckM cannot currently be applied to all available genomes and bacterial groups. While it performed well on the majority of RefSeq genomes, it produced dubious results for 12,326 organisms. Among those, Physeter identified 239 contaminated genomes that had been missed by CheckM. In conclusion, we emphasize the importance of using multiple methods of detection while providing an upgrade of our own detection tool, Physeter, which minimizes incorrect contamination estimates in the context of unavoidably contaminated reference databases.

  7. s

    Molecular Modeling DataBase

    • scicrunch.org
    • rrid.site
    • +2more
    Updated Dec 4, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2023). Molecular Modeling DataBase [Dataset]. http://identifiers.org/RRID:SCR_010623
    Explore at:
    Dataset updated
    Dec 4, 2023
    Description

    The Molecular Modeling DataBase (MMDB), also known as Entrez Structure, is a database of experimentally determined structures obtained from the RCSB Protein Data Bank (PDB). MMDB is developed by the Structure Group of the NCBI Computational Biology Branch. The data processing procedure at NCBI results in the addition of a number of useful features that facilitate computation on the data and link them to many other data types in the Entrez system. The structure database is considerably smaller than Entrez''s Protein or Nucleotide databases, but a large fraction of all known protein sequences have homologs in this set, and one may often learn more about a protein by examining 3-D structures of its homologs. These are accessible as Related Structures in the Links menu of Entrez Protein sequence records (illustrated example). It is then possible to align the query protein to the structure-based sequence, as shown in the illustration on this page. Additional resources can be used along with MMDB to interactively view the structures, find similar 3D structures, learn about the types of interactions and bound chemicals that have been found to exist among the similar 3D structures, and more.

  8. d

    NCBI Virus

    • catalog.data.gov
    • datadiscovery.nlm.nih.gov
    • +4more
    Updated Jun 19, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    National Library of Medicine (2025). NCBI Virus [Dataset]. https://catalog.data.gov/dataset/ncbi-virus
    Explore at:
    Dataset updated
    Jun 19, 2025
    Dataset provided by
    National Library of Medicine
    Description

    NCBI Virus is an integrative, value-added resource designed to support retrieval, display and analysis of a curated collection of virus sequences and large sequence datasets. Its goal is to increase the usability of viral sequence data archived in GenBank and other NCBI repositories. This resource includes resources previously included in HIV-1, Human Protein Interaction Database, Influenza Virus Resource, and Virus Variation.

  9. e

    NCBIFAM

    • ebi.ac.uk
    Updated Dec 16, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). NCBIFAM [Dataset]. https://www.ebi.ac.uk/interpro/
    Explore at:
    Dataset updated
    Dec 16, 2024
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    NCBIfam is a collection of protein families, featuring curated multiple sequence alignments, hidden Markov models (HMMs) and annotation, which provides a tool for identifying functionally related proteins based on sequence homology. NCBIfam is maintained at the National Center for Biotechnology Information (Bethesda, MD). NCBIfam includes models from TIGRFAMs, another database of protein families developed at The Institute for Genomic Research, then at the J. Craig Venter Institute (Rockville, MD, US).

  10. n

    Data from: Genetic diversity and spread dynamics of SARS-CoV-2 variants...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated May 31, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Desire Mtetwa (2024). Genetic diversity and spread dynamics of SARS-CoV-2 variants present in African populations [Dataset]. http://doi.org/10.5061/dryad.1c59zw42d
    Explore at:
    zipAvailable download formats
    Dataset updated
    May 31, 2024
    Dataset provided by
    Chinhoyi University of Technology
    Authors
    Desire Mtetwa
    License

    https://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html

    Description

    The dynamics of coronavirus disease-19 (COVID-19) have been extensively researched in many settings around the world, but little is known about these patterns in Africa. 7540 complete nucleotide genomes from 51 African nations were obtained and analysed from the National Center for Biotechnology Information (NCBI) and Global Initiative on Sharing Influenza Data (GISAID) databases to examine genetic diversity and spread dynamics of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) lineages circulating in Africa. Utilising a variety of clade and lineage nomenclature schemes, we looked at their diversity, and used maximum parsimony inference methods to recreate their evolutionary divergence and history. According to this study, only 465 of the 2610 Pango lineages found to have existed in the world circulated in Africa after three years of the COVID-19 pandemic outbreak, with five different lineages dominating at various points during the outbreak. We identified South Africa, Kenya, and Nigeria as key sources of viral transmissions between Sub-Saharan African nations. These findings provide insight into the viral strains that are circulating in Africa and their evolutionary patterns. Methods Dataset mining and workflow SARS-CoV-2 genome sequences collected from Africa were obtained from NCBI database and GISAID database on February 26, 2023. 24415 African sequences were retrieved from both databases so as to examine the number of lineages circulating within Africa. The two databases had only 8044 complete genome sequences combined from Africa, and these sequences excluding those with low coverage using NextClade were retrieved to determine spread dynamics. 5908 sequences from 23 African countries were available in the NCBI and 2137 sequences from 41 African countries from GISAID database. The sequences were aligned using the online version of the MAFFT multiple sequence alignment tool, with the Wuhan-Hu-1 (MN 908947.3) as the reference sequence, and sequences with more than 5.0% ambiguous letters were removed. Duplicates were removed using goalign dedup software and only high quality African complete sequences remained (n=7540). Phylogenetic reconstruction Using IQ-TREE multicore software version v1.6.12 and NextClade, phylogeny reconstruction on the dataset was performed numerous times. Lineage classification PANGOLin, a web application was used to classify sequences into their lineages. The objective was to determine the SARS-CoV-2 lineages that are circulating in Africa that are most important from an epidemiological perspective, as well as the lineage dynamics within and across the African continent, due to the fact that this naming system integrates genetic and geographic data concerning SARS-CoV-2 dynamics. Phylogeographic reconstruction VOC, (VOI) and VUM were designated based on the WHO framework as of 20 January 2022. We included one lineage, namely A.23.1 and labelled it as VOI for the purposes of this analysis. This lineage was included because it demonstrated the continued evolution of African lineages into potentially more transmissible variants. VOI, VOC, and VUM that emerged on the African continent were marked. These were A.23.1 (VOI), B.1.351 and B.1.1.529 (VOC), B.1.640, and B.1.525 (VUM). Genome sequences of these five lineages were extracted from NCBI database for phylogeographic reconstruction. A similar approach to that described above (including alignment using online MAFFT) was employed. Phylogeographic reconstruction for all variants circulating in Africa and all VOI, VOC, and VUM was conducted using PASTML.

  11. f

    Table S1 - Construction of Customized Sub-Databases from NCBI-nr Database...

    • plos.figshare.com
    docx
    Updated May 31, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ke Yu; Tong Zhang (2023). Table S1 - Construction of Customized Sub-Databases from NCBI-nr Database for Rapid Annotation of Huge Metagenomic Datasets Using a Combined BLAST and MEGAN Approach [Dataset]. http://doi.org/10.1371/journal.pone.0059831.s001
    Explore at:
    docxAvailable download formats
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Ke Yu; Tong Zhang
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Number of sequences derived from NCBI-nr database, which were annotated to the fatty acid metabolism pathway and bisphenol A degradation metabolism pathway. (DOCX)

  12. A

    Academic Research Databases Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Mar 15, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Archive Market Research (2025). Academic Research Databases Report [Dataset]. https://www.archivemarketresearch.com/reports/academic-research-databases-58991
    Explore at:
    pdf, doc, pptAvailable download formats
    Dataset updated
    Mar 15, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policyhttps://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global market for academic research databases is experiencing robust growth, projected to reach $388.2 million in 2025. While the exact Compound Annual Growth Rate (CAGR) is not provided, considering the ongoing digitalization of research and education, a conservative estimate would place the CAGR in the range of 7-9% for the forecast period (2025-2033). This growth is fueled by several key drivers. The increasing reliance on digital resources by students, teachers, and researchers across all academic disciplines is a significant factor. Furthermore, the expanding volume of scholarly publications and the need for efficient access and management of research data are propelling market expansion. The rising adoption of cloud-based solutions and the development of sophisticated search and analytical tools within these databases are also contributing to this growth trajectory. The market segmentation highlights the diverse user base, with students, teachers, and experts representing major segments, each with varying needs and subscription models (charge-based or free access). The competitive landscape is characterized by established players like Scopus, Web of Science, and PubMed, alongside other significant contributors like ERIC, ProQuest, and IEEE Xplore, indicating a market with both established dominance and emerging players vying for market share. Geographic distribution shows a strong presence across North America and Europe, but with significant growth potential in Asia-Pacific regions. The market's future trajectory will likely be shaped by several trends. The increasing integration of artificial intelligence (AI) for enhanced search and data analysis capabilities will be a major factor. The ongoing development of open-access initiatives and the expansion of free databases will influence market dynamics, potentially impacting the revenue streams of subscription-based services. However, challenges such as data security concerns, the need for continuous content updates, and the varying levels of digital literacy across different user groups may act as restraints on market growth. Nevertheless, the overall outlook for the academic research database market remains positive, driven by the continued expansion of scholarly research and the growing demand for efficient and reliable access to research information globally.

  13. d

    Bio Resource for Array Genes Database

    • dknet.org
    • scicrunch.org
    • +1more
    Updated Jan 29, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2022). Bio Resource for Array Genes Database [Dataset]. http://identifiers.org/RRID:SCR_000748
    Explore at:
    Dataset updated
    Jan 29, 2022
    Description

    Bio Resource for array genes is a free online resource for easy access to collective and integrated information from various public biological resources for human, mouse, rat, fly and c. elegans genes. The resource includes information about the genes that are represented in Unigene clusters. This resource provides interactive tools to selectively view, analyze and interpret gene expression patterns against the background of gene and protein functional information. Different query options are provided to mine the biological relationships represented in the underlying database. Search button will take you to the list of query tools available. This Bio resource is a platform designed as an online resource to assist researchers in analyzing results of microarray experiments and developing a biological interpretation of the results. This site is mainly to interpret the unique gene expression patterns found as biological changes that can lead to new diagnostic procedures and drug targets. This interactive site allows users to selectively view a variety of information about gene functions that is stored in an underlying database. Although there are other online resources that provide a comprehensive annotation and summary of genes, this resource differs from these by further enabling researchers to mine biological relationships amongst the genes captured in the database using new query tools. Thus providing a unique way of interpreting the microarray data results based on the knowledge provided for the cellular roles of genes and proteins. A total of six different query tools are provided and each offer different search features, analysis options and different forms of display and visualization of data. The data is collected in relational database from public resources: Unigene, Locus link, OMIM, NCBI dbEST, protein domains from NCBI CDD, Gene Ontology, Pathways (Kegg, Genmapp and Biocarta) and BIND (Protein interactions). Data is dynamically collected and compiled twice a week from public databases. Search options offer capability to organize and cluster genes based on their Interactions in biological pathways, their association with Gene Ontology terms, Tissue/organ specific expression or any other user-chosen functional grouping of genes. A color coding scheme is used to highlight differential gene expression patterns against a background of gene functional information. Concept hierarchies (Anatomy and Diseases) of MESH (Medical Subject Heading) terms are used to organize and display the data related to Tissue specific expression and Diseases. Sponsors: BioRag database is maintained by the Bioinformatics group at Arizona Cancer Center. The material presented here is compiled from different public databases. BioRag is hosted by the Biotechnology Computing Facility of the University of Arizona. 2002,2003 University of Arizona.

  14. n

    Data from: NCBI Taxonomy

    • data.niaid.nih.gov
    Updated Mar 29, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    TAXON (2021). NCBI Taxonomy [Dataset]. https://data.niaid.nih.gov/resources?id=ds_385ea4f5f9
    Explore at:
    Dataset updated
    Mar 29, 2021
    Dataset authored and provided by
    TAXON
    Description

    The NCBI Taxonomy database is a curated set of names and classifications for all organisms that are represented in the Entrez databases. The Taxonomy database attempts to incorporate phylogenetic and taxonomic knowledge from a variety of sources, including the published literature, web-based databases, and the advice of sequence submitters and outside taxonomy experts.

  15. d

    RefSeq: NCBI Reference Sequence Database

    • catalog.data.gov
    • datadiscovery.nlm.nih.gov
    • +3more
    Updated Jun 19, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    National Library of Medicine (2025). RefSeq: NCBI Reference Sequence Database [Dataset]. https://catalog.data.gov/dataset/refseq-ncbi-reference-sequence-database-f511f
    Explore at:
    Dataset updated
    Jun 19, 2025
    Dataset provided by
    National Library of Medicine
    Description

    A comprehensive, integrated, non-redundant, well-annotated set of reference sequences including genomic, transcript, and protein.

  16. r

    NCBI Genome Survey Sequences Database

    • rrid.site
    • neuinfo.org
    • +2more
    Updated Aug 10, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). NCBI Genome Survey Sequences Database [Dataset]. http://identifiers.org/RRID:SCR_002146
    Explore at:
    Dataset updated
    Aug 10, 2025
    Description

    Database of unannotated short single-read primarily genomic sequences from GenBank including random survey sequences clone-end sequences and exon-trapped sequences. The GSS division of GenBank is similar to the EST division, with the exception that most of the sequences are genomic in origin, rather than cDNA (mRNA). It should be noted that two classes (exon trapped products and gene trapped products) may be derived via a cDNA intermediate. Care should be taken when analyzing sequences from either of these classes, as a splicing event could have occurred and the sequence represented in the record may be interrupted when compared to genomic sequence. The GSS division contains (but is not limited to) the following types of data: * random single pass read genome survey sequences. * cosmid/BAC/YAC end sequences * exon trapped genomic sequences * Alu PCR sequences * transposon-tagged sequences Although dbGSS sequences are incorporated into the GSS Division of GenBank, annotation in dbGSS is more comprehensive and includes detailed information about the contributors, experimental conditions, and genetic map locations.

  17. NCBI Handbook

    • datasets.ai
    21
    Updated Sep 7, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    U.S. Department of Health & Human Services (2024). NCBI Handbook [Dataset]. https://datasets.ai/datasets/ncbi-handbook
    Explore at:
    21Available download formats
    Dataset updated
    Sep 7, 2024
    Dataset provided by
    United States Department of Health and Human Serviceshttp://www.hhs.gov/
    Authors
    U.S. Department of Health & Human Services
    Description

    An extensive collection of articles about NCBI databases, data models, and software.

  18. n

    GENSAT at NCBI - Gene Expression Nervous System Atlas

    • neuinfo.org
    • scicrunch.org
    • +2more
    Updated Mar 23, 2012
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2012). GENSAT at NCBI - Gene Expression Nervous System Atlas [Dataset]. http://identifiers.org/RRID:SCR_003923
    Explore at:
    Dataset updated
    Mar 23, 2012
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented on March 19, 2012. Due to budgetary constraints, the National Center for Biotechnology Information (NCBI) has discontinued support for the NCBI GENSAT database, and it has been removed from the Entrez System. The Gene Expression Nervous System Atlas (GENSAT) project involves the large-scale creation of transgenic mouse lines expressing green fluorescent protein (GFP) reporter or Cre recombinase under control of the BAC promoter in specific neural and glial cell populations. BAC expression data for all the lines generated (over 1300 lines) are available in online, searchable databases (www.gensat.org and the Database of GENSAT BAC-Cre driver lines). If you have any specific questions, please feel free to contact us at info_at_ncbi.nlm.nih.gov The GENSAT project aims to map the expression of genes in the central nervous system of the mouse, using both in situ hybridization and transgenic mouse techniques. Search criteria include gene names, gene symbols, gene aliases and synonyms, mouse ages, and imaging protocols. Mouse ages are restricted to E10.5 (embryonic day 10.5), E15.5 (embryonic day 15.5), P7 (postnatal day 7), and Adult (adult). The project focuses on two techniques * Evaluation of unmodified mice lines for expression of a given gene using radiolabelled riboprobes and in-situ hybridization. * Creation of transgenic mice lines containing a BAC construct that expresses a marker gene in the same environment as the native gene

  19. v

    Library LinkOut

    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • healthdata.gov
    • +6more
    Updated Jun 19, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    National Library of Medicine (2025). Library LinkOut [Dataset]. https://res1catalogd-o-tdatad-o-tgov.vcapture.xyz/dataset/library-linkout
    Explore at:
    Dataset updated
    Jun 19, 2025
    Dataset provided by
    National Library of Medicine
    Description

    LinkOut is a service that allows you to link directly from PubMed and other NCBI databases to a wide range of information and services beyond the NCBI systems. LinkOut aims to facilitate access to relevant online resources in order to extend, clarify, and supplement information found in NCBI databases. Third parties can link directly from PubMed and other Entrez database records to relevant Web-accessible resources beyond the Entrez system. Includes full-text publications, biological databases, consumer health information and research tools.

  20. r

    Data from: Indexed reference databases for KMA and CCMetagen

    • researchdata.edu.au
    Updated Apr 30, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dr Vanessa Rossetto Marcelino; Dr Vanessa Rossetto Marcelino; Dr Jan Buchmann; Clausen Philip (2019). Indexed reference databases for KMA and CCMetagen [Dataset]. http://doi.org/10.25910/5CC7CD40FCA8E
    Explore at:
    Dataset updated
    Apr 30, 2019
    Dataset provided by
    The University of Sydney
    Authors
    Dr Vanessa Rossetto Marcelino; Dr Vanessa Rossetto Marcelino; Dr Jan Buchmann; Clausen Philip
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Time period covered
    Apr 9, 2019 - Apr 30, 2019
    Description

    This database was built to identify taxa in metagenome samples using the CCMetagen pipeline. The whole NCBI nt collection allows a complete taxonomic overview, including from microbial eukaryotes that may be present in the dataset. This database is already indexed, ready to use with KMA and CCMetagen.

    A manual describing how to use this dataset can be found at: https://github.com/vrmarcelino/CCMetagen

    Additionally, a tutorial on the whole analysis of a set of metatranscriptome samples can be found at: https://github.com/vrmarcelino/CCMetagen/tree/master/tutorial

    The database was built as follows:

    The partially non-redundant nucleotide database was downloaded from the NCBI website (ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nt.gz) in January 2018. This database was formatted to include taxids in sequence headers.

    Indexing was then performed with KMA using the commands:

    kma_index -i nt_taxid.fas -o ncbi_nt -NI -Sparse TG

    Three indexed databases are provided:

    1. NCBI nucleotide collection
    2. RefSeq database of bacterial and fungal genomes
Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
(2025). NCBI Structure [Dataset]. http://identifiers.org/RRID:SCR_004218

NCBI Structure

RRID:SCR_004218, nlx_23947, NCBI Structure (RRID:SCR_004218), NCBI Structure

Explore at:
281 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Jul 6, 2025
Description

Database of three-dimensional structures of macromolecules that allows the user to retrieve structures for specific molecule types as well as structures for genes and proteins of interest. Three main databases comprise Structure-The Molecular Modeling Database; Conserved Domains and Protein Classification; and the BioSystems Database. Structure also links to the PubChem databases to connect biological activity data to the macromolecular structures. Users can locate structural templates for proteins and interactively view structures and sequence data to closely examine sequence-structure relationships. * Macromolecular structures: The three-dimensional structures of biomolecules provide a wealth of information on their biological function and evolutionary relationships. The Molecular Modeling Database (MMDB), as part of the Entrez system, facilitates access to structure data by connecting them with associated literature, protein and nucleic acid sequences, chemicals, biomolecular interactions, and more. It is possible, for example, to find 3D structures for homologs of a protein of interest by following the Related Structure link in an Entrez Protein sequence record. * Conserved domains and protein classification: Conserved domains are functional units within a protein that act as building blocks in molecular evolution and recombine in various arrangements to make proteins with different functions. The Conserved Domain Database (CDD) brings together several collections of multiple sequence alignments representing conserved domains, in addition to NCBI-curated domains that use 3D-structure information explicitly to define domain boundaries and provide insights into sequence/structure/function relationships. * Small molecules and their biological activity: The PubChem project provides information on the biological activities of small molecules and is a component of NIH''''s Molecular Libraries Roadmap Initiative. PubChem includes three databases: PCSubstance, PCBioAssay, and PCCompound. The PubChem data are linked to other data types (illustrated example) in the Entrez system, making it possible, for example, to retrieve information about a compound and then Link to its biological activity data, retrieve 3D protein structures bound to the compound and interactively view their active sites, and find biosystems that include the compound as a component. * Biological Systems: A biosystem, or biological system, is a group of molecules that interact directly or indirectly, where the grouping is relevant to the characterization of living matter. The NCBI BioSystems Database provides centralized access to biological pathways from several source databases and connects the biosystem records with associated literature, molecular, and chemical data throughout the Entrez system. BioSystem records list and categorize components (illustrated example), such as the genes, proteins, and small molecules involved in a biological system. The companion FLink icon FLink tool, in turn, allows you to input a list of proteins, genes, or small molecules and retrieve a ranked list of biosystems.

Search
Clear search
Close search
Google apps
Main menu