A curated catalog of biological databases worldwide, intended to provide an overview of the database landscape and to enable easy retrieval of, and access to, specific collections of databases of interest. The catalog includes curated meta information on each database as well as derived statistics.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Activity data for small molecules are invaluable in chemoinformatics. Various bioactivity databases exist containing detailed information of target proteins and quantitative binding data for small molecules extracted from journals and patents. In the current work, we have merged several public and commercial bioactivity databases into one bioactivity metabase. The molecular presentation, target information, and activity data of the vendor databases were standardized. The main motivation of the work was to create a single relational database which allows fast and simple data retrieval by in-house scientists. Second, we wanted to know the amount of overlap between databases by commercial and public vendors to see whether the former contain data complementing the latter. Third, we quantified the degree of inconsistency between data sources by comparing data points derived from the same scientific article cited by more than one vendor. We found that each data source contains unique data which is due to different scientific articles cited by the vendors. When comparing data derived from the same article we found that inconsistencies between the vendors are common. In conclusion, using databases of different vendors is still useful since the data overlap is not complete. It should be noted that this can be partially explained by the inconsistencies and errors in the source data.
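The two comparisons described above (which articles only one vendor cites, and where vendors disagree on a data point taken from the same article) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the record layout, the vendor names, the sample values, and the 1% tolerance are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical records: (source, article_id, compound, target, activity_nM).
# All values are invented for illustration.
records = [
    ("vendor_a", "pmid:1", "cmpd1", "targetX", 10.0),
    ("vendor_b", "pmid:1", "cmpd1", "targetX", 10.0),
    ("vendor_a", "pmid:2", "cmpd2", "targetY", 5.0),
    ("vendor_b", "pmid:2", "cmpd2", "targetY", 50.0),
    ("vendor_b", "pmid:3", "cmpd3", "targetZ", 7.0),
]


def articles_by_source(rows):
    """Map each data source to the set of articles it cites."""
    cited = defaultdict(set)
    for source, article, *_ in rows:
        cited[source].add(article)
    return dict(cited)


def inconsistent_points(rows, rel_tol=0.01):
    """Return (article, compound, target) keys reported by two or more
    sources whose activity values differ by more than rel_tol (assumed)."""
    values = defaultdict(dict)
    for source, article, compound, target, value in rows:
        values[(article, compound, target)][source] = value
    conflicts = []
    for key, by_source in values.items():
        if len(by_source) < 2:
            continue  # only one source reports this data point
        lo, hi = min(by_source.values()), max(by_source.values())
        if hi - lo > rel_tol * hi:
            conflicts.append(key)
    return sorted(conflicts)


cited = articles_by_source(records)
unique_to_b = cited["vendor_b"] - cited["vendor_a"]  # articles only vendor_b cites
conflicts = inconsistent_points(records)
```

In this toy data set, one article is cited only by the second vendor, and one shared data point is reported with a tenfold difference, which is the kind of discrepancy the description refers to.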
Supplementary material A.3 for the paper 'IDPredictor: predict database links in biomedical database'. Abstract: Knowledge found in biomedical databases, in particular in Web information systems, is a major bioinformatics resource. In general, this biological knowledge is represented worldwide in a network of databases. These data are spread among thousands of databases, which overlap in content but differ substantially with respect to content detail, interface, formats and data structure. To support a functional annotation of lab data, such as protein sequences, metabolites or DNA sequences, as well as a semi-automated data exploration in information retrieval environments, an integrated view of databases is essential. Search engines have the potential to assist in data retrieval from these structured sources, but fall short of providing a comprehensive knowledge excerpt from the interlinked databases. A prerequisite for supporting the concept of an integrated data view is to acquire insights into cross-references among database entities. But only a fraction of all possible cross-references are explicitly tagged in the particular biomedical information systems. In this work, we investigate to what extent an automated construction of an integrated data network is possible. We propose a method that predicts and extracts cross-references from multiple life science databases and their possible referenced data targets. We study the retrieval quality of our method and the relationship between manually crafted relevance ranking and relevance ranking based on cross-references, and report on first, promising results.
Maintains and provides archival, retrieval and analytical resources for biological information. The central DDBJ resource consists of public, open-access nucleotide sequence databases including raw sequence reads, assembly information and functional annotation. Database content is exchanged with EBI and NCBI within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). In 2011, DDBJ launched two new resources: the DDBJ Omics Archive (DOR) and BioProject. DOR is an archival database of functional genomics data generated by microarrays and highly parallel new generation sequencers. Data are exchanged between ArrayExpress at EBI and DOR in the common MAGE-TAB format. BioProject provides an organizational framework for accessing metadata about research projects and the data from those projects that are deposited into different databases.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To enable the identification of mutated peptide sequences in complex biological samples, in this work, a cancer protein database with mutation information collected from several public resources such as COSMIC, IARC P53, OMIM and UniProtKB, was developed. In-house developed Perl-scripts were used to search and process the data, and to translate each gene-level mutation into a mutated peptide sequence. The cancer mutation database comprises a total of 872,125 peptide entries from 25,642 protein IDs. A description line for each entry provides the parent protein ID and name, the cDNA- and protein-level mutation site and type, the originating database, and the cancer tissue type and corresponding hits. The database is FASTA formatted to enable data retrieval by commonly used tandem MS search engines.
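As a rough illustration of the step described above (turning a protein-level mutation into a mutated peptide and writing it as a FASTA entry), here is a minimal Python sketch. It is not the authors' Perl code: the function names, the `p.R5H` notation handling, the flank size, the header layout, and the example sequence and IDs are all assumptions chosen for the example, and only single amino acid substitutions are handled.

```python
import re


def parse_substitution(mutation):
    """Split a substitution such as 'p.R5H' into (ref, 1-based position, alt)."""
    match = re.fullmatch(r"p\.([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unsupported mutation format: {mutation}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitution(protein_seq, mutation):
    """Return the protein sequence with the substitution applied."""
    ref, pos, alt = parse_substitution(mutation)
    if pos < 1 or pos > len(protein_seq) or protein_seq[pos - 1] != ref:
        raise ValueError(f"reference residue mismatch for {mutation}")
    return protein_seq[: pos - 1] + alt + protein_seq[pos:]


def peptide_window(seq, pos, flank=3):
    """Return the residues within `flank` positions of the 1-based `pos`."""
    start = max(0, pos - 1 - flank)
    return seq[start : pos + flank]


def fasta_entry(protein_id, mutation, source_db, peptide):
    """Format one entry; this header layout is hypothetical."""
    return f">{protein_id}|{mutation}|{source_db}\n{peptide}\n"


# Invented example sequence and identifiers.
wild_type = "MKTARLLVG"
mutated = apply_substitution(wild_type, "p.R5H")
_, position, _ = parse_substitution("p.R5H")
peptide = peptide_window(mutated, position)
entry = fasta_entry("P00000", "p.R5H", "EXAMPLE_DB", peptide)
```

Checking the reference residue before substituting is a cheap guard against position offsets between the mutation record and the protein sequence version it is applied to.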
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The EBI is a centre for research and services in bioinformatics. The Institute manages databases of biological data including nucleic acid, protein sequences and macromolecular structures.
As we move towards understanding biology at the systems level, access to large data sets of many different types has become crucial. Technologies such as genome-sequencing, microarrays, proteomics and structural genomics have provided 'parts lists' for many living organisms, and researchers are now focusing on how the individual components fit together to build systems. The hope is that scientists will be able to translate their new insights into improving the quality of life for everyone. However, the high-throughput revolution also threatens to drown us in data. There is an ongoing, and growing, need to collect, store and curate all this information in ways that allow its efficient retrieval and exploitation. The European Bioinformatics Institute is one of the few places in the world that has the resources and expertise to fulfil this important task.
Databases that represent sets of pre-compiled information on biological relationships and associations, interactions and facts which have been extracted from the biomedical literature using Ariadne's MedScan technology. ResNet databases store information harvested from the entire PubMed in a formal structure that allows searching, retrieval and updating by Pathway Studio users. ResNet is seamlessly installed when Pathway Studio is installed.
There are several available ResNet databases:
* ResNet Mammalian Database includes data for Human, Rat, and Mouse
* ResNet Plant Database has data on Arabidopsis, Rice and several other plants
Features of ResNet:
* All extracted relations have linked access to the original article or abstract
* Synonyms and homologs are included to maintain gene identity and to obviate redundancy in search results
* Users can update ResNet as often as required using the MedScan technology built into all Ariadne products
* Updates are made available by Ariadne every quarter
To purchase Pathway Studio software with ResNet database, for information, or to schedule a web demonstration, call our sales department at (240) 453-6272, or (866) 340-5040 (toll free).
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
To enable the identification of mutated peptide sequences in complex biological samples, in this work, two novel cancer- and disease-related protein databases with mutation information collected from several public resources such as COSMIC, IARC P53, OMIM, and UniProtKB were developed. In-house developed Perl scripts were used to search and process the data and to translate each gene-level mutation into a mutated peptide sequence. The cancer and disease mutation databases comprise a total of 872 125 and 27 148 peptide entries from 25 642 and 2913 proteins, respectively. A description line for each entry provides the parent protein ID and name, the cDNA- and protein-level mutation site and type, the originating database, and the disease or cancer tissue type and corresponding hits. The two databases are FASTA-formatted to enable data retrieval by commonly used tandem MS search engines. While the largest number of mutations were encountered for the amino acids A/D/E/G/L/P/R/S, the global mutation profiles replicate closely the outcome of the 1000 Genomes Project aimed at cataloguing natural mutations in the human population. The affected proteins were primarily involved in transcription regulation, splicing, protein synthesis/folding/binding, redox/energy production, adhesion/motility, and to some extent in DNA damage repair and signaling. The applicability of the database to identifying the presence of mutated peptides was investigated with MCF-7 breast cancer cell extracts.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Open Access and Open Data/Code Policies 2012.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a summary table of the Heatwaves and Coral-Recovery Database (HeatCRD) introduced in van Woesik and Kratochwill (2024). The HeatCRD is the most comprehensive reference on coral recovery following marine heatwaves and other disturbances, encompassing 29,205 data records spanning 44 years from 12,266 sites, 83 countries, and 160 data sources. These data provide essential information to coral-reef scientists and managers to best guide coral-reef conservation efforts at both local and regional scales. The dataset includes metadata for coral reef sampling events, such as site descriptions, geographical coordinates, depth, distance to shore, exposure, turbidity, coral cover percentages, MPA descriptions, temperature measurements, windspeed, and thermal stress indicators over 23 years.
See van Woesik and Kratochwill (2024) https://doi.org/10.1038/s41597-024-03221-3 for more information.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Classification of Journal Policies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Publishing Houses for Journal Titles.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Neocryptolepine is a natural alkaloid of the indoloquinoline class, isolated from the African climbing plant Cryptolepis sanguinolenta. It has become a natural precursor widely studied by medicinal chemists because of its diverse biological activities, especially its potential anti-tumor, anti-inflammatory and anti-malarial applications. As a natural product with multiple biological activities, Neocryptolepine has great potential in cancer treatment research, and its further research and development may provide new treatment options for cancer patients in the future. Cancer, as a global health challenge, has long plagued the medical community and patients. It is a disease caused by the unlimited proliferation, invasion, and metastasis of abnormal cells, and it can affect any part of the human body. With changes in lifestyle, worsening environmental pollution and an aging population, the incidence of cancer has increased year by year, and cancer has become the second leading cause of death in the world. Despite this potential, the structural diversity of Neocryptolepine derivatives, their varied mechanisms, and their unknown targets of action make it extremely challenging to identify their anti-cancer pathways. In addition, it is difficult to find systematic information on anti-cancer Neocryptolepine derivatives among large, scattered sources such as the internet. Neocryptolepine derivatives, which are based on a natural compound, have shown great potential and diversity in cancer treatment.
Despite facing challenges in screening and utilization, they remain important resources for drug development. To construct the NDADS database, authoritative literature search websites such as PubMed and Google Scholar were used to systematically collect key information on the generic name, anti-tumor activity, cancer type, mechanism of action, and targets of Neocryptolepine and its derivatives, using keywords such as Neocryptolepine, Cancer, and Target. These data were then integrated with data on 85 Neocryptolepine derivatives from the laboratory, ultimately forming a database containing information on 203 anti-tumor compounds derived from Neocryptolepine. To integrate and evaluate numerous research resources and results, the Neocryptolepine derivatives anti-tumor database provides retrieval and analysis tools, such as cross-database retrieval, citation retrieval, and journal retrieval, enabling users to easily search for anti-tumor information on Neocryptolepine derivatives. The database currently covers the names, structures, molecular weights, activities, functions, cancer types, cancer cells, targets/signaling pathways, references, and corresponding website sources of the compounds. The interface supports queries on all of these fields. The anti-tumor database of Neocryptolepine derivatives will therefore help to study the potential of these derivatives in cancer treatment from multiple aspects, such as activity, structure, mechanism of action, and target, assisting in cancer treatment and improving cancer survival rates.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ISI Classifications Represented in the Journal Titles.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Distribution of Impact Factors for Journal Titles.
Functional Annotation of Variants - Online Resource (FAVOR, https://favor.genohub.org) is a comprehensive whole-genome variant annotation database and a variant browser, providing hundreds of functional annotation scores from a variety of aspects of variant biological function. This FAVOR Essential Database (aGDS Format) is comprised of a collection of essential annotation scores for all possible SNVs (8,812,917,339) and observed indels (79,997,898) in Build GRCh38/hg38, including variant info, chromosome, position, reference allele, alternative allele, aPC-Conservation, aPC-Epigenetics, aPC-Epigenetics-Active, aPC-Epigenetics-Repressed, aPC-Epigenetics-Transcription, aPC-Local-Nucleotide-Diversity, aPC-Mappability, aPC-Mutation-Density, aPC-Protein-Function, aPC-Proximity-To-TSSTES, aPC-Transcription-Factor, CAGE promoter, CAGE, MetaSVM, rsID, FATHMM-XF, Gencode Comprehensive Category, Gencode Comprehensive Info, Gencode Comprehensive Exonic Category, Gencode Comprehensive Exonic Info, GeneHancer, LINSIGHT, CADD, rDHS. These annotation scores are stored in annotated Genomic Data Structure (aGDS) file format (without genotype data) to support fast query and retrieval at variant-level. The aGDS file can then facilitate a wide range of functionally-informed downstream analyses.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Journal Review and Hosting Policies, 2012.