Database of non-redundant sets of protein - small-molecule complexes that are especially suitable for structure-based drug design and protein - small-molecule interaction research. The PSMDB supports:
- Frequent updates - The number of new structures in the PDB is growing rapidly, so frequent updates are required to make use of them. In contrast to manual procedures, which require significant time and effort per update, generation of the PSMDB database is fully automatic, thereby facilitating frequent database updates.
- Both protein and ligand structural redundancy - In the database, two complexes are considered redundant if they share a similar protein and a similar ligand (the protein - small-molecule non-redundant set). This allows the database to contain structural information for the same protein bound to several different ligands (and vice versa). Additionally, for completeness, the database contains a set of complexes that is non-redundant when only protein structural redundancy is considered (the protein non-redundant set). Images on the PSMDB website illustrate the structural redundancy of protein complexes in the PDB compared to the PSMDB.
- Efficient handling of covalent bonds - Many protein complexes contain covalently bound ligands. Protein-ligand databases typically discard these complexes; the PSMDB instead removes the covalently bound ligand from the complex, retaining any non-covalently bound ligands. This increases the number of usable complexes in the database.
- Separate protein and ligand files - The PSMDB contains individual structure files for both the protein and all non-covalently bound ligands. The unbound proteins are in PDB format, while the individual ligands are in SDF format (in their native coordinate frame).
A machine-usable dictionary containing thousands of words, each with linguistic and psycholinguistic attributes (psychological measures are recorded for a small percentage of words). The dictionary may be of use to researchers in psychology or linguistics who need to develop sets of experimental stimuli, or to those in artificial intelligence and computer science who require psychological and linguistic descriptions of words.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Catalan word database automatically extracted, using alignment techniques (Montreal Forced Aligner, MFA), from transcribed speech databases: Mozilla Common Voice, ParlamentParla, and OpenSLR-69. It is usable for training keyword-spotting models for home automation. MFA uses alignment algorithms to accurately synchronize the speech signal with the corresponding text at the phoneme level. Two versions of the database have been created: general, which encompasses all data and provides a comprehensive dataset for various analyses and applications; and split, which is divided into train, dev, and test sets to ease the task of training a keyword-spotting model. The split is made speaker-wise, with 80% of speakers in train, 10% in dev, and 10% in test.
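For illustration, a minimal sketch of a speaker-wise 80/10/10 split of utterance records (the speaker_id field and record layout are assumptions for the example, not the actual metadata schema of this database):

```python
import random
from collections import defaultdict

def speaker_wise_split(records, train=0.8, dev=0.1, seed=42):
    """Split utterance records into train/dev/test so that no speaker
    appears in more than one subset."""
    by_speaker = defaultdict(list)
    for rec in records:
        by_speaker[rec["speaker_id"]].append(rec)  # hypothetical key

    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)

    n_train = int(len(speakers) * train)
    n_dev = int(len(speakers) * dev)
    groups = {
        "train": speakers[:n_train],
        "dev": speakers[n_train:n_train + n_dev],
        "test": speakers[n_train + n_dev:],
    }
    return {name: [r for s in spk for r in by_speaker[s]]
            for name, spk in groups.items()}

# Toy usage: 100 utterances from 10 speakers
records = [{"speaker_id": f"spk{i % 10}", "word": "llum"} for i in range(100)]
print({k: len(v) for k, v in speaker_wise_split(records).items()})
```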
CC0 1.0 (https://spdx.org/licenses/CC0-1.0.html)
The scientific community has entered an era of big data. However, with big data comes big responsibility, and best practices for how data are contributed to databases have not kept pace with the collection, aggregation, and analysis of big data. Here, we rigorously assess the quantity of data for specific leaf area (SLA) available within the largest and most frequently used global plant trait database, the TRY Plant Trait Database, exploring how much of the data were applicable (i.e., original, representative, logical, and comparable) and traceable (i.e., published, cited, and consistent). Over three-quarters of the SLA data in TRY either lacked applicability or traceability, leaving only 22.9% of the original data usable, compared to the 64.9% typically deemed usable by standard data cleaning protocols. The remaining usable data differed markedly from the original for many species, which led to altered interpretation of ecological analyses. Though the data we consider here make up only 4.5% of SLA data within TRY, similar issues of applicability and traceability likely apply to SLA data for other species as well as other commonly measured, uploaded, and downloaded plant traits. We end with suggested steps forward for global ecological databases, including suggestions for both uploaders to and curators of databases, with the hope that, by addressing the issues raised here, we can increase data quality and integrity within the ecological community.
Methods: SLA data were downloaded from TRY (traits 3115, 3116, and 3117) for all conifer (Araucariaceae, Cupressaceae, Pinaceae, Podocarpaceae, Sciadopityaceae, and Taxaceae), Plantago, Poa, and Quercus species. The data have not been processed in any way, but additional columns have been added to the dataset that tell the viewer where each data point came from, how it was cited, how it was measured, whether it was uploaded correctly, whether it had already been uploaded to TRY, and whether it was uploaded by the individual who collected the data.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) (https://creativecommons.org/licenses/by-nc-sa/3.0/)
License information was derived automatically
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
These databases consolidate a variety of datasets related to the model organism Ruegeria pomeroyi DSS-3. The data were primarily generated by members of the Moran Lab at the University of Georgia, and put together in this format using anvi'o v7.1-dev through the collaborative efforts of Zac Cooper, Sam Miller, and Iva Veseli (special thanks to Christa Smith and Lidimarie Trujillo Rodriguez for their help with gene annotations). The data includes:
- (R_POM_DSS3-contigs.db) the complete genome and megaplasmid sequence of R. pomeroyi, along with highly-curated gene annotations established by the Moran Lab and automatically-generated annotations from NCBI COGs, KEGG KOfam/BRITE, Pfams, and anvi'o single-copy core gene sets. It also contains annotations for the Moran Lab's TnSeq mutant library (https://doi.org/10.1101/2022.09.11.507510; https://doi.org/10.1038/s43705-023-00244-6).
- (PROFILE-VER_01.db) read-mapping data, against the R. pomeroyi genome, from multiple transcriptome and metatranscriptome samples generated by the Moran Lab. Some coverage data are stored in the AUXILIARY-DATA.db file. These data can be visualized using anvi-interactive. Publicly available samples are labeled with their SRA accession numbers.
- (DEFAULT-EVERYTHING.db) gene-level coverage data from the transcriptome and metatranscriptome samples stored in the profile database, as well as per-gene normalized spectral abundance counts from proteomes matched to a subset of the transcriptomes, and gene mutant fitness data from https://doi.org/10.1073/pnas.2217200120. These data can also be visualized using anvi-interactive (see instructions below). The proteome data layers are labeled according to their matching transcriptome samples.
- (R_pom_reproducible_workflow.md) a reproducible workflow describing how the databases were generated.
Please note that using these databases requires the development version of anvi'o `v8-dev`, or a later version of anvi'o if available. They are not usable with anvi'o `v8` or earlier.
Instructions for visualizing the genes database in the anvi'o interactive interface: Anvi'o expects genes databases to be located in a folder called `GENES`, so in order to use the specific database included in this datapack, you must move it to the expected location by running the following commands in your terminal:
mkdir GENES
mv DEFAULT-EVERYTHING.db GENES/
Once that is done, you can use the following command to visualize the gene-level information:
anvi-interactive -c R_POM_DSS3-contigs.db -p PROFILE-VER_01.db -C DEFAULT -b EVERYTHING --gene-mode
To view only the proteomic data and its matched transcriptomes, you can add the flag `--state-autoload proteomes` to the above command.
To view all transcriptomes and the proteomes organized by study of origin, you can add the flag `--state-autoload figure` to the above command.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Our research reports a systematic literature review of 49 publications on security studies with software developer participants. The attached files are:
- A BibTeX file: all 49 references in BibTeX format.
- An Excel spreadsheet: our analysis of each publication. Each row represents a publication and the columns represent features that we analysed, such as the number of participants, whether there was a clear research question, or whether the paper reports ethics.
- Database queries: the actual queries that we executed on the databases.
From the repository: https://github.com/allenai/Break
Break is a human-annotated dataset of natural language questions and their Question Decomposition Meaning Representations (QDMRs). Break consists of 83,978 examples sampled from 10 question answering datasets over text, images and databases. This repository contains the Break dataset along with information on the exact data format.
Datasets:
- QDMR: Contains questions over text, images and databases annotated with their Question Decomposition Meaning Representation. In addition to the train, dev and (hidden) test sets, we provide lexicon_tokens files. For each question, the lexicon file contains the set of valid tokens that could potentially appear in its decomposition (Section 3).
- QDMR high-level: Contains questions annotated with the high-level variant of QDMR. These decompositions are exclusive to Reading Comprehension tasks (Section 2). lexicon_tokens files are also provided.
- logical-forms: Contains questions and QDMRs annotated with full logical forms of QDMR operators + arguments. Full logical forms were inferred by the annotation-consistency algorithm described in Section 4.3.
Data Format:
- QDMR & QDMR high-level: train.csv, dev.csv, test.csv:
  - question_id: The Break question id, of the format [ORIGINAL DATASET]_[original split]_[original id]. E.g., NLVR2_dev_dev-1049-1-1 is from the NLVR2 dev split, with its NLVR2 id being dev-1049-1-1.
  - question_text: Original question text.
  - decomposition: The annotated QDMR of the question, with its steps delimited by ";". E.g., return flights ;return #1 from washington ;return #2 to boston ;return #3 in the afternoon.
  - operators: List of tagged QDMR operators for each step. QDMR operators are fully described in Section 2 of the paper. The 14 potential operators are select, project, filter, aggregate, group, superlative, comparative, union, intersection, discard, sort, boolean, arithmetic, and comparison. Unidentified operators are tagged with None.
  - split: The Break dataset split of the example, train / dev / test.
  train_lexicon_tokens.json, dev_lexicon_tokens.json, test_lexicon_tokens.json:
  - "source": The source question.
  - "allowed_tokens": The set of valid lexicon tokens that can appear in the QDMR of the question. For the method used to generate lexicon tokens see here.
- logical-forms: train.csv, dev.csv, test.csv:
  - question_id: Same as before.
  - question_text: Same as before.
  - decomposition: Same as before.
  - program: List of QDMR operators and arguments that the original QDMR was mapped to. E.g., for the QDMR return citations ;return #1 of Making database systems usable ;return number of #2, its program is [ SELECT['citations'], FILTER['#1', 'of Making database systems usable'], AGGREGATE['count', '#2'] ].
  - operators: Same as before.
  - split: Same as before.
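As an illustration of the decomposition format described above, here is a minimal sketch (not part of the official Break tooling) that splits a QDMR string into steps and resolves the #k references to earlier steps:

```python
import re

def parse_qdmr(decomposition: str):
    """Split a QDMR string into steps and record which earlier steps each one references."""
    steps = [s.strip() for s in decomposition.split(";") if s.strip()]
    parsed = []
    for i, step in enumerate(steps, start=1):
        refs = [int(m) for m in re.findall(r"#(\d+)", step)]  # "#3" = output of step 3
        parsed.append({"step": i, "text": step, "references": refs})
    return parsed

qdmr = ("return flights ;return #1 from washington "
        ";return #2 to boston ;return #3 in the afternoon")
for step in parse_qdmr(qdmr):
    print(step)
```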
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Background: The ability to apply standard and interoperable solutions for implementing and managing medical registries, as well as to aggregate, reproduce, and access data sets from legacy formats and platforms in advanced standard formats and operating systems, is crucial for both clinical healthcare and biomedical research settings.
Purpose: Our study describes a reproducible, highly scalable, standard framework for a device registry implementation addressing both local data quality components and global linking problems.
Methods and Results: We developed a device registry framework involving the following steps: (1) data standards definition and representation of the research workflow, (2) development of electronic case report forms using REDCap (Research Electronic Data Capture), (3) data collection according to the clinical research workflow, (4) data augmentation by enriching the registry database with local electronic health records, a governmental database and linked open data collections, (5) data quality control and (6) data dissemination through the registry Web site. Our registry adopted all applicable standardized data elements proposed by the American College of Cardiology / American Heart Association Clinical Data Standards, as well as variables derived from randomized trials of cardiac devices and the Clinical Data Interchange Standards Consortium. Local interoperability was established between REDCap and data derived from the Electronic Health Record system. The original data set was also augmented by incorporating the reimbursed values paid by the Brazilian government during a hospitalization for pacemaker implantation. By linking our registry to the open data collection repository Linked Clinical Trials (LinkedCT) we found 130 clinical trials that are potentially correlated with our pacemaker registry.
Conclusion: This study demonstrates how standard and reproducible solutions can be applied in the implementation of medical registries to constitute a re-usable framework. Such an approach has the potential to facilitate data integration between healthcare and research settings, and provides a useful framework for other biomedical registries.
This Excel file is Table S1: Contamination levels for 111,088 bacterial genomes of NCBI RefSeq. This table includes all output fields of CheckM classic contamination estimation (option lineage_wf). Estimates from Physeter in k-fold mode are additionally provided for the 12,326 genomes for which the CheckM values were not usable. The reasons for the rejection of the CheckM estimates in these cases are given.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Probabilistic volcanic hazard assessment (PVHA) has become the paradigm for quantifying volcanic hazard over the last decades. Substantial aleatory and epistemic uncertainties in PVHA arise from the complexity of physico-chemical processes, the impossibility of their direct observation and, importantly, a severe scarcity of observables from past eruptions. One factor responsible for data scarcity is the infrequency of moderate and large eruptions; other factors include the limited discoverability and accessibility of volcanological data. Open-access databases can help alleviate data scarcity and have contributed significantly to long-term PVHA of eruption onset and size, but they are less common for the data required by other PVHA components (e.g., vent opening). Making datasets open is complicated by economic, technological, ethical and/or policy-related challenges. International synergies (e.g., Global Volcanism Program, WOVOdat, Global Volcano Model, EPOS) will be key to facilitating the creation and maintenance of open-access databases that support Next-Generation PVHA. Additionally, clarifying some misconceptions about PVHA can also help progress. Firstly, PVHA should be understood as an expansion of deterministic, scenario-based hazard assessments. Secondly, a successful PVHA should sometimes be evaluated by its ability to deliver useful and usable hazard-related messages that help mitigate volcanic risk. Thirdly, PVHA is not simply an end product but a driver for research: identifying the most relevant sources of epistemic uncertainty can guide future efforts to reduce the overall uncertainty. Broadening the volcanological community's expertise to include statistics and engineering has already brought major breakthroughs in long-term PVHA. A vital next step is developing and maintaining more open-access datasets that support PVHA worldwide.
ELRA VAR License (https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf)
ELRA END USER License (https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf)
The (polyphone-like) English SpeechDat(M) database was recorded within the framework of the SPEECHDAT(M) Project. It consists of 1,000 speakers, chosen according to their individual demographics, who were recorded over digital telephone lines using fixed telephone sets. The material to be spoken was provided to the caller via a prompt sheet. The database is divided into two sub-sets: the phonetically rich sentences (one CD), known as DB2, and the application-oriented utterances (two CDs), known as DB1. The recorded material in DB1 comprises immediately usable and relevant speech, including number and letter sequences, common control keywords, dates, times, money amounts, etc. This provides a realistic basis for using these resources for the training and assessment of speaker-independent recognition of both isolated and continuous speech utterances, employing whole-word modeling and/or phoneme-based approaches. The speech sample rate is 8 kHz with 8-bit A-law encoding, giving a data rate of 64 kbit/s. A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
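For reference, a quick arithmetic check of the data rate implied by these parameters (plain arithmetic, unrelated to any SpeechDat tooling):

```python
sample_rate_hz = 8_000   # 8 kHz sampling
bits_per_sample = 8      # 8-bit A-law quantisation

bitrate = sample_rate_hz * bits_per_sample  # bits per second
print(f"{bitrate} bit/s = {bitrate / 1000:.0f} kbit/s = {bitrate / 8000:.0f} kB/s")
# 64000 bit/s = 64 kbit/s = 8 kB/s
```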
A geographic database of lakes on the Antarctic Peninsula compiled over the past five years from a number of information sources: satellite images, aerial photography, old maps and reports. The database fields include: lake unique id; name; location; image reference/how identified; locality; size (longest axis); area; type (as per Hutchinson's lake classification); reference (any existing scientific work on the lake); salinity; depth; x co-ordinate; y co-ordinate. Many of the lakes are previously unknown, and very few have been studied before. The list represents the first attempt to collate all the lakes in the area into one usable dataset.
The data are available as a downloadable text file with point co-ordinates, or as a polygon coverage downloadable from the Antarctic Digital Database.
IMGT/GENE-DB is the comprehensive IMGT genome database for immunoglobulin (IG) and T cell receptor (TR) genes from human and mouse and, in development, from other vertebrates. IMGT/GENE-DB is the international reference for IG and TR gene nomenclature and works in close collaboration with the HUGO Nomenclature Committee, the Mouse Genome Database and genome committees for other species. IMGT/GENE-DB allows a search of IG and TR genes by locus, group and subgroup, which are CLASSIFICATION concepts of IMGT-ONTOLOGY. Shortcuts allow retrieval of gene information by gene name or clone name. Direct links with configurable URLs give access to information usable by humans or programs. An IMGT/GENE-DB entry displays accurate gene data related to the genome (gene localization), allelic polymorphisms (number of alleles, IMGT reference sequences, functionality, etc.), gene expression (known cDNAs), and proteins and structures (Protein displays, IMGT Colliers de Perles). It provides internal links to the IMGT sequence databases and to the IMGT Repertoire Web resources, and external links to genome and generalist sequence databases. IMGT/GENE-DB manages the IMGT reference directory used by the IMGT tools for IG and TR gene and allele comparison and assignment, and by the IMGT databases for gene data annotation. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
The Center for Archaeology and Society (CAS), the Phoenix Area Office of the Bureau of Reclamation, and the Center for Digital Antiquity (DA) have created and are making freely available, via tDAR (the Digital Archaeological Record), a large collection of reports, articles, and data sets resulting from the archaeological investigations undertaken for the Theodore Roosevelt Dam project in the Tonto Basin of central Arizona. At present, this tDAR collection includes over two dozen volumes (more than 11,200 pages), plus several articles that present the results of the investigations undertaken as a part of the Roosevelt Dam project. In addition, we present 205 spreadsheets of key data tables extracted from the comprehensive database of the largest of these projects (the Roosevelt Platform Mound Study [RPMS]) along with the complete database of archaeological data for that project. We intend to continue to expand this collection, especially with databases and extracted spreadsheets from the other two projects. Making the collection of data and information available in tDAR allows anyone with an Internet connection to benefit from unlimited, text-searchable access to the full set of reports that represents core documentation of the Salado phenomenon, important aspects of the ancient Hohokam culture, and a detailed case study of the economic and social organization of village-scale human societies. By providing access to key data tables and the full database we hope to facilitate and stimulate comparative studies and additional analysis of this enormous set of data that will further advance our knowledge of these ancient cultures and the workings of human societies more generally.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Annex 1 - plants powered by RES in the Lazio Region; Annex 2 - Electricity consumption, RES electricity production and percentages of electricity consumption from local RES for each Lazio Municipality; Annex 3 - Additional PV power and PV surface for each Lazio Municipality
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
LukProt is an extension of the EukProt database with additional species added, mostly from undersampled animal taxa and some holozoan taxa. The database is composed of sequences translated from annotated genomes, transcriptomes or ESTs. The main purposes of the database are to consolidate sequences from undersampled animal taxa and to provide usable search tools. The publication associated with LukProt can be found here: https://doi.org/10.1093/gbe/evae231.
The current version of the database (v1.5.1) is based on EukProt v3. The home of all public versions of LukProt is this page (Zenodo).
Proteomes that are novel in LukProt are denoted as LPXXXXX and those coming from AniProtDB are called APXXXXX. The sequence IDs from EukProt are conserved in LukProt. This means that each sequence is assigned an ID in the following format:
(A/E/L)PXXXXX_Species_epithet_(strain)_PYYYYYY
where XXXXX is a number from 00001 to 99999 and YYYYYY is a number from 000001 to 999999. Each sequence is assigned a unique number YYYYYY, and each taxon a unique number XXXXX. All the IDs are compatible with the BLAST v5 "-parse_seqids" option, and the database can be readily deployed, for example on a server running SequenceServer. Within each of the source FASTA files, the source sequence identifier was kept after a blank space, so that it can still be retrieved if needed.
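For illustration, a rough sketch of parsing IDs of this form (the regular expression is an assumption based on the description above; strain suffixes and exact species formatting may vary):

```python
import re

# Assumed pattern: (A/E/L)P + 5-digit taxon number, species epithet (optionally
# with strain), then P + 6-digit sequence number.
LUKPROT_ID = re.compile(r"^(?P<source>[AEL]P)(?P<taxon>\d{5})_(?P<species>.+)_P(?P<seq>\d{6})$")

def parse_lukprot_id(seq_id: str) -> dict:
    m = LUKPROT_ID.match(seq_id)
    if m is None:
        raise ValueError(f"Not a LukProt-style ID: {seq_id}")
    return m.groupdict()

# Hypothetical example ID, for format illustration only:
print(parse_lukprot_id("LP00042_Homo_sapiens_P000123"))
```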
A publicly available BLAST server providing LukProt search is available at: https://lukprot.hirszfeld.pl/.
Comparison of EukProt v2/v3, LukProt v1.4.1 and LukProt v1.5.1 in their main areas of difference:

Taxogroup | EukProt v2 | EukProt v3 | LukProt v1.4.1 | LukProt v1.5.1
--- | --- | --- | --- | ---
Holozoa (excluding Metazoa) | 31 | 40 | 39 | 43
Ctenophora | 2 | 2 | 35 | 38
Porifera | 4 | 5 | 30 | 47
Placozoa | 2 | 2 | 3 | 6
Cnidaria | 3 | 5 | 65 | 88
Bilateria | 51 | 51 | 94 | 142
Included with the database are:
ready to use main database files:
LukProt_v1.5.1_single_species_FASTA.7z – per-species FASTA files with the sequences - 7-zipped, uncompressed size: 17.6 GB
to concatenate all of them into one file, run this in the parent directory: for file in $(find . -type f -name "*.fasta"); do awk 'FNR==1{print ""}1' "$file" >> LukProt_v1.5.1.fa; done. This will create a single FASTA file with all the sequences in the parent directory. awk is used instead of cat to insert a newline before each file's contents, because cat would sometimes merge the last sequence of one file with the header of the next.
LukProt_v1.5.1_full_BLAST_db.7z – a preformatted, full BLAST database (NCBI BLAST database format version: v5, masked with segmasker), uncompressed size: 28.3 GB
LukProt_v1.5.1_taxogroup_BLAST_db.7z – a collection of BLAST databases where each proteome is one taxogroup and is placed within the eukaryotic tree of life directory structure, uncompressed size: 26.3 GB
LukProt_v1.5.1_single_species_BLAST_db.7z – a collection of BLAST databases where each proteome is one BLAST database and is placed within the eukaryotic tree of life directory structure, uncompressed size: 26.4 GB
auxiliary database files:
LukProt_v1.5.1.cdhit70.7z – the full database clustered at 70% identity using CD-HIT with the following command: cd-hit -g 1 -d 0 -T 20 -M 90000 -c 0.7 -uL 0.2 -uS 0.9 -s 0.2, uncompressed sizes: fasta file - 11 GB, clstr file - 2.5 GB
LukProt_IDs_mapped.txt.gz – a text file mapping the LukProt IDs to the AniProtDB IDs and EukProt IDs that are different
BUSCO_tables.ods – a spreadsheet with full result tables generated by BUSCO analysis
OMAmer_output.zip – a folder with full results of OMAmer analyses (includes per-sequence taxonomy classification)
OMArk_output.zip – a folder with the results of all OMArk analyses
metadata:
README.md – a README file describing the metadata
LukProt_metadata_sheet.ods – main metadata file. A spreadsheet with information about each proteome (in an open .ods format, most compatible with LibreOffice)
LukProt_metadata_other.zip – an archive with other metadata files, documented in the README. Contents include:
the LukProt taxonomy in various formats
supporting scripts for data manipulation and visualization
a recoloring script (modified by LFS, originally by Dr. Celine Petitjean). The script is in the public domain and is reuploaded here only for convenience.
other files - see README
changelog.md – database changelog
Words of caution:
The database has been synchronized to EukProt v3 in version v1.5.1. This means that identifiers were modified in comparison to LukProt v1.4.1. The convention is not expected to change any further in future updates.
Many proteomes, especially transcriptome-based ones, may contain contamination from other species. In addition, translation algorithms often introduce errors (e.g. a transcript may not represent a full-length protein). For this reason, to obtain accurate sequences from each organism, users are directed to the source data and to the included OMAmer, OMArk and BUSCO results for details.
The taxonomy is different from UniEuk/EukMap, but UniEuk data were integrated where possible.
A few NCBI taxids are missing and will be added in due course.
Proteomes from NCBI and UniProt will be updated to current versions.
A number of proteomes listed in some metadata are unpublished and were held back.
While the database contains metadata that present a particular phylogeny of animals, holozoans and other eukaryotes, no particular claims or hypotheses are made by the author(s). However, in the future, efforts will be made to name clades officially once they are more firmly established.
Please report any problems or suggestions to Lukasz Sobala: lukasz.sobala (at) hirszfeld.pl.
Acknowledgements:
Andrew E. Allen Lab for creating the original PhyloDB.
Daniel Richter et al. for creating EukProt and keeping it updated.
Members of the Multicellgenome Lab, especially Michelle Leger (for donating her database), for the bioinformatics support and for doing great science.
All the authors of the original data.
National Science Centre of Poland for funding of the project 2020/36/C/NZ8/00081, "The role of glycosylation in the emergence of animal multicellularity", which enabled the creation of this database.
The Storm Events Database documents storms and other weather phenomena having sufficient intensity to cause loss of life, injuries, property damage, and/or disruption to commerce across the United States. Records date from 1950 up to the current year. Created by the National Weather Service and the National Centers for Environmental Information, the dataset is presented here by The Internet of Water Coalition, The Commons, and Earth Genome.
The SED was developed for the Internet of Water Coalition in partnership with Duke University's Nicholas Institute for Energy, Environment and Sustainability, Earth Genome, and The Commons. This collaboration highlights the power of accessible, connected data and visualization platforms, and demonstrates how the water data community can transform massive federal datasets into usable forms.
The IoW Storm Events Database (SED) is a comprehensive data exploration tool that gathers millions of data points from the National Oceanic and Atmospheric Administration. By bringing NOAA's vast storm data collection into a single, intuitive platform, the SED makes it easier than ever to unlock new insights and access critical weather information.
The SED lowers the technical barriers to accessing this data so that anyone (including researchers, emergency planners, and community leaders) can make data-driven decisions to protect their community. Users can search and download data for a variety of weather events, such as extreme heat and cold, tropical storms, and flooding, from the past 50+ years.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
This resource is about projects in the Indian Crop Phenome Database (ICPD). The ICPD is a domain of the Indian Biological Data Center (IBDC), Regional Centre for Biotechnology, Faridabad, developed for the digitization of crop phenome data. As India is a global agricultural powerhouse, the bulk of biological data generated in the country is associated with agricultural trials. Ironically, most of this trial data has been inaccessible to other researchers, remains unpublished, and is lost as time passes. Therefore, the ICPD is intended to act as a single-stop, user-friendly platform for freely archiving, organizing, analyzing, and sharing multi-crop phenome data following the FAIR (Findable, Accessible, Interoperable and Re-usable) data principles. [from homepage]
The HCV Immunology Database contains a curated inventory of immunological epitopes in HCV and their interaction with the immune system, with associated retrieval and analysis tools. Funding for the HCV database project has stopped, and this website and the HCV immunology database are no longer maintained. The site will stay up, but problems will not be fixed. The database was last updated in September 2007. The HIV immunology website contains the same tools and may be usable for non-HCV-specific analyses. For new epitope information, users of this database can try the Immune Epitope Database (http://www.immuneepitope.org).