The NIST Chemical Kinetics Database includes essentially all reported kinetics results for thermal gas-phase chemical reactions. The database is designed to be searched for kinetics data based on the specific reactants involved, for reactions resulting in specified products, for all the reactions of a particular species, or for various combinations of these. In addition, the bibliography can be searched by author name or combination of names. The database contains in excess of 38,000 separate reaction records for over 11,700 distinct reactant pairs. These data have been abstracted from over 12,000 papers with literature coverage through early 2000. Rate constant records for a specified reaction are found by searching the Reaction Database. All rate constant records for that reaction are returned, with a link to 'Details' on that record. Each rate constant record contains the following information (as available):
a) Reactants and, if defined, reaction products;
b) Rate parameters: A, n, Ea/R, where k = A (T/298)^n exp[-(Ea/R)/T] and T is the temperature in kelvins;
c) Uncertainty in A, n, and Ea/R, if reported;
d) Temperature range of experiment or temperature range of validity of a review or theoretical paper;
e) Pressure range and bulk gas of the experiment;
f) Data type of the record (e.g., experimental, relative rate measurement, theoretical calculation, or modeling result). If the result is a relative rate measurement, then the reaction to which the rate is relative is also given;
g) Experimental procedure, including separate fields for the description of the apparatus, the time resolution of the experiment, and the excitation technique.
A majority of contemporary chemical kinetics methods are represented. The Kinetics Database is being expanded to include other resources for the convenience of the users. Presently this includes direct links to the corresponding NIST WebBook page for all substances for which such a link is possible. This is indicated by underlining and highlighting the species. The WebBook provides thermodynamic, spectral, and other data on the species. Note that the link to the WebBook is opened as a new frame in your browser.
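As an illustration of how the rate parameters in a record can be used, the following minimal sketch evaluates the modified Arrhenius form k = A (T/298)^n exp[-(Ea/R)/T]. The function name and the parameter values are hypothetical and are not taken from any database record; k carries whatever units A has.

```python
import math


def rate_constant(A, n, Ea_over_R, T):
    """Evaluate k = A * (T/298)**n * exp(-(Ea/R)/T).

    A          pre-exponential factor (k is returned in the same units)
    n          temperature exponent
    Ea_over_R  activation energy divided by the gas constant, in kelvins
    T          temperature in kelvins
    """
    if T <= 0:
        raise ValueError("temperature must be positive (kelvins)")
    return A * (T / 298.0) ** n * math.exp(-Ea_over_R / T)


# Hypothetical parameters chosen only to show the temperature dependence.
k_298 = rate_constant(A=1.0e-11, n=0.0, Ea_over_R=1000.0, T=298.0)
k_500 = rate_constant(A=1.0e-11, n=0.0, Ea_over_R=1000.0, T=500.0)
print(f"k(298 K) = {k_298:.3e}, k(500 K) = {k_500:.3e}")
```

A record's stated temperature range still applies: evaluating the expression outside that range is an extrapolation.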
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the collection associated with list S73 MetXBioDB Metabolite Reaction Database from BioTransformer on the NORMAN Suspect List Exchange.
https://www.norman-network.com/nds/SLE/
This dataset is extracted from the database behind BioTransformer (http://biotransformer.ca/) by Yannick Djoumbou-Feunang, David S. Wishart and colleagues, for addition to the PubChem Transformations section. Change logs and version tracking are available at the ECI GitLab site.
Please cite the BioTransformer article when using this set: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0324-5
NOTE: This deposition is a work in progress ...
Change log: 13 Oct: added InChIKey file. 16 Oct: updated substances with missing CIDs and transformations. 5/11 many bug fixes finally committed, added DTXSIDs. 22/6/2023 adjusted one CID that changed upon PubChem standardization. 15 Nov 2023: fixed typo in reaction description. 26 Feb 2024: corrected name for CID 65564. 6 Aug 2024: fixed many triazine synonyms.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This is a dataset generated by Yet Another Reaction Program (YARP), including pyGSM reaction pathways, Gaussian transition state optimization files, IRC calculation results, etc.
Four systems are provided. 'KHP network' involves reactions of gamma-ketohydroperoxide and its 12 intended products. 'Z-benchmark' involves reactants obtained from the Zimmerman testing set.
KEGG LIGAND contains knowledge of chemical substances and reactions that are relevant to life. It is a composite database consisting of COMPOUND, GLYCAN, REACTION, RPAIR, and ENZYME databases, whose entries are identified by C, G, R, RP, and EC numbers, respectively. ENZYME is derived from the IUBMB/IUPAC Enzyme Nomenclature, but the others are internally developed and maintained. The primary database of KEGG LIGAND is a relational database with the KegDraw interface, which is used to generate the secondary (flat file) database for DBGET.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Computer-assisted synthetic planning has seen major advancements that stem from the availability of large reaction databases and artificial intelligence methodologies. SynRoute is a new retrosynthetic planning software tool that uses a relatively small number of general reaction templates, currently 263, along with a literature-based reaction database to find short, practical synthetic routes for target compounds. For each reaction template, a machine learning classifier is trained using data from the Pistachio reaction database to predict whether new computer-generated reactions based on the template are likely to work experimentally in the laboratory. This reaction generation methodology is used together with a vectorized Dijkstra-like search of top-scoring routes organized by synthetic strategies for easy browsing by a synthetic chemist. SynRoute was able to find routes for an average of 83% of compounds based on selection of random subsets of drug-like compounds from the ChEMBL database. Laboratory evaluation of 12 routes produced by SynRoute, to synthesize compounds not from the previous random subsets, demonstrated the ability to produce feasible overall synthetic strategies for all compounds evaluated.
The NDRL/NIST Solution Kinetics Database contains data on rate constants for solution-phase chemical reactions. The database is designed to be searched by reactants, products, solvents, or any combination of these. In addition, the bibliography may be searched by author name, title words, journal, page(s), and/or year. This is not the same database as the one at Notre Dame, although both databases share a common data source.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This reaction database was generated along with the manuscript "Comprehensive exploration of graphically defined reaction spaces".
RGD1CHNO_AMsmiles.csv contains atom-mapped SMILES, activation energies, and enthalpies of formation for each reaction.
RGD1_CHNO.h5 contains the geometry information and can be iterated over with a Python script from GitHub (https://github.com/zhaoqy1996/RGD1/parse_data.py).
DFT_reaction_info.csv is supplied to reproduce figures in the article.
RandP_smiles.txt is a dictionary that maps the reactant and product SMILES that appear in RGD1_CHNO.h5 to a molecule index (molX).
RGD1_RPs.h5 provides xtb- and DFT-optimized geometries of each individual reactant and product molecule.
3D ML models can be trained by combining RGD1_RPs.h5, RGD1_CHNO.h5, and RandP_smiles.txt (see https://github.com/zhaoqy1996/RGD1 for more details).
IMPORTANT: We provided an UPDATED VERSION of the RGD1 dataset on Apr 24, 2023. The initially posted version of the dataset reported swapped activation energies for ~24% of the forward/reverse reactions, all of which were corrected in this updated version.
Manually annotated reaction database where all reaction participants (reactants and products) are linked to the ChEBI database (Chemical Entities of Biological Interest) which provides detailed information about structure, formula and charge. Rhea provides built-in validations that ensure both elemental and charge balance of the reactions. The database has been populated with the reactions found in the Enzyme Commission (EC) list (and in the IntEnz and ENZYME databases), extending it with additional known reactions of biological interest. While the main focus of Rhea is enzyme-catalyzed reactions, other biochemical reactions are also included. Rhea is a manually annotated resource and it provides: stable reaction identifiers for each of its reactions; directionality information if the physiological direction of the reaction is known; the possibility to link several reactions together to form overall reactions; extensive cross-references to other resources including enzyme-catalyzed and other metabolic reactions, such as the EC list (in IntEnz), KEGG, MetaCyc and UniPathway; and chemical substructure and similarity searches on compounds in Rhea.
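The description does not say how the elemental and charge balance validation is implemented. As a rough sketch of what such a check involves, the example below compares total atom counts and total charge on the two sides of a reaction. The function names and the (coefficient, formula, charge) representation are hypothetical, and the formula parser is an assumption that only handles simple formulas without brackets; this is not Rhea's code.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a simple formula such as 'H2O' or 'C6H12O6' (no brackets)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_totals(side):
    """Sum atoms and charge over (coefficient, formula, charge) triples."""
    atoms, charge = Counter(), 0
    for coeff, formula, q in side:
        for element, count in parse_formula(formula).items():
            atoms[element] += coeff * count
        charge += coeff * q
    return atoms, charge


def is_balanced(left, right):
    """True when both sides carry the same atoms and the same net charge."""
    return side_totals(left) == side_totals(right)


# Water dissociation, H2O <=> H+ + OH-, is balanced in atoms and charge.
water = [(1, "H2O", 0)]
ions = [(1, "H", +1), (1, "HO", -1)]
print(is_balanced(water, ions))

# H2 + O2 -> H2O is missing one oxygen atom on the right-hand side.
print(is_balanced([(1, "H2", 0), (1, "O2", 0)], [(1, "H2O", 0)]))
```

Checking charge separately from atoms matters because a reaction can conserve every element and still gain or lose net charge, for example when a proton is omitted.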
A database based on the SABIO relational database that contains information about biochemical reactions, their kinetic equations with their parameters, and the experimental conditions under which these parameters were measured. It aims to support modelers in the setting-up of models of biochemical networks, but it is also useful for experimentalists or researchers with interest in biochemical reactions and their kinetics. SABIO-RK contains and merges information about reactions such as reactants and modifiers, organism, tissue and cellular location, as well as the kinetic properties of the reactions. The type of the kinetic mechanism, modes of inhibition or activation, and corresponding rate equations are presented together with their parameters and measured values, specifying the experimental conditions under which these were determined. Links to other databases are provided for users to gather further information and to refer to the original publication. Information about reactions and their kinetic data can be exported to an SBML file. The reaction kinetics data are obtained by manual extraction from literature sources and curated.
Herein, we disclose the total synthesis of honokiol in six steps with an overall yield of 66%. Two distinct routes were explored, with the key steps being highly efficient and selective cross-couplings of commercially available phenols to construct the main biphenolic backbone. The routes employ inexpensive reagents and are scalable, high-yielding processes. The experimental procedures are reported in the conventional narrative format and in two machine-readable formats.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data repository was created by Rasmus Fromsejer (Technical University of Denmark) to supplement the research paper "Accurate Formation Enthalpies of Solids Using Reaction Networks" by Rasmus Fromsejer, Bjørn Maribo-Mogensen, Georgios Kontogeorgis and Xiaodong Liang in npj Computational Materials.
The data repository consists of:
a directory with results and databases in .csv format, excluding detailed information about the reactions used in the reaction network predictions (.csv/)
a directory with the results in gzipped .pkl format, including detailed information about the reactions used in the reaction network predictions (.pkl/)
Refer to the READMEs in the aforementioned directories for detailed information about the directories, files and the file contents.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Reactions extracted by text-mining from United States patents published between 1976 and September 2016. The reactions are available as CML or reaction SMILES. Note that the reaction SMILES are derived from the CML. The files can be unzipped using a program like 7-Zip.
The reactions were extracted using an enhanced version of the reaction extraction code described in https://www.repository.cam.ac.uk/handle/1810/244727 with LeadMine (https://www.nextmovesoftware.com/leadmine.html) used for chemical entity recognition.
General tips:
Duplicate reactions are frequent because the same or highly similar text occurs in multiple patents. This is especially true when combining the applications and grants datasets, since many reactions from applications later appear in patent grants.
Paragraph numbers are only present for 2005+ patent grants and patent applications.
Multiple reactions can be extracted from the same paragraph.
Atom maps in the reaction SMILES are derived using EPAM's Indigo toolkit. While typically correct, the atom maps are wrong in many cases and hence should not be entirely relied on.
The reactions have been filtered to remove common cases of incorrectly extracted reactions:
All product atoms must be accounted for by the atom-mapping
The product(s) must have >8 heavy atoms
The product must not be charged if it is a single component
The number of products must be
BIOINF595 W2025 Bioactivity Project Dataset. Author: Carl Mauro. The reaction data used in this project is from the following publication, accessed through the Open Reaction Database (https://open-reaction-database.org/). The original data is used under an MIT license, and is under copyright by the original authors (see LICENSE.txt file for details). Ahneman, D. T.; Estrada, J. G.; Lin, S.; Dreher, S. D.; Doyle, A. G. Predicting Reaction Performance in C–N Cross-Coupling Using Machine… See the full description on the dataset page: https://huggingface.co/datasets/cmmauro/ORD_Ahneman_2018.
Multi-species reference database. Comprehensive plant biochemical pathway database, containing curated information from literature and computational analyses about genes, enzymes, compounds, reactions, and pathways involved in primary and secondary metabolism.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository presents approximately 750 million atom-mapped reaction SMILES. Reactions are generated by applying templates from the Reaction Mechanism Generator (RMG) database to a subset of the species from GDB11. Thus, we refer to this dataset as RMG-DB-11, i.e., the Reaction Mechanism Generator Database whose species contain up to 11 heavy atoms. All SMILES have been canonicalized by RDKit. All reactions are labeled with their corresponding RMG template.
This data serves as a crucial starting point for quantitative predictive chemistry. Many methods that search for transition state structures require atom-mapped SMILES, which this repository provides. This data is also well-suited for unsupervised pre-training of various machine learning models.
To parse the data with Python, start with import pandas as pd. Reactions with 1-8 heavy atoms can be parsed with pd.read_csv(<file name>). Reactions with 9 heavy atoms can be parsed with pd.read_pickle(<file name>, compression='zip'). The file names below include the word "zip" as a hint to use the compression argument. Because of the large number of reactions with 10 and 11 heavy atoms, these are split into smaller chunks: first untar the file with tar -xvf <file name> to obtain several zipped pickle files, each of which can be parsed in the same way as the file for 9 heavy atoms.
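The untar-then-read steps described above can also be done from Python. The following is a minimal sketch, assuming the archive contains only the zipped pickle chunks; the function names and any paths passed to them are hypothetical, and pandas is imported lazily so that the extraction step works without it.

```python
import tarfile
from pathlib import Path


def extract_chunks(tar_path, out_dir):
    """Untar an archive of chunk files and return the extracted file paths, sorted."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path) as archive:
        if hasattr(tarfile, "data_filter"):
            # Safer extraction on Python versions that support extraction filters.
            archive.extractall(out, filter="data")
        else:
            archive.extractall(out)
    return sorted(p for p in out.rglob("*") if p.is_file())


def iter_chunks(paths):
    """Yield one DataFrame per zipped pickle chunk, so only one chunk is in memory at a time."""
    import pandas as pd  # third-party dependency, needed only for reading

    for path in paths:
        yield pd.read_pickle(path, compression="zip")
```

Iterating over chunks one at a time, rather than concatenating them, is a deliberate choice here: the 10 and 11 heavy atom sets are large, and a single combined DataFrame may not fit in memory. As with any pickle file, only load data from a source you trust.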
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data of the article "General reactive machine learning potentials for CHON elements"
A human pathway database that contains different biological entities and reactions, together with software tools for analysis. The PATIKA Database integrates data from several sources, including Entrez Gene, UniProt, PubChem, GO, IntAct, HPRD, and Reactome. Users can query and access this data using the PATIKAweb query interface. Users can also save their results in XML or export to common picture formats. The BioPAX and SBML exporters can be used as part of this Web service.
The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD) contains information on microbial biocatalytic reactions and biodegradation pathways for primarily xenobiotic chemical compounds. The goal of the UM-BBD is to provide information on microbial enzyme-catalyzed reactions that are important for biotechnology. This collection refers to reaction information.
https://www.nist.gov/open/copyright-fair-use-and-licensing-statements-srd-data-software-and-technical-series-publications#SRD
The NIST Chemistry WebBook provides users with easy access to chemical and physical property data for chemical species through the internet. The data provided in the site are from collections maintained by the NIST Standard Reference Data Program and outside contributors. Data in the WebBook system are organized by chemical species. The WebBook system allows users to search for chemical species by various means. Once the desired species has been identified, the system will display data for the species. Data include thermochemical properties of species and reactions, thermophysical properties of species, and optical, electronic and mass spectra.
https://spdx.org/licenses/etalab-2.0.html
This dataset transforms the many generic molecules in the ChEBI ontology (those whose structures contain undefined R-groups) into fully specified molecular instances. Its purpose is to let cheminformaticians, enzymologists and AI/ML developers treat R-group-bearing ChEBI entries as ordinary molecules, so they can be indexed, searched and used to augment training sets for tasks such as reaction prediction, bio-isosteric replacement and retro-biosynthetic pathway design. The resource itself is a gzip-compressed CSV file produced by a three-stage RDKit-based pipeline that:
1. Extracts every ChEBI SMILES that contains at least one R-group from the Rhea reaction database (release 134).
2. Finds real PubChem compounds whose heavy-atom core matches the ChEBI scaffold, allowing only the R-group position to vary.
3. Filters matches so that the final list comprises molecules differing from the template only at the R-group site, and records their PubChem CIDs for traceability.
Each record therefore links a generic ChEBI structure to the enumerated set of concrete PubChem structures that realise it, along with molecular weight, heavy-atom count and bookkeeping fields that distinguish "exact core" from "core + extra substituent" matches. The dataset's scope encompasses all R-group-containing entries in Rhea/ChEBI that survive atomic filters (≥ 6 heavy atoms and atoms found in living organisms), yielding 12,709 rows and eight columns that summarise: the canonical SMILES, the list of ChEBI IDs sharing that SMILES, computed properties, matched PubChem SMILES/CIDs with and without extra substituents, and provenance metadata.
By expanding more than a thousand otherwise unusable generic templates into over ten thousand explicit molecules, the dataset bridges a long-standing gap between curated biochemical ontologies and large-scale public compound repositories, enabling systematic benchmarking, data augmentation and method development wherever R-groups once forced researchers to discard valuable reaction data.