100+ datasets found
  1. Disconnection Labelled Reaction Data

    • zenodo.org
    bin, csv
    Updated Sep 23, 2022
    Cite
    Amol Thakkar; Alain Vaucher; Andrea Byekwaso; Philippe Schwaller; Alessandra Toniato; Teodoro Laino (2022). Disconnection Labelled Reaction Data [Dataset]. http://doi.org/10.5281/zenodo.7101695
    Available download formats
    bin, csv
    Dataset updated
    Sep 23, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Amol Thakkar; Alain Vaucher; Andrea Byekwaso; Philippe Schwaller; Alessandra Toniato; Teodoro Laino
    Description

    Dataset containing the reaction centers used to train the disconnection-aware model.

  2. NIST Chemical Kinetics Database

    • gimi9.com
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • +3more
    Updated Feb 1, 2002
    Cite
    (2002). NIST Chemical Kinetics Database [Dataset]. https://gimi9.com/dataset/data-gov_nist-chemical-kinetics-database-bee86
    Dataset updated
    Feb 1, 2002
    Description

    The NIST Chemical Kinetics Database includes essentially all reported kinetics results for thermal gas-phase chemical reactions. The database is designed to be searched for kinetics data based on the specific reactants involved, for reactions resulting in specified products, for all the reactions of a particular species, or for various combinations of these. In addition, the bibliography can be searched by author name or combination of names. The database contains in excess of 38,000 separate reaction records for over 11,700 distinct reactant pairs. These data have been abstracted from over 12,000 papers with literature coverage through early 2000.

    Rate constant records for a specified reaction are found by searching the Reaction Database. All rate constant records for that reaction are returned, with a link to 'Details' on that record. Each rate constant record contains the following information (as available):

    a) Reactants and, if defined, reaction products;
    b) Rate parameters A, n, and Ea/R, where k = A * (T/298)**n * exp[-(Ea/R)/T] and T is the temperature in kelvin;
    c) Uncertainty in A, n, and Ea/R, if reported;
    d) Temperature range of the experiment, or temperature range of validity of a review or theoretical paper;
    e) Pressure range and bulk gas of the experiment;
    f) Data type of the record (i.e., experimental, relative rate measurement, theoretical calculation, modeling result, etc.). If the result is a relative rate measurement, then the reaction to which the rate is relative is also given;
    g) Experimental procedure, including separate fields for the description of the apparatus, the time resolution of the experiment, and the excitation technique.

    A majority of contemporary chemical kinetics methods are represented. The Kinetics Database is being expanded to include other resources for the convenience of the users. Presently this includes direct links to the corresponding NIST WebBook page for all substances for which such a link is possible. This is indicated by underlining and highlighting the species. The WebBook provides thermodynamic, spectral, and other data on the species. Note that the link to the WebBook is opened as a new frame in your browser.
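    The rate expression quoted in the description, k = A * (T/298)**n * exp[-(Ea/R)/T], can be evaluated directly from a record's three parameters. The following is a minimal sketch, not code from NIST; the function name and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def rate_constant(A: float, n: float, Ea_over_R: float, T: float) -> float:
    """Evaluate k = A * (T/298)**n * exp(-(Ea/R)/T), with T in kelvin.

    A carries the units of k; Ea_over_R is the activation energy divided
    by the gas constant, so it is itself a temperature in kelvin.
    """
    if T <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return A * (T / 298.0) ** n * math.exp(-Ea_over_R / T)


# Hypothetical parameters, not taken from any database record.
k_298 = rate_constant(A=1.0e-11, n=0.0, Ea_over_R=0.0, T=298.0)
k_600 = rate_constant(A=1.0e-11, n=2.0, Ea_over_R=1000.0, T=600.0)
```

    With n = 0 and Ea/R = 0 the expression reduces to k = A, which is a quick sanity check on any implementation.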

  3. Yield curation USPTO rsmi/csv datasets

    • figshare.com
    zip
    Updated May 31, 2023
    Cite
    Alexander Minidis (2023). Yield curation USPTO rsmi/csv datasets [Dataset]. http://doi.org/10.6084/m9.figshare.14414039.v1
    Available download formats
    zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    figshare
    Authors
    Alexander Minidis
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    In 2017 Lowe shared curated and published USPTO-based chemical reaction datasets in csv format. Based on this, Schwaller et al. published curated reaction SMILES (they in turn used the curated set disclosed by Jin and coworkers). Both versions have the drawback of containing only partially curated yields. In those datasets, two columns are available, TextMinedYield and Calculated yield. Many entries there contain no numbers, or only partial or incorrect ones. For certain forms of reaction analysis focusing on yield as the only available correlation, that information becomes essentially useless since there is no correlation to reaction conditions (unless one would data-mine the CML files or original XML).

    By correcting and merging the yield into a new column, followed by eliminating faulty entries, the noise in the data set is reduced. The new datasets are reduced by nearly 50%.

    Attached are two kinds of datasets (of each, Lowe and Schwaller):
    • A "cropped" version, containing only the reaction SMILES and the curated yield (and an added ID), and only entries with valid yields. Everything else was filtered out.
    • A "full" version, including the curated yields and all original input columns and entries (no filtration). The latter might come in handy for other applications where one doesn't agree with the applied removal of invalid entries, or to apply further curation.

    More details can be found on Github, which contains the Python scripts used to procure the attached datasets and a Readme file. For the less adept programmer, a graphical workflow based on the open-source data analysis platform Knime(R) is also available. The latter furthermore contains a proof-of-concept reaction splitter (data not included here).
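    The merge-then-filter step described above can be sketched as follows. This is an illustration only, not the author's script: the column keys (`rxn_smiles`, `TextMinedYield`, `CalculatedYield`), the validity range, and the rule "prefer the text-mined value, fall back to the calculated one" are all assumptions; the actual rules are in the Github scripts mentioned in the description.

```python
from typing import Optional


def parse_yield(raw: Optional[str]) -> Optional[float]:
    """Return a percentage in (0, 100], or None if the field is unusable."""
    if raw is None:
        return None
    text = raw.strip().rstrip("%").strip()
    try:
        value = float(text)
    except ValueError:
        return None
    # Assumed validity range; NaN also fails this comparison.
    return value if 0.0 < value <= 100.0 else None


def curated_yield(text_mined: Optional[str], calculated: Optional[str]) -> Optional[float]:
    """Merge the two yield fields (assumed rule: text-mined first)."""
    first = parse_yield(text_mined)
    return first if first is not None else parse_yield(calculated)


def curate(rows: list) -> list:
    """Build the "cropped" form: ID, reaction SMILES, curated yield."""
    cropped = []
    for i, row in enumerate(rows):
        y = curated_yield(row.get("TextMinedYield"), row.get("CalculatedYield"))
        if y is not None:  # drop entries without a valid yield
            cropped.append({"ID": i, "rxn_smiles": row["rxn_smiles"], "yield": y})
    return cropped


# Hypothetical rows for illustration.
rows = [
    {"rxn_smiles": "CCO>>CC=O", "TextMinedYield": "85%", "CalculatedYield": "84.2"},
    {"rxn_smiles": "CC>>C", "TextMinedYield": "", "CalculatedYield": "120"},
    {"rxn_smiles": "CCN>>CC", "TextMinedYield": None, "CalculatedYield": "40"},
]
cropped = curate(rows)
```

    Keeping the original row index as the ID makes it possible to trace a cropped entry back to the corresponding row of the "full" version.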

  4. ORD_Ahneman_2018

    • huggingface.co
    Cite
    Carl Mauro, ORD_Ahneman_2018 [Dataset]. https://huggingface.co/datasets/cmmauro/ORD_Ahneman_2018
    Authors
    Carl Mauro
    Description

    BIOINF595 W2025 Bioactivity Project. Dataset author: Carl Mauro. The reaction data used in this project is from the following publication, accessed through the Open Reaction Database (https://open-reaction-database.org/). The original data is used under an MIT license, and is under copyright by the original authors (see LICENSE.txt file for details). Ahneman, D. T.; Estrada, J. G.; Lin, S.; Dreher, S. D.; Doyle, A. G. Predicting Reaction Performance in C–N Cross-Coupling Using Machine… See the full description on the dataset page: https://huggingface.co/datasets/cmmauro/ORD_Ahneman_2018.

  5. ORDerly Transformer Models for chemical tasks

    • figshare.com
    bin
    Updated Jul 14, 2025
    Cite
    Daniel Wigh; Joe arrowsmith; Kobi Felton; Alexander Pomberger; Alexei A. Lapkin (2025). ORDerly Transformer Models for chemical tasks [Dataset]. http://doi.org/10.6084/m9.figshare.29552543.v4
    Available download formats
    bin
    Dataset updated
    Jul 14, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Daniel Wigh; Joe arrowsmith; Kobi Felton; Alexander Pomberger; Alexei A. Lapkin
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Transformer models trained on tasks in organic chemistry on ORDerly benchmark datasets.

    • ORDerly_retro: Retrosynthesis prediction (predicting reactants given a desired product).
    • ORDerly_forward_separated: Forward reaction prediction (predict reaction products given reactants, solvents, and agents), with reactants separated by > from the solvents and agents in the reaction string.
    • ORDerly_forward_mixed: Forward reaction prediction (predict reaction products given reactants, solvents, and agents), with reactants, solvents, and agents mixed together in the reaction string.
    • non-uspto-eval: Evaluation of transformer models trained on USPTO data on non-USPTO data available in the Open Reaction Database.

    Full details can be found in our paper: https://chemrxiv.org/engage/chemrxiv/article-details/64ca5d3e4a3f7d0c0d78ca42
    NeurIPS workshop paper: https://openreview.net/forum?id=R8FQMsECIS
    Code: https://github.com/sustainable-processes/orderly
    The supplementary datasets used for this work can be found here: https://doi.org/10.6084/m9.figshare.23502372.v3
    Transformer model architecture is from Molecular Transformer: https://pubs.acs.org/doi/10.1021/acscentsci.9b00576

    Find the results, models, and checkpoints within MolecularTransformer/experiments. Note that the "wandb" folder was deleted since figshare only allows uploads up to 500 files.

    Notes:
    • There's a limit of 500 files in figshare, so I deleted the "docs", "onmt", "OpenNMT_py.egg-info", and "tools" folders from all folders except "ORDerly_retro". I also deleted all wandb-associated files and all checkpoint files.
    • Empty files cannot be uploaded to figshare, so you have to create these yourself, where appropriate (e.g. MolecularTransformer/onmt/tests/_init_.py and non-uspto-eval/MolecularTransformer/experiments/models/ofs_1.pt).

    Feel free to email me, Daniel Wigh, at dsw46@cam.ac.uk or daniel@reactwise.com or my supervisor Alexei A. Lapkin.
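    The difference between the "separated" and "mixed" forward-prediction inputs can be illustrated with a small sketch. It assumes that species are SMILES strings joined with "." (the usual SMILES convention for disconnected components); the function name is hypothetical, and the exact ordering and tokenization used by ORDerly are not specified in the description above, so treat this as an approximation rather than the project's code.

```python
from typing import Sequence


def forward_input(reactants: Sequence[str], agents: Sequence[str], separated: bool) -> str:
    """Build a forward-prediction input string from SMILES lists.

    separated=True  -> reactants, then ">", then solvents/agents
    separated=False -> every species joined into one dot-separated string
    """
    if separated:
        return ".".join(reactants) + ">" + ".".join(agents)
    return ".".join([*reactants, *agents])


# Hypothetical example: an acid-catalysed esterification.
reactants = ["CC(=O)O", "CCO"]
agents = ["OS(=O)(=O)O"]
separated_input = forward_input(reactants, agents, separated=True)
mixed_input = forward_input(reactants, agents, separated=False)
```

    In the separated form the model is told which species are reactants; in the mixed form it has to infer the roles itself, which is the harder variant of the task.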

  6. Data from: AiZynthFinder: a fast, robust and flexible open-source software...

    • figshare.com
    hdf
    Updated May 30, 2023
    Cite
    Samuel Genheden; Esben Jannik Bjerrum; Amol Thakkar; Jean-Louis Reymond; Veronika Chadimova; Ola Engkvist (2023). AiZynthFinder: a fast, robust and flexible open-source software for retrosynthetic planning [Dataset]. http://doi.org/10.6084/m9.figshare.12334577.v1
    Available download formats
    hdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    figshare
    Authors
    Samuel Genheden; Esben Jannik Bjerrum; Amol Thakkar; Jean-Louis Reymond; Veronika Chadimova; Ola Engkvist
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This is public data to be used with the aizynthfinder tool for retrosynthesis planning (https://github.com/MolecularAI/aizynthfinder). There are three files available:
    * full_uspto_03_05_19_rollout_policy.hdf5 - the Keras neural network model used as rollout policy
    * full_uspto_03_05_19_unique_templates.hdf5 - unique template codes that are used together with the policy to generate new precursors in the tree search
    * zinc_stock_17_04_20.hdf - stock file made from the ZINC database on 17 April 2020

  7. Canada Vigilance Adverse Reaction Online Database

    • ouvert.canada.ca
    • open.canada.ca
    • +1more
    html, json, xml, zip
    Updated May 28, 2025
    Cite
    Health Canada (2025). Canada Vigilance Adverse Reaction Online Database [Dataset]. https://ouvert.canada.ca/data/dataset/9cbaef00-b52c-4a70-9fed-d9aa8263ab74
    Available download formats
    json, xml, html, zip
    Dataset updated
    May 28, 2025
    Dataset provided by
    Health Canada (http://www.hc-sc.gc.ca/)
    License

    Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
    License information was derived automatically

    Area covered
    Canada
    Description

    The data extract is a series of compressed ASCII text files of the full data set contained in the Canada Vigilance Adverse Reaction Online Database. It is intended for users who are familiar with database structures and setting up their own queries. Find details on the data structure required for the data file in the Canada Vigilance Adverse Reaction Online Database - Data Structure. In order to use the data, the file must be loaded into an existing database or information system provided by the user.

    The Canada Vigilance Adverse Reaction Online Database contains information about suspected adverse reactions (also known as side effects) to health products, captured from adverse reaction reports submitted to Health Canada by consumers and health professionals, who submit reports voluntarily, as well as by market authorization holders (manufacturers and distributors), who are required to submit reports according to the Food and Drugs Regulations. Information concerning vaccines used for immunization has only been included in the database since January 1, 2011. Indication data has recently been added to the data extract files and the Detailed Adverse Reaction Report. Indication refers to the particular condition for which a health product was taken. For example, diabetes is an indication for insulin. Health products are often authorised for use in treating more than one indication.

    Note: The database cannot be used on its own to evaluate a health product's safety profile. It does not provide conclusive information on the safety of health products, and is not a substitute for medical advice. Should you have an issue of medical concern, consult a qualified health professional.

  8. Data from: Chemical Kinetics Bayesian Inference Toolbox (CKBIT)

    • data.mendeley.com
    Updated May 17, 2021
    + more versions
    Cite
    Maximilian Cohen (2021). Chemical Kinetics Bayesian Inference Toolbox (CKBIT) [Dataset]. http://doi.org/10.17632/tnzk2jvffs.2
    Dataset updated
    May 17, 2021
    Authors
    Maximilian Cohen
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    The robust estimation of chemical kinetic parameters and their associated uncertainty is essential in the field of chemistry and catalysis. The Chemical Kinetics Bayesian Inference Toolbox (CKBIT) is a Python software library introduced to enable users to implement advanced Bayesian inference techniques for kinetic parameter estimation and uncertainty quantification. Leveraging functionalities of other open source Python packages and offering simplified implementation through minimal user-required coding and straightforward Excel input files, CKBIT aspires to make the inference method easily accessible for chemical kinetics. CKBIT provides maximum a posteriori, Markov chain Monte Carlo, and variational inference estimation options. Users may apply these functionalities to estimate activation energies, reaction orders, and pre-exponential terms from chemical reaction data from batch reactors, continuous stirred-tank reactors, and plug flow reactors. The availability of prior distribution specification and the implementation of hierarchical modeling in CKBIT provide a heightened level of accuracy in estimates of kinetic parameters and their uncertainties.

  9. Ames Quantum Chemistry - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Mar 31, 2025
    Cite
    nasa.gov (2025). Ames Quantum Chemistry - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/ames-quantum-chemistry
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Ames Quantum Chemistry Dataset collects electronic structure, reaction kinetics, and dynamics data calculated at Ames Research Center. This includes potential energy curves and surfaces as well as the reaction cross sections and rate coefficients.

  10. ROSETTA REACTION WHEEL ENGINEERING DATA - Dataset - NASA Open Data Portal

    • data.nasa.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    Updated Mar 31, 2025
    + more versions
    Cite
    nasa.gov (2025). ROSETTA REACTION WHEEL ENGINEERING DATA - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/rosetta-reaction-wheel-engineering-data
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Description

    This CODMAC level 3 data set contains the key housekeeping parameters of the four reaction wheels. In particular, it provides information on reaction wheel friction, measured angular momentum, and wheel direction. It covers the period from launch in 2004 through the three Earth flybys and one Mars flyby, the hibernation phases, and the asteroid flybys, and finally covers the Prelanding, comet escort, and Extension phases at the prime target of the mission, comet 67P/Churyumov-Gerasimenko 1 (1969 R1). This version, V1.0, is the first version of this dataset.

  11. Canada Vigilance Adverse Reaction Online Database - Data Structure

    • ouvert.canada.ca
    • open.canada.ca
    html
    Updated Apr 28, 2022
    Cite
    Health Canada (2022). Canada Vigilance Adverse Reaction Online Database - Data Structure [Dataset]. https://ouvert.canada.ca/data/dataset/786f35a3-6170-4419-92c5-5834f071d8bc
    Available download formats
    html
    Dataset updated
    Apr 28, 2022
    Dataset provided by
    Health Canada (http://www.hc-sc.gc.ca/)
    License

    Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
    License information was derived automatically

    Area covered
    Canada
    Description

    Although the Canada Vigilance Adverse Reaction Online Database is a relational database, there is a requirement to provide the data to users in a common format; therefore the data has been extracted into a flat file format. All files are delimited by the dollar sign ($), with fields enclosed in double quotes.
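    A flat file in this layout can be read with Python's standard `csv` module by setting the delimiter to "$" and the quote character to a double quote. The sketch below is a minimal example under that reading of the format; the sample records and field meanings are made up for illustration and do not reflect the real column layout, which is documented in the Data Structure page itself.

```python
import csv
import io


def read_extract(stream) -> list:
    """Parse a dollar-delimited, double-quoted flat file into rows of strings."""
    return list(csv.reader(stream, delimiter="$", quotechar='"'))


# Hypothetical sample; real extract files would be opened with open(path, newline="").
sample = io.StringIO(
    '"000000001"$"Example Product"$"Nausea"\n'
    '"000000002"$"Other $ Product"$"Rash"\n'
)
rows = read_extract(sample)
```

    Because each field is quoted, a dollar sign inside a value (as in the second sample row) does not split the field.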

  12. Data from: Incomplete evidence: the inadequacy of databases in tracing...

    • odgavaprod.ogopendata.com
    • healthdata.gov
    • +1more
    html
    Updated Jul 23, 2025
    Cite
    National Institutes of Health (2025). Incomplete evidence: the inadequacy of databases in tracing published adverse drug reactions in clinical trials [Dataset]. https://odgavaprod.ogopendata.com/dataset/incomplete-evidence-the-inadequacy-of-databases-in-tracing-published-adverse-drug-reactions-in-
    Available download formats
    html
    Dataset updated
    Jul 23, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background
    We would expect information on adverse drug reactions in randomised clinical trials to be easily retrievable from specific searches of electronic databases. However, complete retrieval of such information may not be straightforward, for two reasons. First, not all clinical drug trials provide data on the frequency of adverse effects. Secondly, not all electronic records of trials include terms in the abstract or indexing fields that enable us to select those with adverse effects data. We have determined how often automated search methods, using indexing terms and/or textwords in the title or abstract, would fail to retrieve trials with adverse effects data.

    Methods
    We used a sample set of 107 trials known to report frequencies of adverse drug effects, and measured the proportion that (i) were not assigned the appropriate adverse effects indexing terms in the electronic databases, and (ii) did not contain identifiable adverse effects textwords in the title or abstract.

    Results
    Of the 81 trials with records on both MEDLINE and EMBASE, 25 were not indexed for adverse effects in either database. Twenty-six trials were indexed in one database but not the other. Only 66 of the 107 trials reporting adverse effects data mentioned this in the abstract or title of the paper. Simultaneous use of textword and indexing terms retrieved only 82/107 (77%) papers.

    Conclusions
    Specific search strategies based on adverse effects textwords and indexing terms will fail to identify nearly a quarter of trials that report on the rate of drug adverse effects.
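    The percentages in the Results and Conclusions follow from the reported counts. The short sketch below reworks that arithmetic; the variable names are ours, and the "indexed in both" figure is derived here by subtraction rather than stated in the abstract.

```python
# Counts as reported in the abstract.
total_trials = 107          # trials known to report adverse effect frequencies
retrieved_combined = 82     # found by textwords and indexing terms together
mentioned_in_text = 66      # adverse effects mentioned in title or abstract
in_both_databases = 81      # trials with records on both MEDLINE and EMBASE
indexed_in_neither = 25
indexed_in_one_only = 26

recall_combined = retrieved_combined / total_trials   # about 0.77
missed_fraction = 1 - recall_combined                 # about 0.23, "nearly a quarter"
recall_textword = mentioned_in_text / total_trials    # about 0.62

# Derived, not stated in the abstract: trials indexed in both databases.
indexed_in_both = in_both_databases - indexed_in_neither - indexed_in_one_only
```

    The missed fraction of roughly 23% is what the Conclusions summarise as "nearly a quarter".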
    
  13. Data from: Ring-Opening Reactions of Tetrahydrofuran versus Alkyne...

    • figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Torsten Beweries; Ulrike Jäger-Fiedler; Marc A. Bach; Vladimir V. Burlakov; Perdita Arndt; Wolfgang Baumann; Anke Spannenberg; Uwe Rosenthal (2023). Ring-Opening Reactions of Tetrahydrofuran versus Alkyne Complexation by Group 4 Metallocene Complexes Leading to General Consequences for Synthesis and Reactions of Metallocene Complexes [Dataset]. http://doi.org/10.1021/om0702173.s006
    Available download formats
    txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    ACS Publications
    Authors
    Torsten Beweries; Ulrike Jäger-Fiedler; Marc A. Bach; Vladimir V. Burlakov; Perdita Arndt; Wolfgang Baumann; Anke Spannenberg; Uwe Rosenthal
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    The reduction of certain group 4 metallocene dichlorides by magnesium or lithium in the presence or absence of Me3SiC2SiMe3 in THF or toluene was investigated, giving in the case of titanium the dinuclear Ti(III) complex [rac-(ebthi)Ti(μ-Cl)]2 (1). For zirconium the 1-oxa-2-zirconacyclohexane 2 was formed by ring-opening reaction of rac-(ebthi)Zr(η2-Me3SiC2SiMe3) with THF. As a byproduct from the synthesis of Cp*2Zr(η2-Me3SiC2SiMe3) starting from Cp*2ZrCl2 another 1-oxa-2-zirconacyclohexane (3) was obtained by ring-opening reaction of THF via the dinuclear complex Cp*2Zr(Cl)-(CH2)4O−Zr(Cl)Cp*2 (4). In the case of hafnium the analogous dinuclear complex Cp*2Hf(Cl)−(CH2)4O−Hf(Cl)Cp*2 (5) and 1-oxa-2-hafnacyclohexane (6) were the main products of the reaction, inhibiting the synthesis of Cp*2Hf(η2-Me3SiC2SiMe3) (7). The tendency for ring opening of THF initiated by metallocenes increases in the series Ti, Zr, Hf, thus leading to consequences for the synthesis of metallocene complexes.

  14. Data from: FoamPi: An open-source Raspberry Pi based apparatus for...

    • osf.io
    Updated Sep 11, 2022
    Cite
    Harry Wright (2022). FoamPi: An open-source Raspberry Pi based apparatus for monitoring polyurethane foam reactions. [Dataset]. http://doi.org/10.17605/OSF.IO/U3295
    Dataset updated
    Sep 11, 2022
    Dataset provided by
    Center for Open Science (https://cos.io/)
    Authors
    Harry Wright
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Adiabatic temperature rise (ATR) is an important method for determining isocyanate conversion in polyurethane foam reactions as well as many other exothermic chemical reactions. ATR can be used in conjunction with change in height and mass measurements to gain insight into the blowing and gelling reactions that occur during polyurethane foaming, as well as to give important information on cell morphology. FoamPi is an open-source Raspberry Pi device for monitoring polyurethane foaming reactions. The device monitors temperature rise, change in foam height, and changes in mass during the reaction.

    Three Python scripts are also presented. The first logs raw data during the reaction. The second corrects temperature data such that it can be used in ATR reactions for calculating isocyanate conversion; additionally, this script reduces noise in all the data and removes erroneous readings. The final script extracts important information from the corrected data, such as maximum temperature change and maximum height change, as well as the time to reach these points.

    Commercial examples of such equipment exist; however, their price (£10000) makes these systems inaccessible for many research laboratories. The FoamPi build presented is inexpensive (£350).
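    The kind of post-processing described for the scripts (noise reduction, then extracting the maximum temperature change and the time at which it is reached) might look roughly like the sketch below. This is not the FoamPi code: the moving-average filter, the function names, and the sample readings are assumptions made purely for illustration.

```python
def moving_average(values: list, window: int = 3) -> list:
    """Centered moving average; edge points average the neighbours that exist."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def max_rise(times: list, temps: list) -> tuple:
    """Return (largest rise above the first reading, time at which it occurs)."""
    baseline = temps[0]
    rises = [t - baseline for t in temps]
    peak = max(range(len(rises)), key=rises.__getitem__)
    return rises[peak], times[peak]


# Hypothetical readings: time in seconds, temperature in degrees Celsius.
times = [0, 10, 20, 30, 40, 50]
temps = [25.0, 40.0, 80.0, 120.0, 118.0, 110.0]

rise, t_peak = max_rise(times, temps)                   # raw data
smooth_rise, smooth_t_peak = max_rise(times, moving_average(temps))
```

    Smoothing before extraction makes the reported peak less sensitive to a single noisy reading, at the cost of slightly lowering and shifting the peak.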

  15. Strengthening of Calcite Assemblages through Chemical Complexation Reaction...

    • data.openei.org
    • catalog.data.gov
    image
    Updated Feb 4, 2021
    Cite
    Robert Choens; Jennifer Wilson; Anastasia Ilgen (2021). Strengthening of Calcite Assemblages through Chemical Complexation Reaction - Experimental Data [Dataset]. https://data.openei.org/submissions/4135
    Available download formats
    image
    Dataset updated
    Feb 4, 2021
    Dataset provided by
    Open Energy Data Initiative (OEDI)
    Sandia National Laboratories
    USDOE Office of Energy Efficiency and Renewable Energy (EERE), Multiple Programs (EE)
    Authors
    Robert Choens; Jennifer Wilson; Anastasia Ilgen
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Experimental data for the manuscript "Strengthening of Calcite Assemblages through Chemical Complexation Reaction" by R. C. Choens, J. Wilson, and A. G. Ilgen; Sandia National Laboratories. The data includes scanning electron microscope images of various calcite assemblages along with experimental data.

  16. OpenREACT-CHON-EFH — Open REaction Dataset of Atomic ConfiguraTions...

    • figshare.com
    hdf
    Updated May 29, 2025
    Cite
    Austin Rodriguez; Justin S. Smith; Jose L. Mendoza-Cortes (2025). OpenREACT-CHON-EFH — Open REaction Dataset of Atomic ConfiguraTions comprising C, H, O, N with Energies, Forces, and Hessians [Dataset]. http://doi.org/10.6084/m9.figshare.29189858.v4
    Available download formats
    hdf
    Dataset updated
    May 29, 2025
    Dataset provided by
    figshare
    Authors
    Austin Rodriguez; Justin S. Smith; Jose L. Mendoza-Cortes
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    These datasets were used in the training and testing of Machine Learning Interatomic Potentials (MLIPs) as part of the work represented in the article titled "Does Hessian Data Improve the Performance of Machine Learning Potentials?".

    RTP Dataset (Reactant–Transition State–Product Dataset):
    The RTP dataset forms the core training and evaluation set and consists of 35,087 molecular geometries sampled from 11,961 unique elementary reactions. For each reaction, three critical geometries are included: the optimized reactant, transition state (TS), and product. Each geometry is labeled with its corresponding DFT-computed potential energy, atomic forces, and Hessian matrix, calculated at the wb97xd/6-31g(d) level of theory. This dataset represents stationary points (critical points) on the potential energy surface and serves as the foundation for training the MLIPs to reproduce energies, gradients, and curvatures.

    IRC Dataset (Intrinsic Reaction Coordinate Dataset):
    To assess the extrapolation performance of the trained MLIPs along continuous reaction pathways, a dataset of 34,248 geometries was compiled from 600 Intrinsic Reaction Coordinate (IRC) paths, each corresponding to a distinct elementary reaction in the RTP dataset. These geometries were obtained by following the minimum energy path (MEP) from the transition state to both reactant and product wells using quantum chemistry calculations at the wb97xd/6-31g(d) level of theory. While these geometries are not explicitly used in training, they provide a rigorous benchmark for evaluating the ability of MLIPs to generalize beyond training data and accurately model transition state connectivity and reaction dynamics.

    NMS Dataset (Normal Mode Sampling Dataset):
    To evaluate MLIP robustness on off-equilibrium, perturbed structures, 62,527 geometries were generated via Normal Mode Sampling (NMS). These structures are derived by displacing intermediate IRC geometries along their vibrational modes with random amplitudes, simulating thermal fluctuations and non-equilibrium distortions. The properties of these perturbed structures were calculated at the wb97xd/6-31g(d) level of theory. This dataset allows for testing the model's stability and accuracy in more realistic, noisy molecular environments as encountered in molecular dynamics simulations or under experimental conditions.

  17. Data extracts from the Canada Vigilance adverse reaction online database

    • open.canada.ca
    • ouvert.canada.ca
    html
    Updated Mar 30, 2024
    Cite
    Health Canada (2024). Data extracts from the Canada Vigilance adverse reaction online database [Dataset]. https://open.canada.ca/data/info/29f39ab3-24fc-4b0a-90b2-3c9f97a88158
    Available download formats
    html
    Dataset updated
    Mar 30, 2024
    Dataset provided by
    Health Canada (http://www.hc-sc.gc.ca/)
    License

    Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
    License information was derived automatically

    Area covered
    Canada
    Description

    The data set is updated on a monthly basis and currently covers the following time period: 1965 to 2023-10-31. The data extract is a series of compressed ASCII text files of the full data set contained in the Canada Vigilance Adverse Reaction Online Database. It is intended for users who are familiar with database structures and setting up their own queries.

  18. Canada Vigilance adverse reaction online database

    • open.canada.ca
    • datasets.ai
    html
    Updated Dec 2, 2024
    Cite
    Health Canada (2024). Canada Vigilance adverse reaction online database [Dataset]. https://open.canada.ca/data/info/98cad9a3-5b61-4c1e-965d-531804542560
    Available download formats
    html
    Dataset updated
    Dec 2, 2024
    Dataset provided by
    Health Canada (http://www.hc-sc.gc.ca/)
    License

    Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
    License information was derived automatically

    Area covered
    Canada
    Description

    The Canada Vigilance Adverse Reaction Online Database contains information about suspected adverse reactions (also known as side effects) to health products.

  19. Data from: NIST Chemistry WebBook - SRD 69

    • webbook.nist.gov
    • data.nist.gov
    • +3more
    Updated Oct 9, 2023
    Cite
    National Institute of Standards and Technology (2023). NIST Chemistry WebBook - SRD 69 [Dataset]. http://doi.org/10.18434/T4D303
    Dataset updated
    Oct 9, 2023
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    License

    https://www.nist.gov/open/copyright-fair-use-and-licensing-statements-srd-data-software-and-technical-series-publications#SRD

    Description

    The NIST Chemistry WebBook provides users with easy access to chemical and physical property data for chemical species through the internet. The data provided in the site are from collections maintained by the NIST Standard Reference Data Program and outside contributors. Data in the WebBook system are organized by chemical species. The WebBook system allows users to search for chemical species by various means. Once the desired species has been identified, the system will display data for the species. Data include thermochemical properties of species and reactions, thermophysical properties of species, and optical, electronic and mass spectra.

  20. Data from: FOUNTAIN: A JAVA open-source package to assist large sequencing...

    • catalog.data.gov
    Updated Jul 24, 2025
    Cite
    National Institutes of Health (2025). FOUNTAIN: A JAVA open-source package to assist large sequencing projects [Dataset]. https://catalog.data.gov/dataset/fountain-a-java-open-source-package-to-assist-large-sequencing-projects
    Dataset updated
    Jul 24, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background: Better automation, lower cost per reaction, and a heightened interest in comparative genomics have led to a dramatic increase in DNA sequencing activities. Although the large sequencing projects of specialized centers are supported by in-house bioinformatics groups, many smaller laboratories face difficulties managing the appropriate processing and storage of their sequencing output. The challenges include documentation of clones, templates and sequencing reactions, and the storage, annotation and analysis of the large number of generated sequences.

    Results: We describe here a new program, named FOUNTAIN, for the management of large sequencing projects. FOUNTAIN uses the JAVA computer language and stores its data in a relational database. Starting with a collection of sequencing objects (clones), the program generates and stores information related to the different stages of the sequencing project, using a web browser interface for user input. The generated sequences are subsequently imported and annotated based on BLAST searches against the public databases. In addition, simple algorithms to cluster sequences and determine putative polymorphic positions are implemented.

    Conclusions: A simple but flexible and scalable software package is presented to facilitate data generation and storage for large sequencing projects. FOUNTAIN is open source and largely platform and database independent, and we hope it will be improved and extended in a community effort.

