71 datasets found
  1. QM9 Dataset

    • paperswithcode.com
    Updated Nov 25, 2021
    Cite
    (2021). QM9 Dataset [Dataset]. https://paperswithcode.com/dataset/qm9
    Explore at:
    Dataset updated
    Nov 25, 2021
    Description

    QM9 provides quantum chemical properties (at DFT level) for a relevant, consistent, and comprehensive chemical space of small organic molecules. This database may serve the benchmarking of existing methods, development of new methods, such as hybrid quantum mechanics/machine learning, and systematic identification of structure-property relationships.

  2. qm9

    • tensorflow.org
    Updated Dec 11, 2024
    Cite
    (2024). qm9 [Dataset]. http://doi.org/10.6084/m9.figshare.c.978904.v5
    Explore at:
    Dataset updated
    Dec 11, 2024
    Description

    QM9 consists of computed geometric, energetic, electronic, and thermodynamic properties for 134k stable small organic molecules made up of C, H, O, N, and F. As usual, we remove the uncharacterized molecules and provide the remaining 130,831.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('qm9', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

  3. Quantum Machine 9, aka QM9

    • kaggle.com
    zip
    Updated Jun 12, 2019
    Cite
    nosound (2019). Quantum Machine 9, aka QM9 [Dataset]. https://www.kaggle.com/zaharch/quantum-machine-9-aka-qm9
    Explore at:
    Available download formats: zip (282,580,282 bytes)
    Dataset updated
    Jun 12, 2019
    Authors
    nosound
    Description

    downloaded from: http://quantum-machine.org/datasets/

    Abstract

    Computational de novo design of new drugs and materials requires rigorous and unbiased exploration of chemical compound space. However, large uncharted territories persist due to its size scaling combinatorially with molecular size. We report computed geometric, energetic, electronic, and thermodynamic properties for 134k stable small organic molecules made up of CHONF. These molecules correspond to the subset of all 133,885 species with up to nine heavy atoms (CONF) out of the GDB-17 chemical universe of 166 billion organic molecules. We report geometries minimal in energy, corresponding harmonic frequencies, dipole moments, polarizabilities, along with energies, enthalpies, and free energies of atomization. All properties were calculated at the B3LYP/6-31G(2df,p) level of quantum chemistry. Furthermore, for the predominant stoichiometry, C7H10O2, there are 6,095 constitutional isomers among the 134k molecules. We report energies, enthalpies, and free energies of atomization at the more accurate G4MP2 level of theory for all of them. As such, this data set provides quantum chemical properties for a relevant, consistent, and comprehensive chemical space of small organic molecules. This database may serve the benchmarking of existing methods, development of new methods, such as hybrid quantum mechanics/machine learning, and systematic identification of structure-property relationships.

    Download: Available via figshare.

    How to cite: When using this dataset, please make sure to cite the following two papers:

    L. Ruddigkeit, R. van Deursen, L. C. Blum, J.-L. Reymond, Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17, J. Chem. Inf. Model. 52, 2864–2875, 2012.

    R. Ramakrishnan, P. O. Dral, M. Rupp, O. A. von Lilienfeld, Quantum chemistry structures and properties of 134 kilo molecules, Scientific Data 1, 140022, 2014.

  4. QM9-Dataset

    • huggingface.co
    Updated Oct 1, 2024
    Cite
    Reza Hemmati (2024). QM9-Dataset [Dataset]. https://huggingface.co/datasets/HR-machine/QM9-Dataset
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 1, 2024
    Authors
    Reza Hemmati
    Description

    HR-machine/QM9-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. Revised QM9 dataset (revQM9)

    • zenodo.org
    bin
    Updated Feb 9, 2025
    Cite
    Danish Khan; Anatole von Lilienfeld (2025). Revised QM9 dataset (revQM9) [Dataset]. http://doi.org/10.5281/zenodo.10689884
    Explore at:
    Available download formats: bin
    Dataset updated
    Feb 9, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Danish Khan; Anatole von Lilienfeld
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Revised QM9 dataset with properties calculated using aPBE0 in the cc-pVTZ basis set.

    The atomic coordinates, atomic numbers, chemical symbols, total energies, atomization energies, MO energies, HOMO energies, LUMO energies, and dipole moment norms are stored in the arrays "coords", "charges", "elements", "energies", "atomization", "moenergies", "homo", "lumo", and "dipole", respectively.
    Density matrices will be uploaded soon.

    Usage example:

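    import numpy as np

    # allow_pickle=True is required if the archive stores object arrays; only use it on trusted files.
    data = np.load('revQM9.npz', allow_pickle=True)
    coords, q, elems, energies = data['coords'], data['charges'], data['elements'], data['energies']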
  6. Accurate GW frontier orbital energies of 134 kilo molecules of the QM9...

    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    Artem Fediai; Patrick Reiser; Jorge Enrique Olivares Peña; Pascal Friederich; Wolfgang Wenzel (2023). Accurate GW frontier orbital energies of 134 kilo molecules of the QM9 dataset. [Dataset]. http://doi.org/10.6084/m9.figshare.21610077.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    figshare
    Authors
    Artem Fediai; Patrick Reiser; Jorge Enrique Olivares Peña; Pascal Friederich; Wolfgang Wenzel
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A dataset of HOMO/LUMO energies of the QM9 dataset computed at GW level of theory.

  7. Results of Quantum Chemical and Machine Learning Computations for Molecules...

    • purl.stanford.edu
    Updated Aug 6, 2019
    Cite
    Sinitskiy, Anton V.; Pande, Vijay S. (2019). Results of Quantum Chemical and Machine Learning Computations for Molecules in the QM9 Database [Dataset]. https://purl.stanford.edu/kf921gd3855
    Explore at:
    Dataset updated
    Aug 6, 2019
    Authors
    Sinitskiy, Anton V.; Pande, Vijay S.
    License

    Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    Two types of approaches to modeling molecular systems have demonstrated high practical efficiency. Density functional theory (DFT), the most widely used quantum chemical method, is a physical approach predicting energies and electron densities of molecules. Recently, numerous papers on machine learning (ML) of molecular properties have also been published. ML models greatly outperform DFT in terms of computational costs, and may even reach comparable accuracy, but they are missing physicality - a direct link to Quantum Physics - which limits their applicability. Here, we propose an approach that combines the strong sides of DFT and ML, namely, physicality and low computational cost. We derive general equations for exact electron densities and energies that can naturally guide applications of ML in Quantum Chemistry. Based on these equations, we build a deep neural network that can compute electron densities and energies of a wide range of organic molecules not only much faster, but also closer to exact physical values than current versions of DFT. In particular, we reached a mean absolute error in energies of molecules with up to eight non-hydrogen atoms as low as 0.9 kcal/mol relative to CCSD(T) values, noticeably lower than those of DFT (approaching ~2 kcal/mol) and ML (~1.5 kcal/mol) methods. A simultaneous improvement in the accuracy of predictions of electron densities and energies suggests that the proposed approach describes the physics of molecules better than DFT functionals developed by "human learning" earlier. Thus, physics-based ML offers exciting opportunities for modeling, with high-theory-level quantum chemical accuracy, of much larger molecular systems than currently possible.

  8. Taxnonomic classifications for all structure in the QM9 dataset

    • zenodo.org
    application/gzip, png
    Updated Jul 16, 2024
    Cite
    Evan Komp (2024). Taxnonomic classifications for all structure in the QM9 dataset [Dataset]. http://doi.org/10.5281/zenodo.6498857
    Explore at:
    Available download formats: application/gzip, png
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Evan Komp
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The classification of molecules according to ClassyFire [1] for the QM9 dataset [2].

    The QM9 dataset is a set of nearly 140k organic molecules with no more than 9 C, N, O, and F atoms optimized to a stable structure with DFT.

    ClassyFire is a tool and taxonomic library for the labeling of molecules.

    1. Djoumbou Feunang, Y. et al. ClassyFire: automated chemical classification with a comprehensive, computable taxonomy. J. Cheminform. 8, 1–20 (2016).

    2. Ramakrishnan, R., Dral, P. O., Rupp, M. & Von Lilienfeld, O. A. Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1, 1–7 (2014).

    The data directory ('QM9_jsons_classified.tar.gz') contains a `json` file for each structure in the QM9 dataset. The name of the file is the same identifier as from QM9. Data fields include:

    - `cf_alternative_parents` : classifications describing the compound that do not fall in the given ancestry

    - `cf_ancestors` : classes along the taxonomic branch for the structure

    - `cf_class` : ClassyFire given class

    - `cf_superclass` : ClassyFire given super class

    - `cf_subclass` : ClassyFire given subclass

    - `cf_direct_parent` : Class one level above this structure on the taxonomic branch

    - `cf_description` : Exposition on the given class

    - `cf_identifier` : identifier for the structure in the ClassyFire database

    - `cf_intermediate_nodes` : classes connecting branches on taxonomic tree

    - `cf_kingdom` : ClassyFire given kingdom

    - `cf_molecular_framework` : describes aromaticity and number of cycles

    - `cf_predicted_chebi_terms` : terms describing the molecule in the ChEBI framework

    - `cf_predicted_lipidmaps_terms` : terms describing the molecule in LIPID MAPS framework

    - `cf_smiles` : smiles string given by ClassyFire

    - `cf_substituents` : substituent groups in the structure

    Many fields contain subfields, seen in the example below for molecule with QM9 id 000123:

    {"cf_alternative_parents":[{"name":"Dialkylamines","description":"Organic compounds containing a dialkylamine group, characterized by two alkyl groups bonded to the amino nitrogen.","chemont_id":"CHEMONTID:0002228","url":"http:\/\/classyfire.wishartlab.com\/tax_nodes\/C0002228"},{"name":"Organopnictogen compounds","description":"Compounds containing a bond between carbon a pnictogen atom. Pnictogens are p-block element atoms that are in the group 15 of the periodic table.","chemont_id":"CHEMONTID:0004557","url":"http:\/\/classyfire.wishartlab.com\/tax_nodes\/C0004557"},{"name":"Hydrocarbon derivatives","description":"Derivatives of hydrocarbons obtained by substituting one or more carbon atoms by an heteroatom. They contain at least one carbon atom and heteroatom.","chemont_id":"CHEMONTID:0004150","url":"http:\/\/classyfire.wishartlab.com\/tax_nodes\/C0004150"}],"cf_ancestors":["Alpha-aminonitriles","Amines","Chemical entities","Dialkylamines","Hydrocarbon derivatives","Nitriles","Organic compounds","Organic cyanides","Organic nitrogen compounds","Organonitrogen compounds","Organopnictogen compounds","Secondary amines"],"cf_class":"Organonitrogen compounds","cf_classification_version":"2.1","cf_description":"This compound belongs to the class of organic compounds known as alpha-aminonitriles. These are organonitrogen compounds that contain an amino group located on the carbon at the position alpha to a carbonitrile group. They have the general formula RC(NH2)C#N, where the amine group can be substituted.","cf_direct_parent":{"name":"Alpha-aminonitriles","description":"Organonitrogen compounds that contain an amino group located on the carbon at the position alpha to a carbonitrile group. They have the general formula RC(NH2)C#N, where the amine group can be substituted.","chemont_id":"CHEMONTID:0004453","url":"http:\/\/classyfire.wishartlab.com\/tax_nodes\/C0004453"},"cf_external_descriptors":[],"cf_identifier":"Q5198051-1","cf_inchikey":"InChIKey=PVVRRUUMHFWFQV-UHFFFAOYSA-N","cf_intermediate_nodes":[{"name":"Nitriles","description":"Compounds having the structure RC#N; thus C-substituted derivatives of hydrocyanic acid, HC#N.","chemont_id":"CHEMONTID:0000362","url":"http:\/\/classyfire.wishartlab.com\/tax_nodes\/C0000362"}],"cf_kingdom":"Organic compounds","cf_molecular_framework":"Aliphatic acyclic compounds","cf_predicted_chebi_terms":["chemical entity (CHEBI:24431)","organic molecular entity (CHEBI:50860)","organonitrogen compound (CHEBI:35352)","secondary amino compound (CHEBI:50995)","nitrile (CHEBI:18379)","amine (CHEBI:32952)","secondary amine (CHEBI:32863)","cyanides (CHEBI:23424)","organic molecule (CHEBI:72695)","pnictogen molecular entity (CHEBI:33302)","nitrogen molecular entity (CHEBI:51143)"],"cf_predicted_lipidmaps_terms":[],"cf_smiles":"CNCC#N","cf_subclass":"Organic cyanides","cf_substituents":["Alpha-aminonitrile","Secondary amine","Secondary aliphatic amine","Organopnictogen compound","Hydrocarbon derivative","Amine","Aliphatic acyclic compound"],"cf_superclass":"Organic nitrogen compounds"}

    A visualization, "qm9_pie_labeled.png", shows the breakdown of superclasses within QM9 down to the subclass level.
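The per-molecule records described above can be read straight from the archive without unpacking it. The following is a minimal sketch, not the author's tooling: the helper names `iter_classifications` and `taxonomy_path` are hypothetical, and it assumes that each record is a regular file ending in ".json" and that the four taxonomy levels are plain strings, as in the example for molecule 000123.

```python
import json
import tarfile


def iter_classifications(archive_path=None, fileobj=None):
    """Yield (member_name, record) for each JSON member of a .tar.gz archive."""
    with tarfile.open(name=archive_path, fileobj=fileobj, mode="r:gz") as archive:
        for member in archive:
            # Assumption: every per-molecule record is a regular file ending in ".json".
            if member.isfile() and member.name.endswith(".json"):
                yield member.name, json.load(archive.extractfile(member))


def taxonomy_path(record):
    """Return kingdom, superclass, class, subclass in order, skipping missing levels."""
    levels = ("cf_kingdom", "cf_superclass", "cf_class", "cf_subclass")
    return [record[level] for level in levels if record.get(level)]
```

With the archive downloaded, `next(iter_classifications("QM9_jsons_classified.tar.gz"))` would return the first file name and its record, and `taxonomy_path(record)` the branch from kingdom down to subclass.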

  9. QM9S dataset

    • figshare.com
    txt
    Updated Dec 18, 2023
    Cite
    zihan zou (2023). QM9S dataset [Dataset]. http://doi.org/10.6084/m9.figshare.24235333.v3
    Explore at:
    Available download formats: txt
    Dataset updated
    Dec 18, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    zihan zou
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We constructed the QM9Spectra (QM9S) dataset using 130K organic molecules based on the popular QM9 dataset. We first re-optimized the molecular geometries using the Gaussian16 package (B.01 version) at the B3LYP/def-TZVP level of theory. The molecular properties, including scalars (energy, NPA charges, etc.), vectors (electric dipole, etc.), 2nd-order tensors (Hessian matrix, quadrupole moment, polarizability, etc.), and 3rd-order tensors (octupole moment, first hyperpolarizability, etc.), were then calculated at the same level. Frequency analysis and time-dependent density functional theory (TD-DFT) calculations were carried out at the same level to obtain the infrared, Raman, and UV-Vis spectra. Two versions of the dataset, .pt (torch_geometric version) and .csv, are provided for training and use. In addition, we also provide broadened spectra. When using this dataset, please cite the original article's DOI, https://doi.org/10.1038/s43588-023-00550-y, instead of the DOI provided by figshare.

  10. qm9

    • huggingface.co
    Updated Sep 8, 2024
    Cite
    Nima Shoghi (2024). qm9 [Dataset]. https://huggingface.co/datasets/nimashoghi/qm9
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 8, 2024
    Authors
    Nima Shoghi
    Description

    nimashoghi/qm9 dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. Indices of the QM9 molecules which are not present in either of the...

    • data.dtu.dk
    txt
    Updated Sep 29, 2023
    Cite
    Surajit Nandi; Tejs Vegge; Arghya Bhowmik (2023). Indices of the QM9 molecules which are not present in either of the molecular or reaction datasets. [Dataset]. http://doi.org/10.11583/DTU.21028468.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Sep 29, 2023
    Dataset provided by
    Technical University of Denmark
    Authors
    Surajit Nandi; Tejs Vegge; Arghya Bhowmik
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file contains the indices of QM9 molecules that are not present in either our molecular or our reaction datasets. They were excluded because their coordinates could not be converted to SMILES strings.
    This item is part of the collection MultiXC-QM9 with DOI: 10.11583/DTU.c.6185986

  12. modelforge curated dataset: QM9

    • zenodo.org
    application/gzip
    Updated May 12, 2025
    Cite
    Christopher Iacovella; John Chodera; Shuai Yan (2025). modelforge curated dataset: QM9 [Dataset]. http://doi.org/10.5281/zenodo.15390593
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    May 12, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Christopher Iacovella; John Chodera; Shuai Yan
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Modelforge Curated QM9 Dataset:
    - 1000 conformer test set
    - Version: nc_1000_v1.1:

    This provides a curated HDF5 file for a subset of the QM9 dataset to be used for testing purposes. It is designed to be compatible with modelforge, an infrastructure to implement and train neural network potentials (NNPs). This test dataset contains 1000 configurations, 1 for each unique system.

    When applicable, the units of properties are provided in the datafile, encoded as strings compatible with the openff-units package. For more information about the structure of the data file, please see the following:

    Properties Included:

    • atomic_numbers
    • positions
      • "per_atom"
      • "nanometer"
    • partial_charges
      • "per_atom"
      • "elementary_charge"
    • polarizability
      • "per_system"
      • "nanometer ** 3"
    • dipole_moment_per_system
      • "per_system"
      • "elementary_charge * nanometer"
    • dipole_moment_scalar_per_system
      • "per_system"
      • "elementary_charge * nanometer"
    • energy_of_homo
      • "per_system"
      • "kilojoule_per_mole"
    • lumo-homo_gap
      • "per_system"
      • "kilojoule_per_mole"
    • zero_point_vibrational_energy
      • "per_system"
      • "kilojoule_per_mole"
    • internal_energy_at_298.15K
      • "per_system"
      • "kilojoule_per_mole"
    • internal_energy_at_0K
      • "per_system"
      • "kilojoule_per_mole"
    • enthalpy_at_298.15K
      • "per_system"
      • "kilojoule_per_mole"
    • free_energy_at_298.15K
      • "per_system"
      • "kilojoule_per_mole"
    • heat_capacity_at_298.15K
      • "per_system"
      • "kilojoule_per_mole / kelvin"
    • rotational_constants
      • "per_system"
      • "gigahertz"
    • harmonic_vibrational_frequencies
      • "per_system"
      • "1 / centimeter"
    • electronic_spatial_extent
      • "per_system"
      • "nanometer ** 2"
    • smiles_gdb-17
      • "meta_data"
      • "meta_data"
    • inchi_corina
      • "meta_data"
    • inchi_b3lyp
      • "meta_data"
    • idx
      • "meta_data"
    • tag
      • "meta_data"
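The unit strings listed above (for example "kilojoule_per_mole" and "nanometer") differ from the kcal/mol and ångström conventions common in QM9 papers. As a rough sketch only, the helper below converts a value given one of those unit strings. The function name `to_common_units` and the table `_CONVERSIONS` are hypothetical, the factors are standard constants rather than anything read from the data file, and this does not use the openff-units package itself.

```python
# Hypothetical lookup table: unit string from the data file -> (factor, target unit).
# 1 kcal/mol = 4.184 kJ/mol (thermochemical calorie); 1 nm = 10 angstrom.
_CONVERSIONS = {
    "kilojoule_per_mole": (1.0 / 4.184, "kilocalorie_per_mole"),
    "nanometer": (10.0, "angstrom"),
}


def to_common_units(value, unit):
    """Convert value to kcal/mol or angstrom when a factor is known, else return it unchanged."""
    factor, new_unit = _CONVERSIONS.get(unit, (1.0, unit))
    return value * factor, new_unit
```

Units that are not in the table, such as "gigahertz", pass through untouched, so the helper can be applied to every property without special-casing.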

    Original Source:

    The QM9 dataset includes 133,885 organic molecules with up to nine heavy atoms (C, O, N, or F; excluding H), originally published by Ramakrishnan et al. Properties in the QM9 dataset were calculated at the B3LYP/6-31G(2df,p) level of quantum chemistry.

    Citations:

    Original publication:

    Source dataset, released with CC0 1.0 Universal license:

  13. QM9-XAS database of 56k QM9 small organic molecules labeled with TDDFT X-ray...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 14, 2023
    Cite
    Kotobi, Amir (2023). QM9-XAS database of 56k QM9 small organic molecules labeled with TDDFT X-ray absorption spectra [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8276901
    Explore at:
    Dataset updated
    Sep 14, 2023
    Dataset authored and provided by
    Kotobi, Amir
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Database for training graph neural network (GNN) models in Integrating Explainability into Graph Neural Network Models for the Prediction of X-ray Absorption Spectra, by Amir Kotobi, Kanishka Singh, Daniel Höche, Sadia Bari, Robert H. Meißner, and Annika Bande.

    Included:

    qm9_Cedge_xas_56k.npz: the TDDFT XAS spectra of 56k structures from the QM9 dataset, which were used to label the graph dataset. The file contains two key/value entries: spec_stk, a 2D array containing the energies and oscillator strengths of the XAS spectra, and id, which holds the indices of the QM9 structures. This data was used to create the QM9-XAS graph dataset.

    qm9xas_orca_output.zip: the raw ORCA output of TDDFT calculations for the 56k QM9-XAS dataset consists of excitation energies, densities, molecular orbitals, and other relevant information. This unprocessed output serves as a source to derive ground truth data for explaining the predictions made by GNNs.

    qm9xas_spec_train_val.pt: processed graph train/validation dataset of 50k QM9 structures. It is used as input to GNN models for training and validation.

    qm9xas_spec_test.pt: processed graph test dataset of 6k QM9 structures. It is used to test the performance of trained GNN models.
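As an illustration of the .npz layout described above, the sketch below reads the two entries with NumPy. The helper name `load_qm9_xas` is hypothetical. The key names `spec_stk` and `id` come from the description, but the array shapes and dtypes are not stated there, so the code only loads the arrays and leaves any interpretation of the axes to the reader.

```python
import numpy as np


def load_qm9_xas(path):
    """Return (spectra, qm9_ids) from an .npz file holding the keys 'spec_stk' and 'id'."""
    # np.load defaults to allow_pickle=False; if the arrays were saved with
    # dtype=object (an assumption we cannot check here), the lookup will raise.
    with np.load(path) as archive:
        spectra = archive["spec_stk"]
        qm9_ids = archive["id"]
    return spectra, qm9_ids
```

A call such as `spectra, ids = load_qm9_xas("qm9_Cedge_xas_56k.npz")` followed by `print(spectra.shape, ids.shape)` is a quick way to check the actual layout before building graph labels from it.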

    Notes on the datasets:

    The QM9-XAS dataset was created using ORCA electronic structure package [Neese, F., WIREs Computational Molecular Science 2012, 2, 73–78] to calculate carbon K-edge XAS spectra with the time-dependent density functional theory (TDDFT) method [Petersilka, M.; Gossmann, U. J.; Gross, E. K. U., Phys. Rev. Lett. 1996, 76, 1212–1215]

    The molecular structures of QM9-XAS datasets were sourced from the QM9 database [R. Ramakrishnan, P. O. Dral, M. Rupp, and O. A. Von Lilienfeld, Sci. Data 1, 1 (2014)].

    Funding:

    This research was funded by HIDA Trainee Network program, HAICU, Helmholtz AI-4-XAS, DASHH and HEIBRiDS graduate schools. For theoretical calculations and model training, computational resources at DESY and JFZ were used.

  14. qm9

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 8, 2023
    Cite
    Fu, Tianfan (2023). qm9 [Dataset]. http://doi.org/10.7910/DVN/8ZZZ6J
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Fu, Tianfan
    Description
  15. QM9x

    • figshare.com
    hdf
    Updated Aug 16, 2022
    Cite
    Mathias Schreiner (2022). QM9x [Dataset]. http://doi.org/10.6084/m9.figshare.20449701.v2
    Explore at:
    Available download formats: hdf
    Dataset updated
    Aug 16, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Mathias Schreiner
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    QM9x is a dataset that contains DFT calculations of energy and forces for all configurations in QM9, recalculated with the ωB97X functional and the 6-31G(d) basis set. Recalculating the energy and forces causes a slight shift of the potential energy surface, which results in forces acting on most configurations in the dataset.

    The choice of basis set and functional makes QM9x compatible with the Transition1x and ANI1x datasets.

    See https://arxiv.org/abs/2207.12858 for a comparison between ANI1x, QM9x and Transition1x.

    Dataloaders and example scripts are available at https://gitlab.com/matschreiner/QM9x

    Please cite as Transition1x - Force and Energy Calculations of Millions of Near-Transition State Molecular Configurations,

    and the original QM9 dataset.

  16. QM9 data for graph2mat

    • data.dtu.dk
    txt
    Updated Aug 6, 2024
    Cite
    Arghya Bhowmik (2024). QM9 data for graph2mat [Dataset]. http://doi.org/10.11583/DTU.26195282.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Aug 6, 2024
    Dataset provided by
    Technical University of Denmark
    Authors
    Arghya Bhowmik
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Creators

    • Pol Febrer (pol.febrer@icn2.cat, ORCID 0000-0003-0904-2234)
    • Peter Bjorn Jorgensen (peterbjorgensen@gmail.com, ORCID 0000-0003-4404-7276)
    • Arghya Bhowmik (arbh@dtu.dk, ORCID 0000-0003-3198-5116)

    Related publication

    The dataset is published as part of the paper: "GRAPH2MAT: UNIVERSAL GRAPH TO MATRIX CONVERSION FOR ELECTRON DENSITY PREDICTION" (https://doi.org/10.26434/chemrxiv-2024-j4g21)

    Short description

    This dataset contains the Hamiltonian, Overlap, Density and Energy Density matrices from SIESTA calculations of the QM9 dataset (https://doi.org/10.6084/m9.figshare.c.978904.v5)

    SIESTA 5.0.0 was used to compute the dataset.

    Contents

    The dataset has four directories:

    • basis: Contains the files specifying the basis used for each atom.
    • pseudos: Contains the pseudopotentials used for the calculation (obtained from http://www.pseudo-dojo.org/, type NC SR (ONCVPSP v0.5), PBE, standard accuracy)
    • runs: The results of running the SIESTA simulations. Contents are discussed next.
    • splits: The data splits used in the published paper. Each file "splits_X.json" contains the splits for training size X.

    The "runs" directory contains one directory for each run, named with the index of the run. Each directory contains:

    • RUN.fdf, geom.fdf: The input files used for the SIESTA calculation.
    • RUN.out: The log of the SIESTA run, which apar
    • siesta.TSDE: Contains the Density and Energy Density matrices.
    • siesta.TSHS: Contains the Hamiltonian and Overlap matrices.

    Each matrix can be read using the sisl python package (https://github.com/zerothi/sisl) like:

    import sisl
    
    matrix = sisl.get_sile("RUN.fdf").read_X()
    

    where X is hamiltonian, overlap, density_matrix or energy_density_matrix.

    To reproduce the results presented in the paper, follow the documentation of the graph2mat package (https://github.com/BIG-MAP/graph2mat).

    Cite this data

    https://doi.org/10.11583/DTU.c.7310005 © 2024 Technical University of Denmark

    License

    This dataset is published under the CC BY 4.0 license. This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator.

  17. data_qm9.pkl.tar.gz

    • figshare.com
    application/gzip
    Updated Jun 1, 2023
    Cite
    Kuang-Yu Samuel Chang (2023). data_qm9.pkl.tar.gz [Dataset]. http://doi.org/10.6084/m9.figshare.4293959.v2
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Kuang-Yu Samuel Chang
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    compressed python pickle file for qm9 dataset

  18. JARVIS-QM9-DGL

    • huggingface.co
    Cite
    ColabFit, JARVIS-QM9-DGL [Dataset]. https://huggingface.co/datasets/colabfit/JARVIS-QM9-DGL
    Explore at:
    Dataset authored and provided by
    ColabFit
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cite this dataset

    Ramakrishnan, R., Dral, P. O., Rupp, M., and Lilienfeld, O. A. JARVIS-QM9-DGL. ColabFit, 2023. https://doi.org/10.60732/403cd4f2

    View on the ColabFit Exchange: https://materials.colabfit.org/id/DS_tat5i46x3hkr_0

    Dataset Name: JARVIS-QM9-DGL

    Description: The JARVIS-QM9-DGL dataset is part of the joint automated repository for various integrated simulations (JARVIS) database. This dataset contains configurations from the QM9 dataset… See the full description on the dataset page: https://huggingface.co/datasets/colabfit/JARVIS-QM9-DGL.

  19. QM9_ADiT

    • huggingface.co
    Updated May 23, 2025
    Cite
    Chaitanya K. Joshi (2025). QM9_ADiT [Dataset]. https://huggingface.co/datasets/chaitjo/QM9_ADiT
    Explore at:
    Dataset updated
    May 23, 2025
    Authors
    Chaitanya K. Joshi
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    All-atom Diffusion Transformers - QM9 dataset

    QM9 dataset from the paper "All-atom Diffusion Transformers: Unified generative modelling of molecules and materials", by Chaitanya K. Joshi, Xiang Fu, Yi-Lun Liao, Vahe Gharakhanyan, Benjamin Kurt Miller, Anuroop Sriram*, and Zachary W. Ulissi* from FAIR Chemistry at Meta (* Joint last author). Original data source: https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.datasets.QM9.html (Adapted from MoleculeNet)… See the full description on the dataset page: https://huggingface.co/datasets/chaitjo/QM9_ADiT.

  20. OD9_0 (union of PC9 and QM9) dataset dictionnary with SMILES and ECFP4...

    • figshare.com
    application/x-gzip
    Updated Feb 28, 2023
    Cite
    Thomas Cauchy (2023). OD9_0 (union of PC9 and QM9) dataset dictionnary with SMILES and ECFP4 connectivity score [Dataset]. http://doi.org/10.6084/m9.figshare.20054339.v2
    Explore at:
    Available download formats: application/x-gzip
    Dataset updated
    Feb 28, 2023
    Dataset provided by
    figshare
    Authors
    Thomas Cauchy
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    JSON file of a Python dictionary. Key: SMILES; value: dict {'HAC' # number of heavy atoms, 'swscore_ChEMBL' # % of ECFP4 of the molecule that belong to ChEMBL, 'swscore_ZINC' # % of ECFP4 of the molecule that belong to ZINC or ChEMBL}. Only neutral singlet molecules without any atomic charges (formal or real), composed of H, C, N, O, and F, are included.
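A dictionary with this layout can be filtered with a few lines of standard-library Python. The sketch below is illustrative only: the helper names are hypothetical and the real file name is not given in the listing. It relies only on the three keys named in the description ('HAC', 'swscore_ChEMBL', 'swscore_ZINC').

```python
import json


def load_entries(json_text):
    """Parse the SMILES -> properties dictionary from a JSON string."""
    return json.loads(json_text)


def filter_by_heavy_atoms(entries, max_heavy_atoms):
    """Keep molecules whose heavy-atom count ('HAC') is at most max_heavy_atoms."""
    return {
        smiles: props
        for smiles, props in entries.items()
        if props["HAC"] <= max_heavy_atoms
    }


def mean_score(entries, key="swscore_ZINC"):
    """Average one of the connectivity scores over all entries; None if empty."""
    if not entries:
        return None
    return sum(props[key] for props in entries.values()) / len(entries)
```

After decompressing the download, passing the file contents to `load_entries` and then calling `filter_by_heavy_atoms(entries, 5)` would give the subset with at most five heavy atoms.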
