7 datasets found
  1. qm9

    • tensorflow.org
    Updated Dec 11, 2024
    Cite
    (2024). qm9 [Dataset]. http://doi.org/10.6084/m9.figshare.c.978904.v5
    Dataset updated
    Dec 11, 2024
    Description

    QM9 consists of computed geometric, energetic, electronic, and thermodynamic properties for 134k stable small organic molecules made up of C, H, O, N, and F. As usual, we remove the uncharacterized molecules and provide the remaining 130,831.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('qm9', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

  2. QM9-Dataset

    • huggingface.co
    Updated Oct 1, 2024
    + more versions
    Cite
    Reza Hemmati (2024). QM9-Dataset [Dataset]. https://huggingface.co/datasets/HR-machine/QM9-Dataset
    Available in Croissant, a format for machine-learning datasets (see mlcommons.org/croissant).
    Dataset updated
    Oct 1, 2024
    Authors
    Reza Hemmati
    Description

    HR-machine/QM9-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. Quantum Machine 9, aka QM9

    • kaggle.com
    zip
    Updated Jun 12, 2019
    Cite
    nosound (2019). Quantum Machine 9, aka QM9 [Dataset]. https://www.kaggle.com/zaharch/quantum-machine-9-aka-qm9
    Available download formats: zip (282580282 bytes)
    Dataset updated
    Jun 12, 2019
    Authors
    nosound
    Description

    Downloaded from: http://quantum-machine.org/datasets/

    Abstract

    Computational de novo design of new drugs and materials requires rigorous and unbiased exploration of chemical compound space. However, large uncharted territories persist due to its size scaling combinatorially with molecular size. We report computed geometric, energetic, electronic, and thermodynamic properties for 134k stable small organic molecules made up of CHONF. These molecules correspond to the subset of all 133,885 species with up to nine heavy atoms (CONF) out of the GDB-17 chemical universe of 166 billion organic molecules. We report geometries minimal in energy, corresponding harmonic frequencies, dipole moments, polarizabilities, along with energies, enthalpies, and free energies of atomization. All properties were calculated at the B3LYP/6-31G(2df,p) level of quantum chemistry. Furthermore, for the predominant stoichiometry, C7H10O2, there are 6,095 constitutional isomers among the 134k molecules. We report energies, enthalpies, and free energies of atomization at the more accurate G4MP2 level of theory for all of them. As such, this data set provides quantum chemical properties for a relevant, consistent, and comprehensive chemical space of small organic molecules. This database may serve the benchmarking of existing methods, development of new methods, such as hybrid quantum mechanics/machine learning, and systematic identification of structure-property relationships.

    Download: Available via figshare.

    How to cite: When using this dataset, please make sure to cite the following two papers:

    L. Ruddigkeit, R. van Deursen, L. C. Blum, J.-L. Reymond, Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17, J. Chem. Inf. Model. 52, 2864–2875, 2012.

    R. Ramakrishnan, P. O. Dral, M. Rupp, O. A. von Lilienfeld, Quantum chemistry structures and properties of 134 kilo molecules, Scientific Data 1, 140022, 2014.

  4. QM9S dataset

    • figshare.com
    txt
    Updated Dec 18, 2023
    Cite
    zihan zou (2023). QM9S dataset [Dataset]. http://doi.org/10.6084/m9.figshare.24235333.v3
    Available download formats: txt
    Dataset updated
    Dec 18, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    zihan zou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We constructed the QM9Spectra (QM9S) dataset using 130K organic molecules based on the popular QM9 dataset. We first re-optimized the molecular geometries using the Gaussian16 package (B.01 version) at the B3LYP/def-TZVP level of theory. Then the molecular properties, including scalars (energy, NPA charges, etc.), vectors (electric dipole, etc.), 2nd order tensors (Hessian matrix, quadrupole moment, polarizability, etc.), and 3rd order tensors (octupole moment, first hyperpolarizability, etc.), were calculated at the same level. The frequency analysis and time-dependent density functional theory (TD-DFT) calculations were carried out at the same level to obtain the infrared, Raman, and UV-Vis spectra. Two versions of the dataset, .pt (torch_geometric version) and .csv, are provided for training and use. In addition, we also provide broadened spectra. When using this dataset, please cite the original article's DOI, https://doi.org/10.1038/s43588-023-00550-y, instead of the DOI provided by figshare.

  5. GEOM

    • dataverse.harvard.edu
    • datasetcatalog.nlm.nih.gov
    • +2 more
    Updated Feb 11, 2022
    Cite
    Simon Axelrod; Rafael Gomez-Bombarelli (2022). GEOM [Dataset]. http://doi.org/10.7910/DVN/JNGTDF
    Available in Croissant, a format for machine-learning datasets (see mlcommons.org/croissant).
    Dataset updated
    Feb 11, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Simon Axelrod; Rafael Gomez-Bombarelli
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Here you can find GEOM, a dataset with over 37 million molecular conformations annotated by energy and statistical weight for over 450,000 molecules. Over 317,000 species contain experimental data related to biophysics, physiology, and physical chemistry, and the remaining 133,000 species are from the QM9 dataset.

  6. QM9-extended-plus database

    • zenodo.org
    csv
    Updated Nov 29, 2023
    + more versions
    Cite
    Bartłomiej Fliszkiewicz; Marcin Sajdak (2023). QM9-extended-plus database [Dataset]. http://doi.org/10.5281/zenodo.10184793
    Available download formats: csv
    Dataset updated
    Nov 29, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Bartłomiej Fliszkiewicz; Marcin Sajdak
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    The QM9-extended database was further extended with 1781 compounds containing chlorine atoms and 2020 compounds containing bromine atoms.

  7. deMon QM9 features

    • kaggle.com
    zip
    Updated Aug 28, 2019
    Cite
    Russ Wolfinger (2019). deMon QM9 features [Dataset]. https://www.kaggle.com/datasets/sasrdw/demon-qm9-features
    Available download formats: zip (330070115 bytes)
    Dataset updated
    Aug 28, 2019
    Authors
    Russ Wolfinger
    Description

    Here are shielding and J-coupling features created by the quantum chemistry package deMon, using its freely downloadable binary with default settings, over the QM9 set of molecules used in the Predicting Molecular Properties competition. These features would be considered forbidden for this competition because they are based on quantum calculations, but they appear to help with predictions using boosted tree and neural net models. They took around 2.5 days to compute in parallel on two different Linux boxes with 14 CPU cores each (files have '_even' and '_odd' suffixes). Python code to import them:

    import pandas as pd

    root = "../"  # directory containing the deMon CSV files

    # J-coupling features (pair level): load both halves and concatenate.
    demon_odd = pd.read_csv(root+'deMon_jcoupling_odd.csv')
    print(demon_odd.columns, demon_odd.shape)
    demon_even = pd.read_csv(root+'deMon_jcoupling_even.csv')
    print(demon_even.columns, demon_even.shape)
    demonj = pd.concat((demon_even,demon_odd))
    print(demonj.columns, demonj.shape)
    
    # Shielding features (atom level): load both halves and concatenate.
    demon_odd = pd.read_csv(root+'deMon_shielding_odd.csv')
    print(demon_odd.columns, demon_odd.shape)
    demon_even = pd.read_csv(root+'deMon_shielding_even.csv')
    print(demon_even.columns, demon_even.shape)
    demons = pd.concat((demon_even,demon_odd))
    print(demons.columns, demons.shape)
    

    The shielding values are at the atom level and the J-coupling values are at the pair level. Use molecule_name and atom indices when merging, since the molecules are not in the same order as the original data. Also, deMon did not produce results for a few of the molecules, so the features will be missing for them.
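
The merge advice above can be illustrated with a minimal sketch. It uses tiny in-memory tables instead of the real files, and every column name other than molecule_name (atom_index_0, atom_index_1, demon_j) is an assumption made for illustration; check the actual column names printed by the import code before adapting it. A left merge keeps every original pair and leaves NaN where deMon produced no result.

```python
import pandas as pd

# Hypothetical stand-in for the original pair-level data.
pairs = pd.DataFrame({
    "molecule_name": ["mol_a", "mol_a", "mol_b"],
    "atom_index_0": [1, 2, 1],   # assumed column name
    "atom_index_1": [0, 0, 0],   # assumed column name
})

# Hypothetical stand-in for the concatenated J-coupling features:
# rows are in a different order and mol_b is missing entirely.
demonj = pd.DataFrame({
    "molecule_name": ["mol_a", "mol_a"],
    "atom_index_0": [2, 1],
    "atom_index_1": [0, 0],
    "demon_j": [3.5, 7.25],      # assumed feature column name
})

# Merge on the molecule name plus both atom indices, never on row position.
keys = ["molecule_name", "atom_index_0", "atom_index_1"]
merged = pairs.merge(demonj, on=keys, how="left", validate="many_to_one")

# Pairs for which no deMon feature was found end up as NaN.
n_missing = int(merged["demon_j"].isna().sum())
print(merged)
print("pairs without deMon features:", n_missing)
```

The atom-level shielding table would be merged the same way, with the molecule name and a single atom index as keys. The validate argument makes pandas raise an error if the feature table contains duplicate keys, which would otherwise silently multiply rows.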

