3 datasets found
  1. ab initio REPEAT Charge MOF Database (ARC-MOF)

    • data.niaid.nih.gov
    • zenodo.org
    Updated Oct 12, 2024
    Maley, Stephen (2024). ab initio REPEAT Charge MOF Database (ARC-MOF) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6908727
    Dataset updated
    Oct 12, 2024
    Dataset provided by
    Woo, Tom K.
    White, Andrew
    Maley, Stephen
    Mirmiran, Adam
    Luo, Jun
    Kwon, Ohmin
    Boyd, Peter G.
    Simrod, Scott
    Gibaldi, Marco
    Burner, Jake
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a database of ~280,000 MOFs which have been either experimentally characterized or computationally generated, spanning all publicly available MOF databases. DFT-derived REPEAT charges, adsorption data, and various descriptors are available for all MOFs.

    all_structures_1.tar.gz and all_structures_2.tar.gz – these are the cif files that were considered to compose the “entire known design space” of MOFs, with any bad structures removed (split into two separate tarballs since it is a lot of data).

    ARCMOF_20220610.tar.gz – these are all of the cif files with REPEAT charges composing ARC-MOF.
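
The structure tarballs described above can be inspected without unpacking them fully. The following is a minimal sketch using Python's standard `tarfile` module; the helper name `list_cif_members` is hypothetical, and the internal directory layout of the archives is not documented here, so the function simply matches on the `.cif` extension.

```python
import tarfile


def list_cif_members(archive_path):
    """Return the sorted names of all .cif files in a gzip-compressed tarball."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            # Keep regular files only; match the extension case-insensitively.
            if member.isfile() and member.name.lower().endswith(".cif")
        )
```

Assuming the archive has been downloaded to the working directory, a call such as `list_cif_members("ARCMOF_20220610.tar.gz")` would return the cif paths stored in it.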

    flig-clusters.csv, func-clusters.csv, geo-clusters.csv, mc-clusters.csv – Each file indicates for each MOF which cluster it belongs to, and whether the MOF is present in ARC-MOF. This is done for each "type" of MOF chemistry and for the geometric properties. Clusters with a negative value indicate the MOF does not belong to any cluster (i.e., it is assumed to be "unique").
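
As an illustration of the negative-cluster convention described above, the sketch below separates clustered MOFs from "unique" ones. The column names `filename`, `cluster`, and `in_ARCMOF` and the sample rows are assumptions made for this example; the actual headers in the cluster files may differ.

```python
import csv
import io

# Placeholder rows with hypothetical column names, not real ARC-MOF data.
SAMPLE = """filename,cluster,in_ARCMOF
mof_a.cif,0,True
mof_b.cif,0,False
mof_c.cif,-1,True
mof_d.cif,3,True
"""


def split_by_cluster(csv_text):
    """Group MOFs by cluster id; a negative id marks a MOF as unique."""
    clustered = {}
    unique = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        cluster_id = int(row["cluster"])
        if cluster_id < 0:
            unique.append(row["filename"])
        else:
            clustered.setdefault(cluster_id, []).append(row["filename"])
    return clustered, unique


clustered, unique = split_by_cluster(SAMPLE)
print("unique MOFs:", unique)
print("cluster ids:", sorted(clustered))
```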

    all_topology_lists.csv – a csv file containing the topology given in the filename of applicable structures, and the topology reported by CrystalNets.jl.

    ML_test_set.tar.gz – these are the cif files (with REPEAT charges) of the MOFs in the diverse-mc subset, but missing from ARC-MOF (for the purposes of a ML test set for the prediction of metal charges).

    geometric_properties.csv – a csv file containing geometric descriptors computed for this study for all MOFs. The csv file also indicates which MOFs are present in ARC-MOF, and the order in which they were chosen for the farthest point sampling (up to 100K MOFs).

    RACs.csv – See geometric_properties.csv description. Same type of file, but with the RAC descriptors.

    RDFs.csv – The RDFs for each MOF, using several atomic properties. Some atomic properties are not available for all elements. In the cases where the atomic property is not available for a particular structure, no value is assigned.

    methane.csv, methane_purification-CH4.csv, methane_purification_CO2.csv, post_comb_vsa-CO2.csv, post_comb_vsa-N2.csv, pre_comb_4040-CO2.csv, pre_comb_4040-H2.csv, landfill-CH4.csv, landfill-CO2.csv – these are csv files of the raw uptake data at various temperature and pressure conditions (with standard deviations) for each gas separation process specified in the file overall_process.csv.

    overall_process.csv – a csv file of the adsorption properties of the MOFs. Specifically, the file contains the working capacity (mmol/g_working_capacity) and selectivity of each MOF for each of the five process conditions.
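
One way the working capacity and selectivity values might be used is to shortlist MOFs for a given process. The sketch below is illustrative only: the column names `filename` and `selectivity`, the use of `mmol/g_working_capacity` as a literal header, and all numeric values are assumptions, not taken from overall_process.csv.

```python
import csv
import io

# Placeholder values with assumed column names, not real ARC-MOF data.
SAMPLE = """filename,mmol/g_working_capacity,selectivity
mof_a.cif,2.5,40.0
mof_b.cif,4.0,12.0
mof_c.cif,3.1,55.0
"""


def rank_mofs(csv_text, min_selectivity):
    """Keep MOFs meeting a selectivity threshold, sorted by working capacity."""
    rows = [
        (
            row["filename"],
            float(row["mmol/g_working_capacity"]),
            float(row["selectivity"]),
        )
        for row in csv.DictReader(io.StringIO(csv_text))
    ]
    kept = [r for r in rows if r[2] >= min_selectivity]
    # Highest working capacity first.
    return sorted(kept, key=lambda r: r[1], reverse=True)


for name, capacity, selectivity in rank_mofs(SAMPLE, min_selectivity=30.0):
    print(name, capacity, selectivity)
```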

    mc-diverse-set.csv, func-diverse-set.csv – csv files containing which MOFs are present in each diverse set (from farthest point sampling of the MOFs based on either their functional group chemistry or metal chemistry). The file indicates which MOFs are present in ARC-MOF and which are not.
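
For readers unfamiliar with farthest point sampling, the following is a generic greedy sketch in pure Python. It is not the implementation used to build the diverse sets; the Euclidean distance metric, the choice of starting point, and the function name are all assumptions for illustration.

```python
import math


def farthest_point_sampling(points, n_samples, start=0):
    """Greedy FPS: repeatedly pick the point farthest from all chosen points.

    Returns the indices of the selected points in the order they were chosen.
    """
    if not points or n_samples <= 0:
        return []
    n_samples = min(n_samples, len(points))
    selected = [start]
    chosen = {start}
    # Distance from each point to its nearest already-selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < n_samples:
        next_idx = max(
            (i for i in range(len(points)) if i not in chosen),
            key=lambda i: min_dist[i],
        )
        selected.append(next_idx)
        chosen.add(next_idx)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[next_idx]))
    return selected
```

In this sketch each point would be a descriptor vector for one MOF, and the returned order plays the same role as the selection order mentioned for geometric_properties.csv.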

    Version history of repository:

    v2 -- added file: "all_topology_lists.csv"

    v3 -- added file: "ML_test_set.tar.gz"

    v4 -- replaced file: "ML_test_set.tar.gz". The original file contained an incorrect set of cifs.

    v5 -- A slightly updated version of ARC-MOF has been provided. Some MOFs were removed from ARC-MOF due to structural errors. Some MOFs in ARC-MOF containing Sm were updated, as they had incorrectly assigned charges. Additional MOFs from all_structures containing Sm were added to ARC-MOF.

    v6 -- Updated version of ARC-MOF. Removed all m29 structures from the Boyd-Woo database, since the inorganic SBU is not known to exist.

  2. ab initio REPEAT Charge MOF Database (ARC-MOF)

    • zenodo.org
    application/gzip, csv
    Updated Feb 2, 2023
    Jake Burner; Jun Luo; Andrew White; Adam Mirmiran; Ohmin Kwon; Peter G. Boyd; Stephen Maley; Marco Gibaldi; Scott Simrod; Tom K. Woo (2023). ab initio REPEAT Charge MOF Database (ARC-MOF) [Dataset]. http://doi.org/10.5281/zenodo.6963079
    Available download formats: csv, application/gzip
    Dataset updated
    Feb 2, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jake Burner; Jun Luo; Andrew White; Adam Mirmiran; Ohmin Kwon; Peter G. Boyd; Stephen Maley; Marco Gibaldi; Scott Simrod; Tom K. Woo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a database of ~280,000 MOFs which have been either experimentally characterized or computationally generated, spanning all publicly available MOF databases. DFT-derived REPEAT charges, adsorption data, and various descriptors are available for all MOFs.

    • all_structures_1.tar.gz and all_structures_2.tar.gz – these are the cif files that were considered to compose the “entire known design space” of MOFs, with any bad structures removed (split into two separate tarballs since it is a lot of data).
    • ARCMOF_20220610.tar.gz – these are all of the cif files with REPEAT charges composing ARC-MOF.
    • flig-clusters.csv, func-clusters.csv, geo-clusters.csv, mc-clusters.csv – Each file indicates for each MOF which cluster it belongs to, and whether the MOF is present in ARC-MOF. This is done for each "type" of MOF chemistry and for the geometric properties. Clusters with a negative value indicate the MOF does not belong to any cluster (i.e., it is assumed to be "unique").
    • all_topology_lists.csv – a csv file containing the topology given in the filename of applicable structures, and the topology reported by CrystalNets.jl.
    • geometric_properties.csv – a csv file containing geometric descriptors computed for this study for all MOFs. The csv file also indicates which MOFs are present in ARC-MOF, and the order in which they were chosen for the farthest point sampling (up to 100K MOFs).
    • RACs.csv – See geometric_properties.csv description. Same type of file, but with the RAC descriptors.
    • RDFs.csv – The RDFs for each MOF, using several atomic properties. Some atomic properties are not available for all elements. In the cases where the atomic property is not available for a particular structure, no value is assigned.
    • methane.csv, methane_purification-CH4.csv, methane_purification_CO2.csv, post_comb_vsa-CO2.csv, post_comb_vsa-N2.csv, pre_comb_4040-CO2.csv, pre_comb_4040-H2.csv, landfill-CH4.csv, landfill-CO2.csv – these are csv files of the raw uptake data at various temperature and pressure conditions (with standard deviations) for each gas separation process specified in the file overall_process.csv.
    • overall_process.csv – a csv file of the adsorption properties of the MOFs. Specifically, the file contains the working capacity (mmol/g_working_capacity) and selectivity of each MOF for each of the five process conditions.
    • mc-diverse-set.csv, func-diverse-set.csv – csv files containing which MOFs are present in each diverse set (from farthest point sampling of the MOFs based on either their functional group chemistry or metal chemistry). The file indicates which MOFs are present in ARC-MOF and which are not.
  3. ab initio REPEAT Charge MOF Database (ARC-MOF)

    • zenodo.org
    application/gzip, bin +1
    Updated Feb 2, 2023
    Jake Burner; Jun Luo; Andrew White; Adam Mirmiran; Ohmin Kwon; Peter G. Boyd; Stephen Maley; Marco Gibaldi; Scott Simrod; Tom K. Woo (2023). ab initio REPEAT Charge MOF Database (ARC-MOF) [Dataset]. http://doi.org/10.5281/zenodo.6908728
    Available download formats: bin, csv, application/gzip
    Dataset updated
    Feb 2, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jake Burner; Jun Luo; Andrew White; Adam Mirmiran; Ohmin Kwon; Peter G. Boyd; Stephen Maley; Marco Gibaldi; Scott Simrod; Tom K. Woo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a database of ~280,000 MOFs which have been either experimentally characterized or computationally generated, spanning all publicly available MOF databases. More information on this dataset is given in our preprint (DOI: 10.26434/chemrxiv-2022-mvr06); please cite it if you use ARC-MOF. DFT-derived REPEAT charges, adsorption data, and various descriptors are available for all MOFs. A description of the files is given here (please ignore the README):

    • all_structures_1.tar.gz and all_structures_2.tar.gz – these are the cif files that were considered to compose the “entire known design space” of MOFs, with any bad structures removed (split into two separate tarballs since it is a lot of data).
    • ARCMOF_20220610.tar.gz – these are all of the cif files with REPEAT charges composing ARC-MOF.
    • flig-clusters.csv, func-clusters.csv, geo-clusters.csv, mc-clusters.csv – Each file indicates for each MOF which cluster it belongs to, and whether the MOF is present in ARC-MOF. This is done for each "type" of MOF chemistry and for the geometric properties. Clusters with a negative value indicate the MOF does not belong to any cluster (i.e., it is assumed to be "unique").
    • geometric_properties.csv – a csv file containing geometric descriptors computed for this study for all MOFs. The csv file also indicates which MOFs are present in ARC-MOF, and the order in which they were chosen for the farthest point sampling (up to 100K MOFs).
    • RACs.csv – See geometric_properties.csv description. Same type of file, but with the RAC descriptors.
    • RDFs.csv – The RDFs for each MOF, using several atomic properties. Some atomic properties are not available for all elements. In the cases where the atomic property is not available for a particular structure, no value is assigned.
    • methane.csv, methane_purification-CH4.csv, methane_purification_CO2.csv, post_comb_vsa-CO2.csv, post_comb_vsa-N2.csv, pre_comb_4040-CO2.csv, pre_comb_4040-H2.csv, landfill-CH4.csv, landfill-CO2.csv – these are csv files of the raw uptake data at various temperature and pressure conditions (with standard deviations) for each gas separation process specified in the file overall_process.csv.
    • overall_process.csv – a csv file of the adsorption properties of the MOFs. Specifically, the file contains the working capacity (mmol/g_working_capacity) and selectivity of each MOF for each of the five process conditions.
    • mc-diverse-set.csv, func-diverse-set.csv – csv files containing which MOFs are present in each diverse set (from farthest point sampling of the MOFs based on either their functional group chemistry or metal chemistry). The file indicates which MOFs are present in ARC-MOF and which are not.