Two-dimensional (2D) materials are among the most promising candidates for beyond-silicon electronic and optoelectronic applications. Their recognized importance has recently sparked a race to discover and characterize new 2D materials. Within a few years, the number of experimentally exfoliated or synthesized 2D materials went from a couple of dozen to a few hundred, while the number of theoretically predicted compounds reached a few thousand. In 2018 we first contributed to this effort with the identification of 1825 compounds that are either easily (1036) or potentially (789) exfoliable from experimentally known 3D compounds. In the present work we report on the new materials recently added to the 2D portfolio thanks to the extension of the screening to an additional experimental database (MPDS) as well as to the most up-to-date versions of the two databases (ICSD and COD) used in our previous work. This expansion led to the discovery of an additional 1252 unique monolayers, bringing the total to 3077 compounds and, notably, almost doubling the number of easily exfoliable materials (2004). Moreover, we optimized the structural properties of all the materials (regardless of their binding energy or number of atoms in the unit cell) as isolated monolayers and explored their electronic band structure. This archive entry contains the database of 2D materials. In particular, it contains the structural parameters for all 3077 structures of the global Materials Cloud 2D database as extracted from their bulk 3D parents, 2710 optimized 2D structures, and 2345 electronic band structures, together with the provenance of all data and calculations as stored by AiiDA.
The Materials Cloud three-dimensional database (MC3D) is a curated set of relaxed three-dimensional crystal structures based on raw CIF data taken from the external experimental databases MPDS, COD and ICSD. The raw CIF data have been imported, cleaned and parsed into crystal structures; their ground state has been computed using the SIRIUS-enabled pw.x code of the Quantum ESPRESSO distribution, with tight tolerance criteria for the calculations based on the SSSP protocols. This entire procedure is encoded in an AiiDA workflow, which automates the process while keeping full data provenance. Since the original source data of the ICSD and MPDS databases are copyrighted, only the provenance of the final SCF calculation on the relaxed structures can be made publicly available. The MC3D ID numbers come from a list of unique "parent" stoichiometric structures that has been created and curated from a collection of these experimental databases. Once a parent structure has been optimized using density-functional theory, it is made public and added to the online Discover section of the Materials Cloud (as mentioned, copyright might prevent publishing the original parent). Note that since not all structures have been calculated, some ID numbers are missing from the public version of the database. The full ID of each structure also contains, as an appended modifier, the functional that was used in the calculations. Since the ID number points to the same unique parent, mc3d-1234/pbe and mc3d-1234/pbesol have the same starting point, but have then been relaxed according to their respective functionals.
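The ID convention described above (a parent number plus an appended functional modifier) can be illustrated with a minimal Python sketch. The helper names `parse_mc3d_id` and `same_parent`, and the exact pattern accepted, are assumptions made for illustration based on the two example IDs in the text; they are not part of any Materials Cloud or AiiDA API.

```python
import re
from typing import NamedTuple


class MC3DId(NamedTuple):
    """Parsed form of a full ID such as 'mc3d-1234/pbe' (hypothetical helper type)."""
    number: int       # ID number of the unique parent structure
    functional: str   # appended modifier naming the functional, e.g. 'pbe'


# Assumed pattern, inferred only from the examples 'mc3d-1234/pbe' and 'mc3d-1234/pbesol'.
_ID_PATTERN = re.compile(r"^mc3d-(\d+)/([a-z0-9]+)$")


def parse_mc3d_id(full_id: str) -> MC3DId:
    """Split a full ID into its parent number and functional modifier."""
    match = _ID_PATTERN.match(full_id)
    if match is None:
        raise ValueError(f"not a recognised MC3D-style ID: {full_id!r}")
    return MC3DId(number=int(match.group(1)), functional=match.group(2))


def same_parent(id_a: str, id_b: str) -> bool:
    """True when two full IDs point to the same unique parent structure."""
    return parse_mc3d_id(id_a).number == parse_mc3d_id(id_b).number


if __name__ == "__main__":
    print(parse_mc3d_id("mc3d-1234/pbe"))
    print(same_parent("mc3d-1234/pbe", "mc3d-1234/pbesol"))
```

Under these assumptions, two entries that differ only in the functional modifier compare as sharing a parent, which mirrors the statement that they have the same starting point but different relaxations.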
Maximally-localised Wannier functions (MLWFs) are routinely used to compute from first principles advanced materials properties that require very dense Brillouin zone integration, and to build accurate tight-binding models for scale-bridging simulations. At the same time, high-throughput (HT) computational materials design is an emergent field that promises to accelerate the reliable and cost-effective design and optimisation of new materials with target properties. The use of MLWFs in HT workflows has been hampered by the fact that generating MLWFs automatically and robustly, without any user intervention and for arbitrary materials, is in general very challenging. We address this problem directly by proposing a procedure for automatically generating MLWFs for HT frameworks. Our approach is based on the selected columns of the density matrix (SCDM) method, and we present the details of its implementation in an AiiDA workflow. We apply our approach to a dataset of 200 bulk crystalline materials that span a wide structural and chemical space. We assess the quality of our MLWFs in terms of the accuracy of the band-structure interpolation that they provide, as compared to the band structure obtained via full first-principles calculations. We provide here an AiiDA export file with the full provenance of all simulations run in the project. Moreover, we provide a downloadable virtual machine that allows users to reproduce the results of this paper and also to run new calculations for different materials, including all first-principles and atomistic simulations and the computational workflows.
Crystal-graph attention networks have emerged recently as remarkable tools for the prediction of thermodynamic stability and materials properties from unrelaxed crystal structures. Previous networks trained on two million materials exhibited, however, strong biases originating from underrepresented chemical elements and structural prototypes in the available data. We tackled this issue by computing additional data to provide better balance across both chemical and crystal-symmetry space. Crystal-graph networks trained with this new data show unprecedented generalization accuracy, and allow for reliable, accelerated exploration of the whole space of inorganic compounds. We applied this universal network to perform machine-learning-assisted high-throughput materials searches including 2500 binary and ternary prototypes and spanning about 1 billion compounds. After validation using density-functional theory, we uncover in total 19512 additional materials on the convex hull of thermodynamic stability and around 150000 compounds with a distance of less than 50 meV/atom from the hull. Here we include the DCGAT-1, DCGAT-2, and DCGAT-3 datasets used in this work.
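To make the notion of "distance from the convex hull" concrete, the following is a minimal Python sketch for the simplest possible case, a binary A-B system. It is not the procedure used to build these datasets: the function names, the toy formation energies, and the restriction to two elements are all assumptions made for illustration. Compositions are given as the fraction x of element B, and energies as formation energies in eV/atom relative to the elemental references.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (fraction of B, formation energy in eV/atom)


def _cross(o: Point, a: Point, b: Point) -> float:
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(entries: List[Point]) -> List[Point]:
    """Lower convex hull of (composition, energy) points via a monotone chain."""
    lowest: Dict[float, float] = {}
    for x, e in entries:  # keep only the lowest-energy entry per composition
        lowest[x] = min(e, lowest.get(x, e))
    hull: List[Point] = []
    for p in sorted(lowest.items()):
        # Drop the last vertex while it lies on or above the new segment.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def hull_energy(x: float, hull: List[Point]) -> float:
    """Energy of the hull at composition x, by linear interpolation."""
    for (x0, e0), (x1, e1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            return e0 + (x - x0) / (x1 - x0) * (e1 - e0)
    raise ValueError("composition outside the range covered by the hull")


def energy_above_hull(x: float, energy: float, hull: List[Point]) -> float:
    """Distance of an entry from the hull; zero for entries on the hull."""
    return energy - hull_energy(x, hull)


if __name__ == "__main__":
    # Toy data (invented for illustration), including both elemental end points.
    data = [(0.0, 0.0), (0.25, -0.1), (0.5, -0.4), (0.75, -0.3), (1.0, 0.0)]
    hull = lower_hull(data)
    for x, e in data:
        print(x, round(energy_above_hull(x, e, hull), 6))
```

In this toy example, the entry at x = 0.25 lies above the tie-line between the pure element A and the compound at x = 0.5, so it has a positive distance from the hull, while the other entries sit on it. A threshold such as the 50 meV/atom quoted above would then be applied to this distance.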
This record is maintained in the National Geologic Map Database (NGMDB). The NGMDB is a Congressionally mandated national archive of geoscience maps, reports, and stratigraphic information, developed according to standards defined by the cooperators, i.e., the USGS and the Association of American State Geologists (AASG). Included in this system is a comprehensive set of publication citations, stratigraphic nomenclature, downloadable content, unpublished source information, and guidance on standards development. The NGMDB contains information on more than 90,000 maps and related geoscience reports published from the early 1800s to the present day, by more than 630 agencies, universities, associations, and private companies. For more information, please see http://ngmdb.usgs.gov/.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
A 20-minute lightning talk given by Aliaksandr Yakutovich, from École Polytechnique Fédérale de Lausanne, at the Better Science through Better Data 2018 event. The video recording and scribe are included.
In the past decade we have witnessed the appearance of large databases of calculated material properties. These are most often obtained with the Perdew-Burke-Ernzerhof (PBE) functional of density-functional theory, a well established and reliable technique that is by now the standard in materials science. However, there have been recent theoretical developments that allow for an increased accuracy in the calculations. Here, we present the updated alexandria dataset of calculations for more than 415k solid-state materials obtained with two improved functionals: PBE for solids (that yields consistently better geometries than the PBE) and SCAN (probably the best all-around functional at the moment). Our results provide an accurate overview of the landscape of stable (and nearly stable) materials, and as such can be used for more reliable predictions of novel compounds. They can also be used for training machine learning models, or even for the comparison and benchmark of PBE, PBE for solids, and SCAN.
We introduce OSCAR, a repository of thousands of experimentally derived (OSCAR seed and CSD-extracted) and combinatorially enriched organocatalysts (OSCAR!(NHC) and OSCAR!(DHBD) for N-heterocyclic carbenes and hydrogen bond donors, respectively). The structures and corresponding stereoelectronic properties are publicly available and constitute the starting point to build generative and predictive models for organocatalyst performance.
In this paper, we present a workflow that is designed to work without manual intervention to efficiently predict, by using molecular simulations, the thermodynamic data that are needed to design a carbon capture process. We developed a procedure that does not rely on fitting of the adsorption isotherms. From molecular simulations, we can obtain accurate data for both the pure-component isotherms and the mixture isotherms. This allowed us to make a detailed comparison of the different methods to predict the mixture isotherms. All approaches rely on an accurate description of the pure-component isotherms and a model to predict the mixture isotherms. As we are interested in low CO₂ concentrations, it is essential that these models correctly predict the low-pressure limit, i.e., give a correct description of the Henry regime. Among the equations that describe this limit correctly, the dual-site Langmuir (DSL) model is often used for the pure components and the extended DSL (EDSL) for the mixtures. An alternative approach, which avoids describing the pure-component isotherms with a model, is to numerically integrate the pure-component isotherms in the context of ideal adsorbed solution theory (IAST). In this work we compare these two methods. In addition, we show that the way these data are fitted for DSL can significantly impact the ranking of materials, in particular for capture processes with a low concentration of CO₂ in the feed stream.
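The role of the Henry regime can be illustrated with the standard textbook form of the dual-site Langmuir isotherm, q(p) = q1·b1·p/(1 + b1·p) + q2·b2·p/(1 + b2·p), whose low-pressure slope is the Henry coefficient K_H = q1·b1 + q2·b2. The Python sketch below only evaluates this functional form; the function names, parameter values and units are assumptions chosen for illustration and are not taken from the workflow or the fits described in this record.

```python
import math


def dsl_loading(p: float, q1: float, b1: float, q2: float, b2: float) -> float:
    """Dual-site Langmuir loading at pressure p.

    q1, q2: saturation capacities of the two sites (e.g. mol/kg, assumed units)
    b1, b2: affinity constants of the two sites (e.g. 1/bar, assumed units)
    """
    return q1 * b1 * p / (1.0 + b1 * p) + q2 * b2 * p / (1.0 + b2 * p)


def dsl_henry_coefficient(q1: float, b1: float, q2: float, b2: float) -> float:
    """Slope of the DSL isotherm in the limit p -> 0 (Henry regime)."""
    return q1 * b1 + q2 * b2


if __name__ == "__main__":
    # Illustrative parameters only, not fitted to any material.
    params = dict(q1=2.0, b1=0.5, q2=1.0, b2=0.1)
    k_h = dsl_henry_coefficient(**params)
    for p in (1e-6, 1e-3, 1.0, 1e3):
        q = dsl_loading(p, **params)
        # At low pressure q/p approaches K_H; at high pressure q approaches q1 + q2.
        print(f"p={p:g}  q={q:.6g}  q/p={q / p:.6g}  K_H={k_h:.6g}")
    assert math.isclose(dsl_loading(1e-6, **params) / 1e-6, k_h, rel_tol=1e-5)
```

Because the ratio q/p tends to K_H as the pressure goes to zero, a fit that reproduces the isotherm well at high loading but misestimates q1·b1 + q2·b2 will misjudge uptake at low CO₂ partial pressure, which is one way the fitting procedure can affect a ranking of materials.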
We propose an efficient high-throughput scheme for the discovery of new stable crystalline phases. Our approach is based on the transmutation of known compounds, through the substitution of atoms in the crystal structure with chemically similar ones. The concept of similarity is defined quantitatively using a measure of chemical replaceability, extracted by data mining experimental databases. In this way we build more than 250k possible crystal phases, of which almost 20k are on the convex hull of stability. This dataset contains the optimized structure and the energy of these 250k materials calculated with the PBE approximation, in a format that is convenient for data-mining or for machine-learning applications.
The purpose of this work is to examine environmental effects of materials which are intended for use in in situ processing systems.
Two-dimensional materials are emerging as a promising platform for ultrathin channels in field-effect transistors. To this aim, novel high-mobility semiconductors need to be found or engineered. Although extrinsic mechanisms can in general be minimized by improving fabrication processes, the suppression of intrinsic scattering (driven, for example, by electron–phonon interactions) requires modification of the electronic or vibrational properties of the material. Because intervalley scattering critically affects mobilities, a powerful approach to enhance transport performance relies on engineering the valley structure. We show here the power of this strategy using uniaxial strain to lift degeneracies and suppress scattering into entire valleys, dramatically improving performance. This is shown in detail for arsenene, where a 2% strain stops scattering into four of the six valleys and leads to a 600% increase in mobility. The mechanism is general and can be applied to many other materials, including in particular the isostructural antimonene and blue phosphorene. In this entry we provide the AiiDA database with the calculation of the electron-phonon matrix elements for arsenene, both in the equilibrium and in the strained case, together with scripts to retrieve and plot them.
We report a methodology using machine learning to capture chemical intuition from a set of (partially) failed attempts to synthesize a metal-organic framework. We define chemical intuition as the collection of unwritten guidelines used by synthetic chemists to find the right synthesis conditions. As (partially) failed experiments usually remain unreported, we have reconstructed a typical track of failed experiments in a successful search for the optimal synthesis conditions that yield HKUST-1 with the highest surface area reported to date. We illustrate the importance of quantifying this chemical intuition for the synthesis of novel materials.
Dipole polarizabilities, computed using linear response coupled cluster theory and density functional theory (using d-aug-cc-pVDZ basis set), for 7211 molecules from the QM7b dataset of small molecules and for 52 molecules from a showcase dataset.
This dataset contains three sets of CH4 geometries that are distorted along special directions, to reveal the sensitivity to atomic displacements of structural descriptors used in machine-learning applications. The structures are stored in a format that can be visualized on http://chemiscope.org, and also contain DFT-computed energies, as well as the sensitivity analysis of four different kinds of features.
We present a workflow that traces the path from the bulk structure of a crystalline material to assessing its performance in carbon capture from coal’s postcombustion flue gases. This workflow is applied to a database of 324 covalent−organic frameworks (COFs) reported in the literature, to characterize their CO2 adsorption properties using the following steps: (1) optimization of the crystal structure (atomic positions and unit cell) using density functional theory, (2) fitting atomic point charges based on the electron density, (3) characterizing the pore geometry of the structures before and after optimization, (4) computing carbon dioxide and nitrogen isotherms using grand canonical Monte Carlo simulations with an empirical interaction potential, and finally, (5) assessing the CO2 parasitic energy via process modeling.
The full workflow has been encoded in the Automated Interactive Infrastructure and Database for Computational Science (AiiDA). Both the workflow and the automatically generated provenance graph of our calculations are made available on the Materials Cloud, allowing peers to inspect every input parameter and result along the workflow, download structures and files at intermediate stages, and start their research right from where this work has left off. In particular, our set of CURATED (Clean, Uniform, and Refined with Automatic Tracking from Experimental Database) COFs, having optimized geometry and high-quality DFT-derived point charges, is available for further investigations of gas adsorption properties. We plan to update the database as new COFs are reported.
*** UPDATE December 2019 *** - Database extended to include 417 COFs (from papers published until September 1st 2019) - Migration to AiiDA-v1.0.0 - Using the publicly available plugin aiida-lsmo
*** UPDATE February 2020 *** - Database extended to include 505 COFs (from papers published until February 1st 2020) - Including AiiDA Groups for quick interactive visualization
*** UPDATE June 2020 *** - Database extended to include 574 COFs (from papers published until June 1st 2020)
*** UPDATE September 2020 *** - Include other applications than CCS, considering the same set of 574 COFs
Graph neural networks have enjoyed great success in the prediction of material properties for both molecules and crystals. These networks typically use the atomic positions (usually expanded in a Gaussian basis) and the atomic species as input. Unfortunately, this information is in general not available when predicting new materials, for which the precise geometrical information is unknown. In this work, we circumvent this problem by predicting the thermodynamic stability of crystal structures without using the knowledge of the precise bond distances. We replace this information with embeddings of graph distances, allowing our networks to be used directly in high-throughput studies based on both composition and crystal structure prototype. Using these embeddings, we combine the newest developments in graph neural networks and apply them to the prediction of the distances to the convex hull. To train these networks, we curate a dataset of over 2 million density-functional calculations of crystals with consistent calculation parameters from various sources. The new dataset allows for the creation of a high quality convex hull and a large scale transfer learning approach. We apply the resulting model to the high-throughput search of 15 million tetragonal perovskites of composition ABCD2. As a result, we identify several thousand potentially stable compounds and demonstrate that transfer learning from the newly curated dataset reduces the required training data by 50%.
https://data-wake.opendata.arcgis.com/datasets/eedd069f3ea74904a8329f8ce719cc28_0/license.json
Time-restricted areas designated for the expeditious loading or unloading of materials and goods.