MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
18,928 perovskites generated with ABX combinatorics, with the gllbsc band gap and PBE structure calculated for each, along with absolute band edge positions and heat of formation. Available as Monty Encoder encoded JSON and CSV files. The recommended access method is through the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset is described in the following:
Ivano E. Castelli, David D. Landis, Kristian S. Thygesen, Søren Dahl, Ib Chorkendorff, Thomas F. Jaramillo and Karsten W. Jacobsen (2012) New cubic perovskites for one- and two-photon water splitting using the computational materials repository. Energy Environ. Sci., 2012, 5, 9034-9043 https://doi.org/10.1039/C2EE22341D
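As a minimal sketch of the recommended access route, the function below wraps matminer's `load_dataset`. The dataset key `"castelli_perovskites"` is an assumption about how this dataset is registered in matminer and is not stated on this page; `get_available_datasets` in the same module lists the registered names. The import is done lazily so that defining the function does not itself require matminer to be installed.

```python
def load_perovskites(name="castelli_perovskites"):
    """Return the named dataset as a pandas DataFrame.

    The default name is an assumed registry key, not taken from this page.
    The first call downloads the file, so it needs network access.
    """
    # Lazy import: matminer is only required when the function is called.
    from matminer.datasets import load_dataset

    return load_dataset(name)
```

Calling `load_perovskites()` and inspecting `df.columns` is a reasonable first step before relying on any particular column name.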
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Experimental band gaps of 6354 inorganic semiconductors. Data is available as Monty Encoder encoded JSON and as the source CSV file. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in and sourced from the supplementary information of:
Predicting the Band Gaps of Inorganic Solids by Machine Learning. Ya Zhuo, Aria Mansouri Tehrani, and Jakoah Brgoch*. Department of Chemistry, University of Houston, Houston, Texas 77204, United States. J. Phys. Chem. Lett., 2018, 9 (7), pp 1668–1673. DOI: 10.1021/acs.jpclett.8b00124. Publication Date (Web): March 13, 2018. Copyright © 2018 American Chemical Society
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
2574 (2494) materials used for training regressors that predict shear and bulk modulus. The xlsx file provided consists of the original data used to train models described in reference 1 below. The json.gz file includes structural and composition based data from the Materials Project as well as mpid values. Several entries have been marked suspect in this file as they could not be properly cross referenced on the Materials Project database. An additional group of materials has been marked suspect due to large discrepancies between the shear and bulk modulus in the source file and the current MP values. Data is available as Monty Encoder encoded JSON and as an XLSX file. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
Machine Learning Directed Search for Ultraincompressible, Superhard Materials. Aria Mansouri Tehrani, Anton O. Oliynyk, Marcus Parry, Zeshan Rizvi, Samantha Couper, Feng Lin, Lowell Miyagi, Taylor D. Sparks, and Jakoah Brgoch. Journal of the American Chemical Society 2018, 140 (31), 9844-9853. DOI: 10.1021/jacs.8b02717
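The two "suspect" criteria described above (no cross-reference on the Materials Project, or a large modulus discrepancy) could be sketched as follows. This is only an illustration: the field names, the placeholder identifiers, the numeric values, and the 25% relative threshold are all assumptions, since the page does not say how "large" was defined.

```python
def relative_difference(source_value, reference_value):
    """Relative deviation of the source value from the reference value."""
    return abs(source_value - reference_value) / abs(reference_value)


def flag_suspect(records, threshold=0.25):
    """Return copies of the records with a boolean 'suspect' field added.

    A record is suspect if it has no Materials Project id, or if either
    modulus differs from the reference value by more than `threshold`
    (a hypothetical cutoff chosen for this sketch).
    """
    flagged = []
    for rec in records:
        suspect = rec.get("mp_id") is None
        if not suspect:
            for prop in ("bulk_modulus", "shear_modulus"):
                if relative_difference(rec[prop], rec["mp_" + prop]) > threshold:
                    suspect = True
        flagged.append({**rec, "suspect": suspect})
    return flagged


# Made-up records for illustration only; not taken from the dataset.
records = [
    {"formula": "X1", "mp_id": "placeholder-1", "bulk_modulus": 100.0,
     "shear_modulus": 50.0, "mp_bulk_modulus": 102.0, "mp_shear_modulus": 49.0},
    {"formula": "X2", "mp_id": "placeholder-2", "bulk_modulus": 200.0,
     "shear_modulus": 80.0, "mp_bulk_modulus": 120.0, "mp_shear_modulus": 79.0},
    {"formula": "X3", "mp_id": None, "bulk_modulus": 150.0, "shear_modulus": 60.0},
]
result = flag_suspect(records)
```

In practice one would probably drop or down-weight the flagged rows before training rather than delete them from the file.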
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Band gap of 1306 double perovskites (a_1-b_1-a_2-b_2-O6) calculated using the Gritsenko, van Leeuwen, van Lenthe and Baerends potential (gllbsc) in GPAW. Collected here for prediction of material band gap and comparison to work done by Pilania et al. The LUMO supplementary dataset holds data on the lowest unoccupied molecular orbital for the perovskite constituent atoms. Available as Monty Encoder encoded JSON and as XLSX. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
Pilania, G. et al. Machine learning bandgaps of double perovskites. Sci. Rep. 6, 19375; doi: 10.1038/srep19375 (2016).
Dataset sourced from:
https://cmr.fysik.dtu.dk/
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
4,914 perovskite oxides containing composition data, lattice constants, and formation + vacancy formation energies. All perovskites are of the form ABO3. Adapted from a dataset presented by Emery and Wolverton. Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset described in:
Emery, A. A. & Wolverton, C. High-throughput DFT calculations of formation energy, stability and oxygen vacancy formation energy of ABO3 perovskites. Sci. Data 4:170153 doi: 10.1038/sdata.2017.153 (2017).
Data sourced from:
Emery, A. A., & Wolverton, C. Figshare http://dx.doi.org/10.6084/m9.figshare.5334142 (2017)
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Phonon (lattice/atom vibration) and dielectric properties of 1296 compounds, computed with the ABINIT software package in the harmonic approximation based on density functional perturbation theory. Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note:
* Only one of the three targets should be used in an ML training setting to prevent data leakage.
* For training, formulas and structures can be retrieved via the mpids, so the use of composition and structure featurizers is recommended.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset described in:
Petretto, G. et al. High-throughput density functional perturbation theory phonons for inorganic materials. Sci. Data 5:180065 doi: 10.1038/sdata.2018.65 (2018).
Dataset modified from files available on Figshare (see references 3-4):
Petretto, G. et al. High-throughput density functional perturbation theory phonons for inorganic materials. Sci. Data 5:180065 doi: 10.1038/sdata.2018.65 (2018).
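The leakage note above can be illustrated with a small sketch that keeps one target as the label and removes all target columns from the features. The column names `target_a`, `target_b`, `target_c` and the row values are hypothetical placeholders, not the dataset's actual columns.

```python
def split_features_and_target(rows, target, all_targets):
    """Split rows into feature dicts and labels for a single target.

    Every column listed in `all_targets` is removed from the features, so
    the other targets cannot leak into the model inputs.
    """
    if target not in all_targets:
        raise ValueError(f"{target!r} is not one of {sorted(all_targets)}")
    features, labels = [], []
    for row in rows:
        labels.append(row[target])
        features.append({k: v for k, v in row.items() if k not in all_targets})
    return features, labels


# Hypothetical rows; the names and values are placeholders for illustration.
ALL_TARGETS = {"target_a", "target_b", "target_c"}
rows = [
    {"mpid": "placeholder-1", "target_a": 1.0, "target_b": 2.0, "target_c": 3.0},
    {"mpid": "placeholder-2", "target_a": 4.0, "target_b": 5.0, "target_c": 6.0},
]
X, y = split_features_and_target(rows, "target_b", ALL_TARGETS)
```

The remaining `mpid` field is then what one would use to look up formulas and structures for featurization.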
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Metallic glass formation data for binary alloys, collected from various experimental techniques such as melt-spinning or mechanical alloying. This dataset covers all compositions with an interval of 5 at.% in 59 binary systems, containing a total of 5959 alloys. The target property of this dataset is the glass forming ability (GFA), i.e. whether the composition can form monolithic glass or not, which is either 1 for glass forming or 0 for non-full glass forming. The V2 versions of this dataset have been cleaned to remove duplicate data points. Any entries with identical formula and both negative and positive GFA classes were combined into a single entry with a positive GFA class. Data is available as Monty Encoder encoded JSON and as the source CSV file. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
Machine Learning Approach for Prediction and Understanding of Glass-Forming Ability. Y. T. Sun†§, H. Y. Bai†§, M. Z. Li*‡, and W. H. Wang*†§. † Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China. ‡ Department of Physics, Beijing Key Laboratory of Optoelectronic Functional Materials & Micro-nano Devices, Renmin University of China, Beijing 100872, People’s Republic of China. § University of Chinese Academy of Science, Beijing 100049, People’s Republic of China. J. Phys. Chem. Lett., 2017, 8 (14), pp 3434–3439. DOI: 10.1021/acs.jpclett.7b01046. Publication Date (Web): July 11, 2017
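The V2 cleaning rule described above (duplicates removed, conflicting labels resolved in favor of the positive class) could look roughly like this. It is a sketch, not the script used to build the dataset; it assumes formulas are already normalized to identical strings, and the formulas and labels below are made-up placeholders.

```python
def merge_gfa_entries(entries):
    """Collapse (formula, gfa) pairs to one label per formula.

    Exact duplicates collapse to a single entry. When a formula appears
    with both 0 and 1, the positive class (1) wins, because max() keeps
    the larger label.
    """
    merged = {}
    for formula, gfa in entries:
        merged[formula] = max(merged.get(formula, 0), gfa)
    return merged


# Placeholder formulas and labels for illustration only.
entries = [
    ("A50B50", 0),
    ("A50B50", 1),  # conflicting labels: resolved to 1
    ("C60D40", 1),
    ("C60D40", 1),  # exact duplicate: kept once
    ("E80F20", 0),
]
cleaned = merge_gfa_entries(entries)
```

Resolving conflicts toward the positive class matches the reading that a composition counts as glass forming if any experiment produced a glass.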
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset required for the analysis at this repository. The link to the original publication pertaining to this dataset is available in the repository. The dataset consists of:
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Metallic glass formation dataset for ternary alloys, collected from high-throughput sputtering experiments measuring whether it is possible to form a glass using sputtering. The high-throughput (hipt) experimental data cover the Co-Fe-Zr, Co-Ti-Zr, Co-V-Zr and Fe-Ti-Nb ternary systems. Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
Accelerated discovery of metallic glasses through iteration of machine learning and high-throughput experiments. Fang Ren, Logan Ward, Travis Williams, Kevin J. Laws, Christopher Wolverton, Jason Hattrick-Simpers, Apurva Mehta. Science Advances 13 Apr 2018: eaaq1566
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The dataset in this study was processed and visualized using the pandas, matplotlib, and matminer libraries within the Scikit-learn framework. This dataset primarily includes data from the processes of machine learning model training, high-throughput material generation and screening, and first-principles calculation validation. The dataset consists of two main files: data_set and data of supplementary materials. The data_set file contains the original data (janus.csv), feature-engineered original data (janus_featured.csv), element-substituted data (prd_janus.csv), feature-engineered element-substituted data (prd_janus_featured.csv), and the model screening results (stable_high_magnetization.csv). The data of supplementary materials file includes Figure S1 and Table S1. Figure S1 presents an analysis of feature importance for lattice constants a = b, lattice constant c, formation energy, and magnetic moment categories during model training. Table S1 provides the formation energy and magnetic moment obtained from direct static self-consistent calculations for 13 unoptimized Janus structures.
(1) janus.csv is the original dataset obtained from the Materials Project database, containing information on the chemical composition (elements and stoichiometry), crystal space group, lattice constants, formation energy, and total magnetic moment of 1,179 two-dimensional hexagonal ABC-type Janus materials.
(2) janus_featured.csv is the dataset obtained by applying feature engineering based on elemental composition information to the original dataset.
(3) prd_janus.csv is a dataset of 82,018 ABC-type two-dimensional Janus materials, not yet experimentally synthesized, generated by random substitution of elements A, B, and C from the periodic table based on the two-dimensional hexagonal ABC-type Janus structures in the original dataset.
(4) prd_janus_featured.csv is the feature-engineered dataset of the element-substituted materials.
(5) stable_high_magnetization.csv is the dataset obtained by applying a trained machine learning model to the feature-engineered element-substituted data, containing 4,204 Janus structures with lattice information, thermal stability, and high magnetic moment.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The SuperCon database, initiated in 1987 following the discovery of high-temperature superconductors, compiled data on various superconducting materials, including their material composition, structure, properties, and processes. The database initially focused on high-Tc oxide superconductors, metal-alloy superconductors, and organic superconductors, with data extracted from academic papers and other sources. Following the decommissioning of its web-based interface in 2021, it is now published as a datasheet publicly available at MDR, a materials data repository managed by the National Institute for Materials Science, Japan.
This dataset contains the chemical formula, critical temperature, and the associated citation for superconductor materials. We provide the raw.tsv file, which comes directly from the source and has 26k entries, plus a preprocessed and cleaned featurized.csv. The latter was generated from the chemical formulas using magpie and many other featurizers in the matminer package; it contains about 16k entries and 140+ features and is ready for analysis and machine learning.
The features in featurized.csv include:
For more details on these features, refer to this paper.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Various properties of 24,759 bulk and 2D materials computed with the OptB88vdW and TBmBJ functionals, taken from the JARVIS DFT database. This dataset was modified from the JARVIS ML training set developed by NIST (1-2). The custom descriptors have been removed, the column naming scheme revised, and a composition column created. This leaves the training set as a dataset of composition and structure descriptors mapped to a diverse set of materials properties. Available as Monty Encoder encoded JSON and as the source Monty Encoder encoded JSON file. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
Machine learning with force-field-inspired descriptors for materials: Fast screening and mapping energy landscape. Kamal Choudhary, Brian DeCost, and Francesca Tavazza. Phys. Rev. Materials 2, 083801
Original data file sourced from:
Choudhary, Kamal (2018): JARVIS-ML-CFID-descriptors and material properties. figshare. Dataset.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Experimental formation enthalpies for inorganic compounds, collected from years of calorimetric experiments. There are 1,276 entries in this dataset, mostly binary compounds. Matching mpids or oqmdids, as well as the DFT-computed formation energies, are also included where available. Data is available as Monty Encoder encoded JSON and as a CSV file. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
Kim, G. et al. Experimental formation enthalpies for intermetallic phases and other inorganic compounds. Sci. Data 4:170162 doi: 10.1038/sdata.2017.162 (2017).
Dataset sourced and modified from:
Kim, George; Meschel, Susan; Nash, Philip; Chen, Wei (2017): Experimental formation enthalpies for intermetallic phases and other inorganic compounds. figshare. Collection.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Effective mass and thermoelectric properties of 8924 compounds in the Materials Project database, calculated by the BoltzTraP software package run on GGA-PBE or GGA+U density functional theory calculation results. The properties are reported at a temperature of 300 Kelvin and a carrier concentration of 1e18 1/cm3. Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note:
* When doing machine learning, to avoid data leakage, one may want to use only the formula and structure data as features. For example, S_n is strongly correlated with PF_n, and usually when one is available the other one is available too.
* It is recommended to retrieve dos and bandstructure objects from the Materials Project and then use dos, bandstructure and composition featurizers to generate input features.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset described in:
Ricci, F. et al. An ab initio electronic transport database for inorganic materials. Sci. Data 4:170085 doi: 10.1038/sdata.2017.85 (2017).
Data converted from json files available on Dryad (see references 3-4):
Ricci F, Chen W, Aydemir U, Snyder J, Rignanese G, Jain A, Hautier G (2017) Data from: An ab initio electronic transport database for inorganic materials. Dryad Digital Repository. https://doi.org/10.5061/dryad.gn001
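One way to act on the leakage warning above is to screen candidate feature columns for strong correlation with the chosen target before training. The sketch below uses a hand-written Pearson coefficient on synthetic numbers. It assumes the standard definition of the thermoelectric power factor, PF = S²σ, with a constant σ, to generate a PF_n-like column from an S_n-like column; the `density` column, the values, and the 0.9 cutoff are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def leaky_columns(columns, target_name, cutoff=0.9):
    """Names of columns whose |correlation| with the target exceeds cutoff."""
    target = columns[target_name]
    return [
        name
        for name, values in columns.items()
        if name != target_name and abs(pearson(values, target)) > cutoff
    ]


# Synthetic values for illustration only; not taken from the dataset.
sigma = 1.0e-3  # assumed constant conductivity, arbitrary units
s_n = [50.0, 100.0, 150.0, 200.0, 250.0]
columns = {
    "S_n": s_n,
    "PF_n": [s * s * sigma for s in s_n],  # PF = S^2 * sigma
    "density": [3.0, 1.0, 4.0, 1.0, 5.0],  # unrelated placeholder feature
}
flagged = leaky_columns(columns, "PF_n")
```

A correlation screen only catches near-linear relationships, so restricting the inputs to formula- and structure-derived features, as the note recommends, remains the safer default.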
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
1153 Heusler alloys with DFT-calculated magnetic and electronic properties. The 1153 alloys include 576 full, 449 half and 128 inverse Heusler alloys. The data are extracted and cleaned (including de-duplicating) from Citrine. See reference 1 below. Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Piezoelectric tensor data. Available as Monty Encoder encoded JSON and CSV files. The recommended access method is through the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset is described in the following:
De Jong M, Chen W, Geerlings H, Asta M, Persson K (2015) A database to enable discovery and design of piezoelectric materials. Scientific Data 2: 150053. https://doi.org/10.1038/sdata.2015.53
Data adapted from JSON files available here:
De Jong M, Chen W, Geerlings H, Asta M, Persson K (2015) Data from: A database to enable discovery and design of piezoelectric materials. Dryad Digital Repository. https://doi.org/10.5061/dryad.n63m4
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Experimentally measured thermal conductivity of 872 compounds, retrieved from the Citrine database and drawn from various references. The reported values are measured at various temperatures, of which 295 are at room temperature. Available as Monty Encoder encoded JSON and as CSV. Recommended access method for these particular files is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite Citrine Informatics and their citrination client rather than or in addition to this page.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
312 steels with experimental yield strength and ultimate tensile strength, extracted and cleaned (including de-duplicating) from Citrine. See reference 1. Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
A complete copy of the Materials Project database as of 10/18/2018. Mp_all files contain structure data for each material while mp_nostruct does not. Available as Monty Encoder encoded JSON and as CSV. Recommended access method for these particular files is with the matminer Python package using the datasets module. Access to the current Materials Project is recommended through their API (good), pymatgen (better), or matminer (best).
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
A. Jain*, S.P. Ong*, G. Hautier, W. Chen, W.D. Richards, S. Dacek, S. Cholia, D. Gunter, D. Skinner, G. Ceder, K.A. Persson (*=equal contributions). The Materials Project: A materials genome approach to accelerating materials innovation. APL Materials, 2013, 1(1), 011002.
Dataset sourced from:
https://materialsproject.org/
Citations for specific material properties available here:
https://materialsproject.org/citing
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Metallic glass formation dataset for ternary alloys, collected from "Nonequilibrium Phase Diagrams of Ternary Amorphous Alloys," a volume of the Landolt-Börnstein collection. This dataset contains experimental measurements of whether it is possible to form a glass using a variety of processing techniques at thousands of compositions from hundreds of ternary systems. The processing techniques are designated in the "processing" column. There are originally 7191 experiments in this dataset; this is reduced to 6203 after deduplication, and further reduced to 6118 if multiple data for one composition are combined. There are originally 6780 melt-spinning experiments in this dataset; this is reduced to 5800 after deduplication, and further reduced to 5736 if multiple experimental data for one composition are combined. Available as Monty Encoder encoded JSON and as CSV. Recommended access method for these particular files is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset sourced from:
Y. Kawazoe, T. Masumoto, A.-P. Tsai, J.-Z. Yu, T. Aihara Jr. (1997). Nonequilibrium Phase Diagrams of Ternary Amorphous Alloys · 1 Introduction. In: Y. Kawazoe, J.-Z. Yu, A.-P. Tsai, T. Masumoto (ed.), SpringerMaterials, Landolt-Börnstein - Group III Condensed Matter 37A (Nonequilibrium Phase Diagrams of Ternary Amorphous Alloys). https://materials.springer.com/lb/docs/sm_lbs_978-3-540-47679-5_2 10.1007/10510374_2 (Springer-Verlag Berlin Heidelberg © 1997). Accessed: 20-10-2018
Dataset provided for comparison to work in the following paper:
A general-purpose machine learning framework for predicting properties of inorganic materials. Logan Ward, Ankit Agrawal, Alok Choudhary & Christopher Wolverton. npj Computational Materials volume 2, Article number: 16028 (2016)