MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Experimental band gap of 6354 inorganic semiconductors.
Data is available as Monty Encoder encoded JSON and as the source CSV file. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in and sourced from the supplementary information of:
Predicting the Band Gaps of Inorganic Solids by Machine Learning. Ya Zhuo, Aria Mansouri Tehrani, and Jakoah Brgoch*. Department of Chemistry, University of Houston, Houston, Texas 77204, United States. J. Phys. Chem. Lett., 2018, 9 (7), pp 1668–1673. DOI: 10.1021/acs.jpclett.8b00124. Publication Date (Web): March 13, 2018. Copyright © 2018 American Chemical Society
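Several entries in this listing recommend the matminer datasets module. The following is a minimal sketch of that access pattern, not a definitive recipe. It assumes matminer is installed and that this entry is registered under the name "expt_gap"; the listing itself does not state the identifier, so treat the name as an assumption. The `fetch_dataset` wrapper and its `loader` parameter are hypothetical helpers, added so the function can be exercised without network access.

```python
from typing import Any, Callable, Optional


def fetch_dataset(name: str, loader: Optional[Callable[[str], Any]] = None) -> Any:
    """Return the dataset registered under `name`.

    By default this defers to matminer's `load_dataset`, which downloads the
    file on first use and returns a pandas DataFrame. `loader` is a
    hypothetical override for offline use or testing.
    """
    if loader is None:
        # Imported lazily so this module can be imported without matminer.
        from matminer.datasets import load_dataset
        loader = load_dataset
    return loader(name)


# Typical use (needs matminer and a network connection on the first call):
#   df = fetch_dataset("expt_gap")  # dataset name is an assumption
#   print(df.shape)
```

The other entries that mention matminer can be loaded the same way once their registered names are known; `matminer.datasets.get_available_datasets()` lists them.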
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
18,928 perovskites generated with ABX combinatorics, with the gllbsc band gap and PBE structure calculated for each, along with absolute band edge positions and heat of formation.
Available as Monty Encoder encoded JSON and CSV files. The recommended access method is through the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset is described in the following:
Ivano E. Castelli, David D. Landis, Kristian S. Thygesen, Søren Dahl, Ib Chorkendorff, Thomas F. Jaramillo and Karsten W. Jacobsen (2012) New cubic perovskites for one- and two-photon water splitting using the computational materials repository. Energy Environ. Sci., 2012, 5, 9034-9043. https://doi.org/10.1039/C2EE22341D
Materials datasets with 273 compositional and structural features extracted using matminer. The datasets are retrieved using the Python package jarvis-tools.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
2574 (2494) materials used for training regressors that predict shear and bulk modulus. The xlsx file provided consists of the original data used to train models described in reference 1 below. The json.gz file includes structural and composition based data from the Materials Project as well as mpid values. Several entries have been marked suspect in this file as they could not be properly cross-referenced on the Materials Project database. An additional group of materials has been marked suspect due to large discrepancies in shear and bulk modulus between the source file and current MP values.
Data is available as Monty Encoder encoded JSON and as an XLSX file. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
Machine Learning Directed Search for Ultraincompressible, Superhard Materials. Aria Mansouri Tehrani, Anton O. Oliynyk, Marcus Parry, Zeshan Rizvi, Samantha Couper, Feng Lin, Lowell Miyagi, Taylor D. Sparks, and Jakoah Brgoch. Journal of the American Chemical Society 2018, 140 (31), 9844-9853. DOI: 10.1021/jacs.8b02717
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Band gap of 1306 double perovskites (a_1-b_1-a_2-b_2-O6) calculated using the Gritsenko, van Leeuwen, van Lenthe and Baerends potential (gllbsc) in GPAW. Collected here for prediction of material band gap and comparison to work done by Pilania et al. The LUMO supplementary dataset holds data on the lowest unoccupied molecular orbital for the perovskite constituent atoms.
Available as Monty Encoder encoded JSON and as XLSX. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
Pilania, G. et al. Machine learning bandgaps of double perovskites. Sci. Rep. 6, 19375; doi: 10.1038/srep19375 (2016).
Dataset sourced from:
https://cmr.fysik.dtu.dk/
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
4,914 perovskite oxides with composition data, lattice constants, and formation and vacancy formation energies. All perovskites are of the form ABO3. Adapted from a dataset presented by Emery and Wolverton.
Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset described in:
Emery, A. A. & Wolverton, C. High-throughput DFT calculations of formation energy, stability and oxygen vacancy formation energy of ABO3 perovskites. Sci. Data 4:170153 doi: 10.1038/sdata.2017.153 (2017).
Data sourced from:
Emery, A. A., & Wolverton, C. Figshare http://dx.doi.org/10.6084/m9.figshare.5334142 (2017)
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset of material properties used to predict dielectric constants.
Available as MontyEncoder encoded compressed JSON and as CSV. The recommended download method is using the matminer.datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset described in the following publication:
Petousis I, Mrdjenovich D, Ballouz E, Liu M, Winston D, Chen W, Graf T, Schladt TD, Persson KA, Prinz FB (2017) High-throughput screening of inorganic compounds for the discovery of novel dielectric and optical materials. Scientific Data 4: 160134. https://doi.org/10.1038/sdata.2016.134
Dataset was adapted by Hacking Materials group from json files originally sourced from Dryad (see references 3-4 below).
Petousis I, Mrdjenovich D, Ballouz E, Liu M, Chen W, Graf T, Schladt TD, Persson KA, Prinz FB (2017) Data from: High-throughput screening of inorganic compounds for dielectric and optical properties to enable the discovery of novel materials. Dryad Digital Repository. https://doi.org/10.5061/dryad.ph81h
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
1153 Heusler alloys with DFT-calculated magnetic and electronic properties. The 1153 alloys include 576 full, 449 half and 128 inverse Heusler alloys. The data are extracted and cleaned (including de-duplicating) from Citrine. See reference 1 below.
Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset required for the analysis at this repository. The link to the original publication pertaining to this dataset is available in the repository. The dataset consists of:
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Phonon (lattice/atomic vibration) and dielectric properties of 1296 compounds, computed with the ABINIT software package in the harmonic approximation based on density functional perturbation theory.
Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note:
* Only one of the three targets should be used in a ML training setting to prevent data leakage.
* For training, formulas and structures can be retrieved via mpids, so the use of composition and structure featurizers is recommended.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset described in:
Petretto, G. et al. High-throughput density functional perturbation theory phonons for inorganic materials. Sci. Data 5:180065 doi: 10.1038/sdata.2018.65 (2018).
Dataset modified from files available on Figshare (see references 3-4):
Petretto, G. et al. High-throughput density functional perturbation theory phonons for inorganic materials. Sci. Data 5:180065 doi: 10.1038/sdata.2018.65 (2018).
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The dataset in this study was processed and visualized using the pandas, matplotlib, and matminer libraries within the Scikit-learn framework. This dataset primarily includes data from the processes of machine learning model training, high-throughput material generation and screening, and first-principles calculation validation.
The dataset consists of two main files: data_set and data of supplementary materials. The data_set file contains the original data (janus.csv), feature-engineered original data (janus_featured.csv), element-substituted data (prd_janus.csv), feature-engineered element-substituted data (prd_janus_featured.csv), and the model screening results (stable_high_magnetization.csv). The data of supplementary materials file includes Figure S1 and Table S1. Figure S1 presents an analysis of feature importance for lattice constants a = b, lattice constant c, formation energy, and magnetic moment categories during model training. Table S1 provides the formation energy and magnetic moment obtained from direct static self-consistent calculations for 13 unoptimized Janus structures.
(1) janus.csv is the original dataset obtained from the Materials Project database, containing information on the chemical composition (elements and stoichiometry), crystal space group, lattice constants, formation energy, and total magnetic moment of 1,179 two-dimensional hexagonal ABC-type Janus materials.
(2) janus_featured.csv is the dataset obtained by applying feature engineering based on elemental composition information to the original dataset.
(3) prd_janus.csv is a dataset of 82,018 ABC-type two-dimensional Janus materials, not yet experimentally synthesized, generated by random substitution of elements A, B, and C from the periodic table based on the two-dimensional hexagonal ABC-type Janus structures in the original dataset.
(4) prd_janus_featured.csv is the feature-engineered dataset of the element-substituted materials.
(5) stable_high_magnetization.csv is the dataset obtained by applying a trained machine learning model to the feature-engineered element-substituted data, containing 4,204 Janus structures with lattice information, thermal stability, and high magnetic moment.
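The element-substitution step behind prd_janus.csv can be illustrated with a small sketch. This is not the authors' procedure: the function name, the requirement that A, B, and C be three distinct elements, and the short element pool in the usage comment are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def random_substitutions(
    elements: Sequence[str], n_candidates: int, seed: int = 0
) -> List[Tuple[str, str, str]]:
    """Draw unique (A, B, C) triples of distinct elements at random.

    Hypothetical helper: assumes the three sites hold different elements
    and that the order of the sites matters.
    """
    pool = sorted(set(elements))  # de-duplicate and fix the order for reproducibility
    max_unique = len(pool) * (len(pool) - 1) * (len(pool) - 2)
    if n_candidates > max_unique:
        raise ValueError("not enough distinct ordered triples in the element pool")
    rng = random.Random(seed)
    seen = set()
    while len(seen) < n_candidates:
        # sample() draws three different elements without replacement
        seen.add(tuple(rng.sample(pool, 3)))
    return sorted(seen)


# Example with an arbitrary, illustrative element pool:
#   random_substitutions(["Cr", "Mn", "Fe", "S", "Se", "Te"], n_candidates=10)
```

Each generated triple would then be placed on the sites of a known hexagonal ABC-type structure and passed through the same feature engineering as the original data.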
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Effective mass and thermoelectric properties of 8924 compounds in The Materials Project database, calculated by the BoltzTraP software package run on GGA-PBE or GGA+U density functional theory calculation results. The properties are reported at a temperature of 300 Kelvin and a carrier concentration of 1e18 1/cm3.
Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note:
* When doing machine learning, to avoid data leakage, one may want to use only the formula and structure data as features. For example, S_n is strongly correlated with PF_n, and usually when one is available the other one is available too.
* It is recommended to retrieve dos and bandstructure objects from Materials Project and then use dos, bandstructure and composition featurizers to generate input features.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset described in:
Ricci, F. et al. An ab initio electronic transport database for inorganic materials. Sci. Data 4:170085 doi: 10.1038/sdata.2017.85 (2017).
Data converted from json files available on Dryad (see references 3-4):
Ricci F, Chen W, Aydemir U, Snyder J, Rignanese G, Jain A, Hautier G (2017) Data from: An ab initio electronic transport database for inorganic materials. Dryad Digital Repository. https://doi.org/10.5061/dryad.gn001
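The data-leakage note above amounts to keeping correlated property columns out of the feature set. A minimal sketch of that split follows, using plain dictionaries instead of a DataFrame. The helper name is hypothetical, the formulas are placeholders, and the numeric values are made up; only the column names S_n and PF_n come from the description.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def split_features_target(
    rows: Sequence[Row], target: str, leaky_columns: Sequence[str]
) -> Tuple[List[Row], List[object]]:
    """Separate one target column and drop columns that would leak it."""
    excluded = set(leaky_columns) | {target}
    features = [{k: v for k, v in row.items() if k not in excluded} for row in rows]
    targets = [row[target] for row in rows]
    return features, targets


# Placeholder rows with made-up numbers, for illustration only.
rows = [
    {"formula": "compound_1", "S_n": -1.0, "PF_n": 10.0},
    {"formula": "compound_2", "S_n": -2.0, "PF_n": 20.0},
]
features, targets = split_features_target(rows, target="S_n", leaky_columns=["PF_n"])
# `features` now holds only the formula; PF_n cannot leak into a model for S_n.
```

The same pattern applies to any entry here that ships several related target columns: choose one target and list the rest as `leaky_columns`.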
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Metallic glass formation data for binary alloys, collected from various experimental techniques such as melt-spinning or mechanical alloying. This dataset covers all compositions with an interval of 5 at.% in 59 binary systems, containing a total of 5959 alloys in the dataset. The target property of this dataset is the glass forming ability (GFA), i.e. whether the composition can form monolithic glass or not, which is either 1 for glass forming or 0 for non-full glass forming.
The V2 versions of this dataset have been cleaned to remove duplicate data points. Any entries with identical formula and both negative and positive GFA classes were combined to a single entry with a positive GFA class.
Data is available as Monty Encoder encoded JSON and as the source CSV file. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
Machine Learning Approach for Prediction and Understanding of Glass-Forming Ability. Y. T. Sun†§, H. Y. Bai†§, M. Z. Li*‡, and W. H. Wang*†§.
† Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
‡ Department of Physics, Beijing Key Laboratory of Optoelectronic Functional Materials & Micro-nano Devices, Renmin University of China, Beijing 100872, People’s Republic of China
§ University of Chinese Academy of Science, Beijing 100049, People’s Republic of China
J. Phys. Chem. Lett., 2017, 8 (14), pp 3434–3439. DOI: 10.1021/acs.jpclett.7b01046. Publication Date (Web): July 11, 2017
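The V2 cleaning rule described above (identical formulas merged, with a positive GFA class winning over a negative one) can be sketched as follows. This is an illustration of the stated rule, not the code used to build the dataset; the function name is hypothetical and the formulas in the example are placeholders.

```python
from typing import Dict, Iterable, List, Tuple


def merge_duplicate_formulas(entries: Iterable[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Collapse repeated formulas, keeping gfa = 1 if any duplicate has gfa = 1."""
    merged: Dict[str, int] = {}
    for formula, gfa in entries:
        # max() keeps the positive class when both 0 and 1 appear for a formula.
        merged[formula] = max(merged.get(formula, 0), gfa)
    # Dicts preserve insertion order, so first-seen order is retained.
    return list(merged.items())


# Placeholder entries, for illustration only.
raw = [("A50B50", 0), ("A50B50", 1), ("C60D40", 1), ("C60D40", 1), ("E90F10", 0)]
cleaned = merge_duplicate_formulas(raw)
# cleaned == [("A50B50", 1), ("C60D40", 1), ("E90F10", 0)]
```

The sketch compares formula strings exactly; differently written but equivalent formulas would need to be normalized first.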
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Experimental formation enthalpies for inorganic compounds, collected from years of calorimetric experiments. There are 1,276 entries in this dataset, mostly binary compounds. Matching mpids or oqmdids as well as the DFT-computed formation energies are also added (if any).
Data is available as Monty Encoder encoded JSON and as a CSV file. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
Kim, G. et al. Experimental formation enthalpies for intermetallic phases and other inorganic compounds. Sci. Data 4:170162 doi: 10.1038/sdata.2017.162 (2017).
Dataset sourced and modified from:
Kim, George; Meschel, Susan; Nash, Philip; Chen, Wei (2017): Experimental formation enthalpies for intermetallic phases and other inorganic compounds. figshare. Collection.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Various properties of 24,759 bulk and 2D materials computed with the OptB88vdW and TBmBJ functionals taken from the JARVIS DFT database. This dataset was modified from the JARVIS ML training set developed by NIST (1-2). The custom descriptors have been removed, the column naming scheme revised, and a composition column created. This leaves the training set as a dataset of composition and structure descriptors mapped to a diverse set of materials properties.
Available as Monty Encoder encoded JSON and as the source Monty Encoder encoded JSON file. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
Machine learning with force-field-inspired descriptors for materials: Fast screening and mapping energy landscape. Kamal Choudhary, Brian DeCost, and Francesca Tavazza. Phys. Rev. Materials 2, 083801
Original data file sourced from:
Choudhary, Kamal (2018): JARVIS-ML-CFID-descriptors and material properties. figshare. Dataset.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Experimentally measured thermal conductivity of 872 compounds, retrieved from the Citrine database and drawn from various references. The reported values are measured at various temperatures, of which 295 are at room temperature.
Available as Monty Encoder encoded JSON and as CSV. Recommended access method for these particular files is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite Citrine Informatics and their citrination client rather than or in addition to this page.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Metallic glass formation dataset for ternary alloys, collected from the high-throughput sputtering experiments measuring whether it is possible to form a glass using sputtering. The hipt experimental data are of the Co-Fe-Zr, Co-Ti-Zr, Co-V-Zr and Fe-Ti-Nb ternary systems.
Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in:
Accelerated discovery of metallic glasses through iteration of machine learning and high-throughput experiments. Fang Ren, Logan Ward, Travis Williams, Kevin J. Laws, Christopher Wolverton, Jason Hattrick-Simpers, Apurva Mehta. Science Advances, 13 Apr 2018: eaaq1566
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Piezoelectric tensor data.
Available as Monty Encoder encoded JSON and CSV files. The recommended access method is through the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset is described in the following:
De Jong M, Chen W, Geerlings H, Asta M, Persson K (2015) A database to enable discovery and design of piezoelectric materials. Scientific Data 2: 150053. https://doi.org/10.1038/sdata.2015.53
Data adapted from JSON files available here:
De Jong M, Chen W, Geerlings H, Asta M, Persson K (2015) Data from: A database to enable discovery and design of piezoelectric materials. Dryad Digital Repository. https://doi.org/10.5061/dryad.n63m4
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Prepared using: https://github.com/usnistgov/jarvis_leaderboard/blob/main/jarvis_leaderboard/benchmarks/matminer_lgbm/run.py
JARVIS-DFT: https://jarvis.nist.gov/jarvisdft/
MatMiner: https://github.com/hackingmaterials/matminer
JARVIS-Leaderboard: https://pages.nist.gov/jarvis_leaderboard/
Code reference:
https://github.com/mathsphy/paper-ml-robustness-material-property
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
312 steels with experimental yield strength and ultimate tensile strength, extracted and cleaned (including de-duplicating) from Citrine. See reference 1.
Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.