13 datasets found

FireProtDB + PDB Structural Protein Stability Dataset
data.niaid.nih.gov
Updated Jan 30, 2024
Cite
Brocidiacono, Michael (2024). FireProtDB + PDB Structural Protein Stability Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8169288
Dataset updated
Jan 30, 2024
Dataset provided by
Brocidiacono, Michael
Dieckhaus, Henry
Randolph, Nicholas
Kuhlman, Brian
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset compiled and curated for use in the ThermoMPNN paper: https://doi.org/10.1073/pnas.2314853121

Dataset for training models for prediction of thermodynamic stability changes (ddG) of protein point mutations given a wildtype protein structure (PDB) file. Data was assembled by matching sequence-based ddG measurements in FireProtDB to structures from the RCSB Protein Data Bank (PDB). For details, see the Methods section of our manuscript.

Citing this work: If you choose to use this dataset for your own research, please cite this repository and the ThermoMPNN paper: https://doi.org/10.1073/pnas.2314853121.

Contents:

pdbs/ directory contains all PDB files

csvs/ directory contains all CSVs with mutation data

csvs/4_fireprotDB_bestpH.csv is the main (full) dataset file with 3,438 mutations across 100 proteins.

csvs/fireprot_splits.pkl contains the dataset splits (train/val/test) used in our study

csvs/splits/ contains csvs for each of the splits (train/val/test/homologue-free) indexed from the full dataset csv.

Important CSV columns:

pdb_id_corrected: corresponds to the PDB in the pdbs/ directory (after curation and disambiguation)

ddG: ddG value for mutation (mutant - WT)

wild_type: wild-type amino acid (1-letter code)

mutation: mutant amino acid (1-letter code)

pdb_position: 0-based index of the mutated residue in the PDB file (may be different from position in the original FireProtDB sequence entry)
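
As a rough illustration of how the columns listed above fit together, the following Python sketch builds conventional mutation labels (wild-type letter, position, mutant letter) from rows with those column names. The sample rows, including the PDB ID 1ABC and the ddG values, are invented for illustration and are not taken from the dataset; the function name is hypothetical. The label uses a 1-based index into the residue sequence, which is an assumption about presentation and need not match the residue numbering written in the PDB file.

```python
import csv
import io

# Made-up rows for illustration only; just the column names come from
# the dataset description.
SAMPLE_CSV = """pdb_id_corrected,wild_type,mutation,pdb_position,ddG
1ABC,A,G,41,1.2
1ABC,L,P,7,-0.4
"""


def mutation_labels(csv_text):
    """Return (pdb_id, label, ddG) tuples with 1-based positions in the label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    results = []
    for row in reader:
        # pdb_position is documented as a 0-based index, so add 1 for display.
        position = int(row["pdb_position"]) + 1
        label = f'{row["wild_type"]}{position}{row["mutation"]}'
        results.append((row["pdb_id_corrected"], label, float(row["ddG"])))
    return results


labels = mutation_labels(SAMPLE_CSV)
```

To apply the same function to the real file, the text of csvs/4_fireprotDB_bestpH.csv could be read and passed in place of SAMPLE_CSV, provided the column names match those listed above.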
Protein Data Bank (PDB) dataset for peptide design - Dataset - LDM
service.tib.eu
Updated Dec 16, 2024
Cite
(2024). Protein Data Bank (PDB) dataset for peptide design - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/protein-data-bank--pdb--dataset-for-peptide-design
Dataset updated
Dec 16, 2024
Description
A dataset of protein-peptide complexes for training a generative model for full-atom peptide design with Geometric Latent Diffusion.
pdb-rna_secondary_structure
huggingface.co
Updated Apr 18, 2025
Cite
MultiMolecule (2025). pdb-rna_secondary_structure [Dataset]. https://huggingface.co/datasets/multimolecule/pdb-rna_secondary_structure
Dataset updated
Apr 18, 2025
Dataset authored and provided by
MultiMolecule
License
AGPL-3.0: https://choosealicense.com/licenses/agpl-3.0/
Description
pdb-rna_secondary_structure

[!IMPORTANT] The pdb-rna_secondary_structure dataset is in beta test. This dataset card may not accurately reflect the data content. The data content and this dataset card may be subject to change. Please contact the MultiMolecule team on GitHub issues should you have any feedback.

[!CAUTION] This dataset is converted from the dataset released by the authors of SPOT-RNA. The MultiMolecule team is aware of a potential issue in data quality. We are working on… See the full description on the dataset page: https://huggingface.co/datasets/multimolecule/pdb-rna_secondary_structure.
Network Visualization Map Data
springernature.figshare.com
txt
Updated Jun 2, 2023
Cite
Luigi Di Costanzo; Christopher Markosian (2023). Network Visualization Map Data [Dataset]. http://doi.org/10.6084/m9.figshare.6121436.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.6121436.v1
Dataset updated
Jun 2, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Luigi Di Costanzo; Christopher Markosian
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Data used to generate a co-occurrence network map of publication data keywords using the VOSviewer server (Version 1.6.5). Approximately 227,000 keywords were extracted from citation titles and abstracts from the Web of Science. A network was computed for a total of 2,460 terms selected by the full-counting method and relevance scoring as implemented within VOSviewer. For analysis, we reviewed co-occurrence network maps for thresholds between 5 and 40. The map shown uses the default cutoff of 30 term co-occurrences.
pdb-data
neuinfo.org
dknet.org
Updated Oct 16, 2019
Cite
(2019). pdb-data [Dataset]. http://identifiers.org/RRID:SCR_000386
Unique identifier
https://identifiers.org/RRID:SCR_000386
Dataset updated
Oct 16, 2019
Description
Search for carbohydrate-containing PDB entries by criteria such as species or compound/classification terms. You can choose predefined, frequent terms from the pull-down menus or enter your own queries manually.
India Pdb Export | List of Pdb Exporters & Suppliers
seair.co.in
Cite
Seair Exim, India Pdb Export | List of Pdb Exporters & Suppliers [Dataset]. https://www.seair.co.in
Available download formats: .bin, .xml, .csv, .xls
Dataset provided by
Seair Exim Solutions
Authors
Seair Exim
Area covered
India
Description
Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.
Data from: Redocking the PDB
data.niaid.nih.gov
zenodo.org
Updated Dec 6, 2023
Cite
Flachsenberg, Florian (2023). Redocking the PDB [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7579501
Dataset updated
Dec 6, 2023
Dataset provided by
Ehrt, Christiane
Rarey, Matthias
Flachsenberg, Florian
Gutermuth, Torben
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains supplementary data to the journal article 'Redocking the PDB' by Flachsenberg et al. (https://doi.org/10.1021/acs.jcim.3c01573)[1]. In this paper, we described two datasets: The PDBScan22 dataset with a large set of 322,051 macromolecule–ligand binding sites generally suitable for redocking and the PDBScan22-HQ dataset with 21,355 binding sites passing different structure quality filters. These datasets were further characterized by calculating properties of the ligand (e.g., molecular weight), properties of the binding site (e.g., volume), and structure quality descriptors (e.g., crystal structure resolution). Additionally, we performed redocking experiments with our novel JAMDA structure preparation and docking workflow[1] and with AutoDock Vina[2,3]. Details for all these experiments and the dataset composition can be found in the journal article[1].

Here, we provide all the datasets, i.e., the PDBScan22 and PDBScan22-HQ datasets as well as the docking results and the additionally calculated properties (for the ligand, the binding sites, and structure quality descriptors). Furthermore, we give a detailed description of their content (i.e., the data types and a description of the column values). All datasets consist of CSV files with the actual data and associated metadata JSON files describing their content. The CSV/JSON files are compliant with the CSV on the web standard (https://csvw.org/).

General hints

- All docking experiment results consist of two CSV files, one with general information about the docking run (e.g., was it successful?) and one with individual pose results (i.e., score and RMSD to the crystal structure).
- All files (except for the docking pose tables) can be indexed uniquely by the column tuple '(pdb, name)' containing the PDB code of the complex (e.g., 1gm8) and the name of the ligand (in the format '_', e.g., 'SOX_B_1559').
- All files (except for the docking pose tables) have exactly the same number of rows as the dataset they were calculated on (e.g., PDBScan22 or PDBScan22-HQ). However, some CSV files may have missing values (see also the JSON metadata files) in some or even all columns (except for 'pdb' and 'name').
- The docking pose tables also contain the 'pdb' and 'name' columns. However, these alone are not unique but only together with the 'rank' column (i.e., there might be multiple poses for each docking run or none).

Example usage

Using the pandas library (https://pandas.pydata.org/) in Python, we can calculate the number of protein-ligand complexes in the PDBScan22-HQ dataset with a top-ranked pose RMSD to the crystal structure ≤ 2.0 Å in the JAMDA redocking experiment and a molecular weight between 100 Da and 200 Da:

    import pandas as pd

    df = pd.read_csv('PDBScan22-HQ.csv')
    df_poses = pd.read_csv('PDBScan22-HQ_JAMDA_NL_NR_poses.csv')
    df_properties = pd.read_csv('PDBScan22_ligand_properties.csv')

    merged = df.merge(df_properties, how='left', on=['pdb', 'name'])
    merged = merged[(merged['MW'] >= 100) & (merged['MW'] <= 200)].merge(
        df_poses[df_poses['rank'] == 1], how='left', on=['pdb', 'name'])

    nof_successful_top_ranked = (merged['rmsd_ai'] <= 2.0).sum()
    nof_no_top_ranked = merged['rmsd_ai'].isna().sum()

Datasets

- PDBScan22.csv: This is the PDBScan22 dataset[1]. This dataset was derived from the PDB[4]. It contains macromolecule–ligand binding sites (defined by PDB code and ligand identifier) that can be read by the NAOMI library[5,6] and pass basic consistency filters.
- PDBScan22-HQ.csv: This is the PDBScan22-HQ dataset[1]. It contains macromolecule–ligand binding sites from the PDBScan22 dataset that pass certain structure quality filters described in our publication[1].
- PDBScan22-HQ-ADV-Success.csv: This is a subset of the PDBScan22-HQ dataset without 336 binding sites where AutoDock Vina[2,3] fails.
- PDBScan22-HQ-Macrocycles.csv: This is a subset of the PDBScan22-HQ dataset without 336 binding sites where AutoDock Vina[2,3] fails and only contains molecules with macrocycles with at least ten atoms.

Properties for PDBScan22

- PDBScan22_ligand_properties.csv: Conformation-independent properties of all ligand molecules in the PDBScan22 dataset. Properties were calculated using an in-house tool developed with the NAOMI library[5,6].
- PDBScan22_StructureProfiler_quality_descriptors.csv: Structure quality descriptors for the binding sites in the PDBScan22 dataset calculated using the StructureProfiler tool[7].
- PDBScan22_basic_complex_properties.csv: Simple properties of the binding sites in the PDBScan22 dataset. Properties were calculated using an in-house tool developed with the NAOMI library[5,6].

Properties for PDBScan22-HQ

- PDBScan22-HQ_DoGSite3_pocket_descriptors.csv: Binding site descriptors calculated for the binding sites in the PDBScan22-HQ dataset using the DoGSite3 tool[8].
- PDBScan22-HQ_molecule_types.csv: Assignment of ligands in the PDBScan22-HQ dataset (without 336 binding sites where AutoDock Vina fails) to different molecular classes (i.e., drug-like, fragment-like, oligosaccharide, oligopeptide, cofactor, macrocyclic). A detailed description of the assignment can be found in our publication[1].

Docking results on PDBScan22

- PDBScan22_JAMDA_NL_NR.csv: Docking results of JAMDA[1] on the PDBScan22 dataset. This is the general overview for the docking runs; the pose results are given in 'PDBScan22_JAMDA_NL_NR_poses.csv'. For this experiment, the ligand was not considered during preprocessing of the binding site, and the binding site restriction mode (i.e., biasing the docking towards the crystal ligand position) was disabled.
- PDBScan22_JAMDA_NL_NR_poses.csv: Pose scores and RMSDs for the docking results of JAMDA[1] on the PDBScan22 dataset. For this experiment, the ligand was not considered during preprocessing of the binding site, and the binding site restriction mode (i.e., biasing the docking towards the crystal ligand position) was disabled.

Docking results on PDBScan22-HQ

- PDBScan22-HQ_JAMDA_NL_NR.csv: Docking results of JAMDA[1] on the PDBScan22-HQ dataset. This is the general overview for the docking runs; the pose results are given in 'PDBScan22-HQ_JAMDA_NL_NR_poses.csv'. For this experiment, the ligand was not considered during preprocessing of the binding site, and the binding site restriction mode (i.e., biasing the docking towards the crystal ligand position) was disabled.
- PDBScan22-HQ_JAMDA_NL_NR_poses.csv: Pose scores and RMSDs for the docking results of JAMDA[1] on the PDBScan22-HQ dataset. For this experiment, the ligand was not considered during preprocessing of the binding site, and the binding site restriction mode (i.e., biasing the docking towards the crystal ligand position) was disabled.
- PDBScan22-HQ_JAMDA_NL_WR.csv: Docking results of JAMDA[1] on the PDBScan22-HQ dataset. This is the general overview for the docking runs; the pose results are given in 'PDBScan22-HQ_JAMDA_NL_WR_poses.csv'. For this experiment, the ligand was not considered during preprocessing of the binding site, and the binding site restriction mode (i.e., biasing the docking towards the crystal ligand position) was enabled.
- PDBScan22-HQ_JAMDA_NL_WR_poses.csv: Pose scores and RMSDs for the docking results of JAMDA[1] on the PDBScan22-HQ dataset. For this experiment, the ligand was not considered during preprocessing of the binding site, and the binding site restriction mode (i.e., biasing the docking towards the crystal ligand position) was enabled.
- PDBScan22-HQ_JAMDA_NW_NR.csv: Docking results of JAMDA[1] on the PDBScan22-HQ dataset. This is the general overview for the docking runs; the pose results are given in 'PDBScan22-HQ_JAMDA_NW_NR_poses.csv'. For this experiment, the ligand was not considered during preprocessing of the binding site, all water molecules were removed from the binding site during preprocessing, and the binding site restriction mode (i.e., biasing the docking towards the crystal ligand position) was disabled.
- PDBScan22-HQ_JAMDA_NW_NR_poses.csv: Pose scores and RMSDs for the docking results of JAMDA[1] on the PDBScan22-HQ dataset. For this experiment, the ligand was not considered during preprocessing of the binding site, all water molecules were removed from the binding site during preprocessing, and the binding site restriction mode (i.e., biasing the docking towards the crystal ligand position) was disabled.
- PDBScan22-HQ_JAMDA_NW_WR.csv: Docking results of JAMDA[1] on the PDBScan22-HQ dataset. This is the general overview for the docking runs; the pose results are given in 'PDBScan22-HQ_JAMDA_NW_WR_poses.csv'. For this experiment, the ligand was not considered during preprocessing of the binding site, all water molecules were removed from the binding site during preprocessing, and the binding site restriction mode (i.e., biasing the docking towards the crystal ligand position) was enabled.
- PDBScan22-HQ_JAMDA_NW_WR_poses.csv: Pose scores and RMSDs for the docking results of JAMDA[1] on the PDBScan22-HQ dataset. For this experiment, the ligand was not considered during preprocessing of the binding site, all water molecules were removed from the binding site during preprocessing, and the binding site restriction mode (i.e., biasing the docking towards the crystal ligand position) was enabled.
- PDBScan22-HQ_JAMDA_WL_NR.csv: Docking results of JAMDA[1] on the PDBScan22-HQ dataset. This is the general overview for the docking runs; the pose results are given in 'PDBScan22-HQ_JAMDA_WL_NR_poses.csv'. For this experiment, the ligand was considered during preprocessing of the binding site, and the binding site restriction mode (i.e., biasing the docking towards the crystal ligand position) was disabled.
- PDBScan22-HQ_JAMDA_WL_NR_poses.csv: Pose scores and RMSDs for the docking results of JAMDA[1] on the PDBScan22-HQ dataset. For this experiment, the ligand was considered during preprocessing of the binding site, and the binding site restriction mode (i.e., biasing the docking towards the crystal ligand
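
The indexing rule stated in the general hints of this description (rows are unique by '(pdb, name)' in every table except the pose tables, which also need 'rank') can be checked with a short Python sketch like the one below. It uses only the standard library and a tiny in-memory table: the 'pdb' and 'name' values reuse the example identifiers quoted in the description, while the rank values and the helper name duplicate_keys are invented for illustration.

```python
import csv
import io
from collections import Counter

# Illustrative pose-table rows; only the column names and the example
# identifiers come from the description, the ranks are made up.
POSES_CSV = """pdb,name,rank
1gm8,SOX_B_1559,1
1gm8,SOX_B_1559,2
"""


def duplicate_keys(csv_text, key_columns):
    """Return the sorted key tuples that occur more than once in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(tuple(row[col] for col in key_columns) for row in reader)
    return sorted(key for key, n in counts.items() if n > 1)


# In a pose table, (pdb, name) repeats, but (pdb, name, rank) should not.
by_pair = duplicate_keys(POSES_CSV, ["pdb", "name"])
by_triple = duplicate_keys(POSES_CSV, ["pdb", "name", "rank"])
```

An empty result for the chosen key columns means those columns index the table uniquely; running the same check on a non-pose table with only 'pdb' and 'name' would be expected to return an empty list if the description holds.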
PDB 8TYZ
data.sbgrid.org
Updated Jul 9, 2024
Cite
(2024). PDB 8TYZ [Dataset]. http://doi.org/10.2210/pdb8TYZ/pdb
Unique identifier
https://doi.org/10.2210/pdb8TYZ/pdb
Dataset updated
Jul 9, 2024
Description
Protein Data Bank Entry 8TYZ is listed as the structure corresponding to this dataset
Project files provided as supporting information to the manuscript "A deep...
data.niaid.nih.gov
Updated Jan 24, 2020
Cite
Marco Giulini (2020). Project files provided as supporting information to the manuscript "A deep learning approach to the structural analysis of proteins" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3356842
Dataset updated
Jan 24, 2020
Dataset provided by
Marco Giulini
Raffaello Potestio
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
README file to the project files provided as supporting information to the manuscript “A deep learning approach to the structural analysis of proteins”

Dec. 30, 2018

Authors: Marco Giulini and Raffaello Potestio

==================================

The dataset contains the following files:

datasets.zip: archive containing five .csv files, namely:

- decoys_cm.csv : all the data for 10728 protein decoys, training set
- evaluation_cm.csv : all data for 146 proteins in the evaluation set
- random_CG.csv : 1200 Coulomb matrices. 100 CG models for each protein with 120 amino acids
- 1e5g_centered_sphere.csv : 100 CG models in which the central atoms in 1e5g are not removed
- 1e5g_random_sphere.csv : 10 CG models for 10 different (random) locations for the sphere that includes atoms that have to be retained. 100 CG models in total

decoys_labels.lab containing the labels associated to the 10728 decoys present in the training set

evaluation_labels.lab containing the labels associated to the 146 pdb files in the evaluation set

random_CG_labels.lab containing the labels associated to the 6 proteins with 120 amino acids

network_development_training: a python script that performs cross validation and full training of the model

saved_networks.zip FOLDER containing 10 networks: the architecture is included in .json files while weight parameters are inside .hs files

pdb_files.zip FOLDER containing the PDB files that have been employed in the project, namely:

- pdb_files_len100 : pdb files with 100 amino acids
- pdb_files_len101-110 : pdb files with a number of amino acids between 101 and 110
- decoys : decoys of length 100 extracted from the above folder: name syntax == PDBNAME_decoy_STARTRES_ENDRES.pdb
  EXAMPLE 6gsp.pdb will give rise to 6gsp_decoy_0_100.pdb, 6gsp_decoy_1_101.pdb, 6gsp_decoy_2_102.pdb, 6gsp_decoy_3_103.pdb, 6gsp_decoy_4_104.pdb
- pdb_files_len100 : 6 pdb files with 120 amino acids
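
The decoy naming syntax in the README above can be sketched in Python as follows. This is only an illustration under two assumptions that the README does not state: decoys are taken as a sliding window of 100 residues advanced one residue at a time, and 6gsp has 104 residues (inferred from the five file names in the example). The function name decoy_names and its parameters are hypothetical.

```python
def decoy_names(pdb_name, n_residues, window=100):
    """List decoy file names PDBNAME_decoy_STARTRES_ENDRES.pdb for each window."""
    # Drop the ".pdb" extension if present to obtain PDBNAME.
    stem = pdb_name[:-4] if pdb_name.endswith(".pdb") else pdb_name
    # One decoy per start offset; ENDRES is STARTRES plus the window length.
    return [
        f"{stem}_decoy_{start}_{start + window}.pdb"
        for start in range(n_residues - window + 1)
    ]


# Assumed chain length of 104 residues, chosen to match the README example.
names = decoy_names("6gsp.pdb", n_residues=104)
```

With these assumptions the sketch yields the five names given in the README example, from 6gsp_decoy_0_100.pdb to 6gsp_decoy_4_104.pdb.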
SPERM WHALE MYOGLOBIN H64A N-BUTYL ISOCYANIDE AT PH 9.0
ebi.ac.uk
Updated Nov 4, 2019
Cite
(2019). SPERM WHALE MYOGLOBIN H64A N-BUTYL ISOCYANIDE AT PH 9.0 [Dataset]. https://www.ebi.ac.uk/interpro/structure/PDB/
Dataset updated
Nov 4, 2019
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data item of the type ? from the database pdb with accession 103m and name SPERM WHALE MYOGLOBIN H64A N-BUTYL ISOCYANIDE AT PH 9.0
‘PDB Electric Power Load History’ analyzed by Analyst-2
analyst-2.ai
Updated Feb 13, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘PDB Electric Power Load History’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-pdb-electric-power-load-history-f3b9/69b765ba/?iid=004-966&v=presentation
Dataset updated
Feb 13, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘PDB Electric Power Load History’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/ashfakyeafi/pbd-load-history on 13 February 2022.

--- Dataset description provided by original source is as follows ---

Inspiration

With this data, much work can be done in the electrical engineering sector.

--- Original source retains full ownership of the source dataset ---
PDB 8TYX
data.sbgrid.org
Updated Jul 9, 2024
Cite
(2024). PDB 8TYX [Dataset]. http://doi.org/10.2210/pdb8TYX/pdb
Unique identifier
https://doi.org/10.2210/pdb8TYX/pdb
Dataset updated
Jul 9, 2024
Description
Protein Data Bank Entry 8TYX is listed as the structure corresponding to this dataset
NR2F1 modeling and simulation data
explore.openaire.eu
Updated Jan 22, 2024
Cite
VALERIO MARINO (2024). NR2F1 modeling and simulation data [Dataset]. http://doi.org/10.5281/zenodo.10551664
Unique identifier
https://doi.org/10.5281/zenodo.10551664
Dataset updated
Jan 22, 2024
Authors
VALERIO MARINO
Description
Homology modeling: NR2F1
Active form: NR2F1_act.pdb
Auto-repressed form: NR2F1_rep.pdb

Molecular Dynamics simulations
Original models: nr2f1_lbd_wt.pdb, nr2f1_lbd_q244x.pdb, nr2f1_lbd_e400x.pdb
Structures after 4 ns equilibration: nr2f1_lbd_wt_start.pdb, nr2f1_lbd_q244x_start.pdb, nr2f1_lbd_e400x_start.pdb
Trajectories in GROMACS compressed format aligned with the equilibrated structure: nr2f1_lbd_wt_clean.xtc, nr2f1_lbd_q244x_clean.xtc, nr2f1_lbd_e400x_clean.xtc
Final structure after 500 ns productive MD simulations: nr2f1_lbd_wt_500ns.pdb, nr2f1_lbd_q244x_500ns.pdb, nr2f1_lbd_e400x_500ns.pdb

Docking simulations
For each docking simulation performed with PIPER we provide the best solution as detailed in the manuscript.
Homodimer: NR2F1_act_dimer.pdb (active), NR2F1_rep_dimer.pdb (auto-repressed)
Heterodimer with NR2F2: NR2F1_act_NR2F2.pdb (active), NR2F1_rep_NR2F2.pdb (auto-repressed)
Heterodimer with RXRa: NR2F1_act_RXRa.pdb (active), NR2F1_rep_RXRa.pdb (auto-repressed)
Heterodimer with CRABP2: NR2F1_act_CRABP2_apo.pdb (apo), NR2F1_act_CRABP2_holo.pdb (holo)