6 datasets found

Characterizing Changes in the Rate of Protein-Protein Dissociation upon...
plos.figshare.com
txt
Updated Jun 1, 2023
Cite
Rudi Agius; Mieczyslaw Torchala; Iain H. Moal; Juan Fernández-Recio; Paul A. Bates (2023). Characterizing Changes in the Rate of Protein-Protein Dissociation upon Interface Mutation Using Hotspot Energy and Organization [Dataset]. http://doi.org/10.1371/journal.pcbi.1003216
Available download formats: txt
Unique identifier
https://doi.org/10.1371/journal.pcbi.1003216
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Rudi Agius; Mieczyslaw Torchala; Iain H. Moal; Juan Fernández-Recio; Paul A. Bates
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Predicting the effects of mutations on the kinetic rate constants of protein-protein interactions is central to both the modeling of complex diseases and the design of effective peptide drug inhibitors. However, while most studies have concentrated on the determination of association rate constants, dissociation rates have received less attention. In this work we take a novel approach by relating the changes in dissociation rates upon mutation to the energetics and architecture of hotspots and hotregions, by performing alanine scans pre- and post-mutation. From these scans, we design a set of descriptors that capture the change in hotspot energy and distribution. The method is benchmarked on 713 kinetically characterized mutations from the SKEMPI database. Our investigations show that, with the use of hotspot descriptors, energies from single-point alanine mutations may be used to estimate off-rate changes for mutations to any residue type and for multi-point mutations. A number of machine learning models are built from a combination of molecular and hotspot descriptors, with the best models achieving a Pearson's Correlation Coefficient of 0.79 with experimental off-rates and a Matthews Correlation Coefficient of 0.6 in the detection of rare stabilizing mutations. Using specialized feature selection models we identify descriptors that are highly specific and, conversely, broadly important to predicting the effects of different classes of mutations, interface regions and complexes. Our results also indicate that the distribution of the critical stability regions across protein-protein interfaces depends more strongly on complex size than on interface area. In addition, mutations at the rim are critical for the stability of small complexes, but consistently harder to characterize.
The relationship between hotregion size and the dissociation rate is also investigated and, using hotspot descriptors which model cooperative effects within hotregions, we show how the contribution of hotregions of different sizes changes under different cooperative effects.
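The description above reports model quality with two standard scores, Pearson's correlation coefficient (for predicted versus experimental off-rate changes) and the Matthews correlation coefficient (for detecting stabilizing mutations). As a rough illustration of what those scores measure, here is a minimal sketch in plain Python. The function names and the toy inputs are assumptions made for this example; they are not taken from the paper or its data.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def matthews(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the score is 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy values only, chosen so the results are easy to check by hand.
r = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])      # perfectly linear
mcc = matthews([1, 1, 0, 0], [1, 0, 1, 0])         # no better than chance
```

The Matthews coefficient is often preferred over accuracy when the positive class is rare, which matches the "rare stabilizing mutations" setting mentioned in the description.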
MuToN dataset
zenodo.org
zip
Updated May 30, 2024
Cite
Pengpai Li (2024). MuToN dataset [Dataset]. http://doi.org/10.5281/zenodo.10445253
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.10445253
Dataset updated
May 30, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Pengpai Li
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 31, 2023
Description
This dataset is associated with the MuToN project hosted on GitHub (https://github.com/zpliulab/MuToN). It includes mutation records, PPI complexes, and pre-computed LLM embeddings for the SKEMPI dataset.

Contents:

├── data/
│   ├── skempi_v2.csv                # mutation records in the SKEMPI dataset
│   ├── SKEMPI/
│   │   ├── raws/                    # PPI complexes in the SKEMPI dataset
│   │   │   ├── 1CSE.pdb
│   │   ├── raw_pdb/                 # single wild-type and mutant structures
│   │   │   ├── 1CSE_E.pdb           # extracted from 1CSE.pdb
│   │   │   ├── 1CSE_I.pdb
│   │   │   ├── 1CSE_I.mut.38_E.pdb  # computed mutant structure of 1CSE_E.pdb
│   │   ├── llm_embedding/           # pre-computed LLM embeddings using ESM-2
│   │   │   ├── 1CSE_E.npy           # LLM embedding of 1CSE_E.pdb, shape=(L, 1280)
│   │   │   ├── 1CSE_I.npy
│   │   │   ├── 1CSE_I.mut.38_E.npy
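
In the listing above, each structure file in raw_pdb/ appears to have an embedding file with the same stem in llm_embedding/. The sketch below shows one way to pair them up from the file name alone. It assumes the archive is unpacked so that the `data/SKEMPI/...` paths exist as shown; the helper names are hypothetical, and the text after `.mut.` is kept as an opaque tag because the listing does not spell out how it should be interpreted.

```python
from pathlib import PurePosixPath

# Assumed location, copied from the directory listing above.
EMBEDDING_DIR = PurePosixPath("data/SKEMPI/llm_embedding")

def parse_structure_name(filename):
    """Split a name such as '1CSE_I.mut.38_E.pdb' into PDB ID, chain and mutation tag."""
    stem = PurePosixPath(filename).stem            # drops the final '.pdb' or '.npy'
    base, _, mutation_tag = stem.partition(".mut.")
    pdb_id, _, chain = base.partition("_")
    return {
        "pdb_id": pdb_id,
        "chain": chain,
        "mutation_tag": mutation_tag or None,      # None for wild-type files
    }

def embedding_path_for(structure_filename):
    """Return the embedding path that shares a stem with the given structure file."""
    stem = PurePosixPath(structure_filename).stem
    return EMBEDDING_DIR / (stem + ".npy")

wild = parse_structure_name("1CSE_E.pdb")
mutant = parse_structure_name("1CSE_I.mut.38_E.pdb")
npy = embedding_path_for("1CSE_I.mut.38_E.pdb")
```

A path produced this way could then be read with `numpy.load`, which, going by the listing, would be expected to yield an array of shape (L, 1280) with L the sequence length.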
atom3d-msp
huggingface.co
Updated Aug 26, 2024
Cite
Vector Institute (2024). atom3d-msp [Dataset]. https://huggingface.co/datasets/vector-institute/atom3d-msp
Metadata format: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant).
Dataset updated
Aug 26, 2024
Dataset authored and provided by
Vector Institute
Description
Mutation Stability Prediction

Overview

The Mutation Stability Prediction (MSP) task involves classifying whether mutations in the SKEMPI 2.0 database (J. Jankauskaite, B. Jiménez-García et al., 2019) are stabilizing or not using the provided protein structures. Each mutation in the MSP task includes a PDB file with the residue of interest transformed to the specified mutant amino acid as well as the native PDB file. A total of 4148 mutant structures accompanied by their… See the full description on the dataset page: https://huggingface.co/datasets/vector-institute/atom3d-msp.
Relationship between experimental ΔΔG, Δlog10(koff), Δlog10(kon) and change...
plos.figshare.com
xls
Updated Jun 2, 2023
Cite
Rudi Agius; Mieczyslaw Torchala; Iain H. Moal; Juan Fernández-Recio; Paul A. Bates (2023). Relationship between experimental ΔΔG, Δlog10(koff), Δlog10(kon) and change in interface hotspot energy (Int_HS_Energy) for 713 mutations in SKEMPI. [Dataset]. http://doi.org/10.1371/journal.pcbi.1003216.t003
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pcbi.1003216.t003
Dataset updated
Jun 2, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Rudi Agius; Mieczyslaw Torchala; Iain H. Moal; Juan Fernández-Recio; Paul A. Bates
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
(A) PCC between experimental ΔΔG and the respective Δlog10(koff) and Δlog10(kon) for single-point alanine, single-point non-alanine, multi-point and all 713 mutations. (B) PCC between Int_HS_Energy and the respective ΔΔG, Δlog10(koff) and Δlog10(kon) for the same four groups of mutations. Experimental values for the 713 mutations used here are extracted from SKEMPI [41] and are presented in Dataset S1.
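The three experimental quantities in this table are linked by standard thermodynamics. Assuming simple two-state binding, the dissociation constant is Kd = koff / kon, and with the common convention ΔΔG = RT ln(Kd,mut / Kd,wt) it follows that ΔΔG = RT ln(10) × [Δlog10(koff) − Δlog10(kon)]. The sketch below evaluates that identity. The function name, the temperature default of 298.15 K and the choice of kcal/mol are assumptions for illustration, and the sign convention may differ from the one used in a given dataset.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_from_rate_changes(dlog10_koff, dlog10_kon, temperature_k=298.15):
    """Binding free-energy change (kcal/mol) implied by changes in log10 rate constants.

    Assumes two-state binding, Kd = koff / kon, and
    ddG = RT * ln(Kd_mut / Kd_wt), so positive values mean weaker binding.
    """
    return R_KCAL * temperature_k * math.log(10) * (dlog10_koff - dlog10_kon)

# A tenfold faster off-rate with an unchanged on-rate costs about 1.36 kcal/mol at 25 °C.
example = ddg_from_rate_changes(1.0, 0.0)
```

Under these assumptions, a mutation that changes koff and kon by the same factor leaves ΔΔG at zero, which is one reason a correlation between ΔΔG and either rate constant alone can be imperfect.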
Pearson's Correlation Coefficient (PCC) of hotspot descriptors with...
plos.figshare.com
figshare.com
xls
Updated May 30, 2023
Cite
Rudi Agius; Mieczyslaw Torchala; Iain H. Moal; Juan Fernández-Recio; Paul A. Bates (2023). Pearson's Correlation Coefficient (PCC) of hotspot descriptors with experimental Δlog10(koff) for the 713 off-rate mutations in SKEMPI. [Dataset]. http://doi.org/10.1371/journal.pcbi.1003216.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pcbi.1003216.t002
Dataset updated
May 30, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Rudi Agius; Mieczyslaw Torchala; Iain H. Moal; Juan Fernández-Recio; Paul A. Bates
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Pearson's Correlation Coefficient (PCC) of hotspot descriptors with experimental Δlog10(koff) for the 713 off-rate mutations in SKEMPI.
Composition of MIX set.
plos.figshare.com
bin
Updated Sep 18, 2023
Cite
Youzhi Zhang; Sijie Yao; Peng Chen (2023). Composition of MIX set. [Dataset]. http://doi.org/10.1371/journal.pone.0290899.t001
Available download formats: bin
Unique identifier
https://doi.org/10.1371/journal.pone.0290899.t001
Dataset updated
Sep 18, 2023
Dataset provided by
PLOS ONE
Authors
Youzhi Zhang; Sijie Yao; Peng Chen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Protein hotspot residues are key sites that mediate protein-protein interactions. Accurate identification of these residues is essential for understanding how proteins carry out their functions and for designing drug targets. Current research has mostly focused on using machine learning methods to predict hot spots from known interface residues; these methods manually extract features of amino acid residues from sequence, structure, evolution, energy, and other information to train and test machine learning models. This process is cumbersome, time-consuming and laborious. This paper proposes combining a pre-trained protein sequence embedding model with a one-dimensional convolutional neural network, called Embed-1dCNN, to predict protein hotspot residues. To obtain a large number of samples, this work integrates and extracts data from the ASEdb, BID, SKEMPI and dbMPIKT datasets to generate a new dataset, and uses the SMOTE algorithm to expand the positive samples in the training set. The experimental results show that the method achieves an F1 score of 0.82 on the test set. Compared with other hot spot prediction methods, the model achieves better prediction performance.
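The description above mentions SMOTE for expanding the positive samples. For readers unfamiliar with it, here is a minimal sketch of the core idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is an illustration in plain Python, not the authors' implementation; the function name, the value of `k` and the toy points are assumptions.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic samples by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        # Interpolate: u = 0 reproduces x, u = 1 reproduces the neighbour.
        u = rng.random()
        synthetic.append([a + u * (b - a) for a, b in zip(x, neighbour)])
    return synthetic

# Toy minority class: the four corners of the unit square.
positives = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(positives, n_new=6)
```

Because every synthetic point is a convex combination of two real samples, it stays inside the region already covered by the minority class. Oversampling of this kind is normally applied to the training set only, as the description indicates, so that test-set scores are computed on real samples.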
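20 scholarly articles cite the dataset "Characterizing Changes in the Rate of Protein-Protein Dissociation upon Interface Mutation Using Hotspot Energy and Organization" (https://doi.org/10.1371/journal.pcbi.1003216).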
