MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Protein-Protein, Genetic, and Chemical Interactions for Starr TN (2022): Deep mutational scans for ACE2 binding, RBD expression, and antibody escape in the SARS-CoV-2 Omicron BA.1 and BA.2 receptor-binding domains. Curated by BioGRID (https://thebiogrid.org). ABSTRACT: SARS-CoV-2 continues to acquire mutations in the spike receptor-binding domain (RBD) that impact ACE2 receptor binding, folding stability, and antibody recognition. Deep mutational scanning prospectively characterizes the impacts of mutations on these biochemical properties, enabling rapid assessment of new mutations seen during viral surveillance. However, the effects of mutations can change as the virus evolves, requiring updated deep mutational scans. We determined the impacts of all single amino acid mutations in the Omicron BA.1 and BA.2 RBDs on ACE2-binding affinity, RBD folding, and escape from binding by the LY-CoV1404 (bebtelovimab) monoclonal antibody. The effects of some mutations in Omicron RBDs differ from those measured in the ancestral Wuhan-Hu-1 background. These epistatic shifts largely resemble those previously seen in the Alpha variant due to the convergent epistatically modifying N501Y substitution. However, Omicron variants show additional lineage-specific shifts, including examples of the epistatic phenomenon of entrenchment that causes the Q498R and N501Y substitutions present in Omicron to be more favorable in that background than in earlier viral strains. In contrast, the Omicron substitution Q493R exhibits no sign of entrenchment, with the derived state, R493, being as unfavorable for ACE2 binding in Omicron RBDs as in Wuhan-Hu-1. Likely for this reason, the R493Q reversion has occurred in Omicron sub-variants including BA.4/BA.5 and BA.2.75, where the affinity buffer from R493Q reversion may potentiate concurrent antigenic change.
Consistent with prior studies, we find that Omicron RBDs have reduced expression, and identify candidate stabilizing mutations that ameliorate this deficit. Last, our maps highlight a broadening of the sites of escape from LY-CoV1404 antibody binding in BA.1 and BA.2 compared to the ancestral Wuhan-Hu-1 background. These BA.1 and BA.2 deep mutational scanning datasets identify shifts in the RBD mutational landscape and inform ongoing efforts in viral surveillance.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Each antibody dataset is related to a specific reference dataset. Below are samples grouped with their specific reference datasets.
Sample ID    Sample Info
211116_3     N_Library4_Reference_01
211116_4     N_Library5_Reference_01
211116_5     N_Lib4_R040
211116_6     N_Lib5_R040

211215_1     N_Library5_Reference_02
211215_2     N05_R040
211215_3     N05_MM05
211215_6     N05_MM08
211215_7     N05_NAb3

220104_06    N_Lib05_Reference_03
220104_07    N05_C518
220104_08    N05_C524
220104_09    N05_C706
220104_12    N05_2F4
220104_13    N05_3C3

220215_29    N05_Reference_04
220215_02    N05_RC17602
220215_04    N05_RC17604
220215_06    N05_R004
220215_14    N05_1C1
220215_16    N05_1A7
220215_22    N05_mAb1
220215_24    N05_mAb2

220225_03    N05_Reference_05
220225_07    Ab166
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Performing a complete deep mutational scan with all single point mutations may not be practical, and may not even be required, especially if predictive computational models can be developed. Computational models are, however, naive to the cellular response under the myriad possible assay conditions. In a realistic paradigm of assay context-aware predictive hybrid models that combine minimal experimental data from deep mutational scans with structure, sequence information and computational models, we define and evaluate different strategies for choosing this minimal set. We evaluated the trivial strategy of a systematic reduction in the number of mutational studies from 85% to 15%, along with several other strategies concerning the choice of the types of mutations, such as random versus site-directed, at the same 15% data completeness. Interestingly, the predictive capabilities obtained by training on a random set of mutations and by training on a systematic substitution of all amino acids to alanine, asparagine and histidine (ANH) were comparable. Another strategy we explored, augmenting the training data with measurements of the same mutants at multiple assay conditions, did not improve the prediction quality. For the six proteins we analyzed, the bin-wise error in prediction is optimal when 50-100 mutations per bin are used in training the computational model, suggesting that good prediction quality may be achieved with a library of 500-1000 mutations.
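The two training-set choices compared above (ANH substitutions versus a random set of the same size) can be illustrated with a minimal sketch. The function names, the mutation label format, and the fixed random seed are assumptions made for illustration and are not taken from the dataset or its manuscript. Three of the 19 possible substitutions per position is roughly 16%, which appears consistent with the 15% data completeness mentioned above.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def all_single_mutations(sequence, alphabet=AMINO_ACIDS):
    """Enumerate every single substitution as a label such as 'M1A'."""
    return [
        f"{wt}{pos}{new}"
        for pos, wt in enumerate(sequence, start=1)
        for new in alphabet
        if new != wt
    ]

def anh_mutations(sequence):
    """Substitute every position with alanine, asparagine and histidine."""
    return all_single_mutations(sequence, alphabet="ANH")

def random_mutations(sequence, k, seed=0):
    """Draw k distinct single substitutions uniformly at random.

    The seed is a hypothetical default so the sketch is reproducible.
    """
    return random.Random(seed).sample(all_single_mutations(sequence), k)

# Toy three-residue sequence (hypothetical, for illustration only).
anh_set = anh_mutations("MAK")
random_set = random_mutations("MAK", k=len(anh_set))
```

Matching the size of the random set to the ANH set keeps the comparison at equal data completeness, which is the condition the description states.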
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Underlying data for the manuscript "A deep mutational scanning platform to characterize the fitness landscape of anti-CRISPR proteins".
Contains FACS and NGS read data for the deep mutational scanning experiments carried out on the anti-CRISPR proteins AcrIIA4 and AcrIIA5.
The included tar.gz archive has the following directory structure:
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Protein structure and function can be severely altered by even a single amino acid mutation. Predictions of mutational effects using extensive artificial intelligence (AI)-based models, although accurate, remain as enigmatic as the experimental observations in terms of improving intuitions about the contributions of various factors. Inspired by Lipinski’s rules for drug-likeness, we devise simple thresholding criteria on five different descriptors such as conservation, which have so far been limited to qualitative interpretations such as high conservation implies high mutational effect. We analyze systematic deep mutational scanning data of all possible single amino acid substitutions on seven proteins (25153 mutations) to first define these thresholds and then to evaluate the scope and limits of the predictions. At this stage, the approach allows us to comment easily and with a low error rate on the subset of mutations classified as neutral or deleterious by all of the descriptors. We hope that complementary to the accurate AI predictions, these thresholding rules or their subsequent modifications will serve the purpose of codifying the knowledge about the effects of mutations.
Attribution-NoDerivs 4.0 (CC BY-ND 4.0) https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
Interpretation of disease-causing genetic variants remains a challenge in human genetics. Current costs and complexity of deep mutational scanning methods are obstacles for achieving genome-wide resolution of variants in disease-related genes. Our framework, Saturation Mutagenesis-Reinforced Functional assays (SMuRF), offers simple and cost-effective saturation mutagenesis paired with streamlined functional assays to enhance the interpretation of unresolved variants. Applying SMuRF to neuromuscular disease genes FKRP and LARGE1, we generated functional scores for all possible coding single nucleotide variants, which aid in resolving clinically reported variants of uncertain significance. SMuRF also demonstrates utility in predicting disease severity, resolving critical structural regions, and providing training datasets for the development of computational predictors. Overall, our approach enables variant-to-function insights for disease genes in a cost-effective manner that can be broadly implemented by standard research laboratories.
This dataset contains the designs, plasmid maps, raw results, raw data, and raw pictures generated during the development of SMuRF. For applications of SMuRF, please refer to the manuscript associated with this dataset.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Protein–protein interactions (PPIs) are critical for organizing molecules in a cell and mediating signaling pathways. Dysregulation of PPIs is often a key driver of disease. To better understand the biophysical basis of such disease processes, and to potentially target them, it is critical to understand the molecular determinants of PPIs. Deep mutational scanning (DMS) facilitates the acquisition of large amounts of biochemical data by coupling selection with high throughput sequencing (HTS). The challenging and labor-intensive design and optimization of a relevant selection platform for DMS, however, limits the use of powerful directed evolution and selection approaches. To address this limitation, we designed a versatile new phage-assisted continuous selection (PACS) system using our previously reported proximity-dependent split RNA polymerase (RNAP) biosensors, with the aim of greatly simplifying and streamlining the design of a new selection platform for PPIs. After characterization and validation using the model KRAS/RAF PPI, we generated a library of RAF variants and subjected them to PACS and DMS. Our HTS data revealed positions along the binding interface that are both tolerant and intolerant to mutations, as well as which substitutions are tolerated at each position. Critically, the “functional scores” obtained from enrichment data through continuous selection for individual variants correlated with KD values measured in vitro, indicating that biochemical data can be extrapolated from sequencing using our new system. Due to the plug-and-play nature of RNAP biosensors, this method can likely be extended to a variety of other PPIs. More broadly, this and other methods under development support the continued development of evolutionary and high-throughput approaches to address biochemical problems, moving toward a more comprehensive understanding of sequence–function relationships in proteins.
https://spdx.org/licenses/CC0-1.0.html
In signaling networks, protein-protein interactions are often mediated by modular domains that bind short linear motifs. The motifs’ sequences affect many factors, among them affinity and specificity, or the ability to bind strongly and to bind the appropriate partners. Using Deep Mutational Scanning to create a mutant library, and protein complementation assays to measure protein-protein interactions, we determined the in vivo binding strength of a library of mutants of a binding motif on the MAP kinase kinase Pbs2, which binds the SH3 domain of the osmosensor protein Sho1 in Saccharomyces cerevisiae. These measurements were made using the full-length endogenous proteins, in their native cellular environment. We find that along with residues within the canonical motif, many mutations in the residues neighboring the motif also modulate binding strength. Interestingly, all Pbs2 mutations which increase affinity are situated outside of the Pbs2 region that interacts with the canonical SH3 binding pocket, suggesting that other surfaces on Sho1 contribute to binding. We use predicted structures to hypothesize a model of binding which involves residues neighboring the canonical Pbs2 motif binding outside of the canonical SH3 binding pocket. We compared this predicted structure with known structures of SH3 domains binding peptides through residues outside of the motif, and put forth possible mechanisms through which Pbs2 can bind specifically to Sho1. We propose that for certain SH3 domain-motif pairs, affinity and specificity are determined by a broader range of sequences than what has previously been considered, potentially allowing easier differentiation between otherwise similar partners. 
Methods
Multiple methods were used for data collection:
- DHFR-Protein Complementation Assay in a pooled competition assay
- DHFR-Protein Complementation Assay in individual colonies
- Growth curves
- Protein structure predictions using AlphaFold-Multimer
In Revised Upload:
- Sequence and structure alignments using mafft and MUSTANG, respectively
https://doi.org/10.5061/dryad.jsxksn0hk
This directory contains initial and intermediate datasets computed by the analysis pipeline. All .rda files can be loaded into R or RStudio.
Initial Data Files
AA.SEQ.rda - Sequence code for every genotype. Row names are the sequence, with the RE listed at the beginning (E: ERE, S: SRE). Columns 1-5 give the RE or amino acid state for each site (X1-X4). The remaining columns are indicator variables for the amino acid states at each site (1: present, 0: absent).
DT.11P.CODING.rda - Initial data file from Starr et al. 2017. Contains genotype sequence; normalized counts of each genotype in each sorting bin for each RE for two replicates (RE, replicate, bin); estimate of number of colony forming units for each genotype, sorting bin, RE, and replicate; estimate of mean fluorescence for each replicate and an estimate based on c...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data generated for the manuscript "Protein structural consequences of DNA mutational signatures: A meta-analysis of somatic variants and deep mutational scanning data".
Please consult the README file for detailed description of data files included in this dataset.
Multi-domain enzymes can be regulated both by inter-domain interactions and structural features intrinsic to the catalytic domain. The tyrosine phosphatase SHP2 is a quintessential example of a multi-domain protein that is regulated by inter-domain interactions. This enzyme has a protein tyrosine phosphatase (PTP) domain and two phosphotyrosine-recognition domains (N-SH2 and C-SH2) that regulate phosphatase activity through autoinhibitory interactions. SHP2 is canonically activated by phosphoprotein binding to the SH2 domains, which causes large interdomain rearrangements, but autoinhibition is also disrupted by disease-associated mutations. Many details of the SHP2 activation are still unclear, the structure of the active state remains elusive, and hundreds of human variants of SHP2 have not been functionally characterized. Here, we perform scanning mutagenesis on both full-length SHP2 and its isolated PTP domain to examine mutational effects on inter-domain regulation and catalytic ac...
The molecular dynamics data were generated using the Amber Molecular Dynamics Package, as described in the associated manuscript. Data were processed using the CPPTRAJ program within AmberTools. Deep sequencing data are the result of high-throughput peptide display screens, conducted as described in the manuscript. Data were generated using an Illumina MiSeq or NextSeq instrument. Data were processed in three steps: (1) FLASh (https://ccb.jhu.edu/software/FLASH/) was used for paired-end read merging, (2) CutAdapt (https://cutadapt.readthedocs.io/en/stable/) was used to trim flanking sequences, and (3) trimmed sequences were translated and counted using in-house Python scripts (https://github.com/nshahlab/2024_Jiang-et-al_SHP2-DMS).
# Data from: Revealing the principles of inter- and intra-domain regulation in a signaling enzyme via scanning mutagenesis
https://doi.org/10.5061/dryad.83bk3jb18
https://doi.org/10.1101/2024.05.13.593907
March 2023 to May 2024
This dataset contains data from structural and mutational analysis of the signaling enzyme SHP2. The data are clustered into two groups, based on the type of experiment/analysis that generated them. The details of these experiments can be found in the associated preprint. Briefly:
(1) We conducted deep mutational scanning experiments in which we constructed 15 DNA libraries encoding mutations across the SHP2 gene, subjected these to selection for SHP2 function in yeast, and then analyzed the DNA before and after selection by deep sequencing. The data associated with this experiment are FASTQ-format Illumi...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Individually validated variants of the neuraminidase gene. (XLSX 9 kb)
https://spdx.org/licenses/CC0-1.0.html
Multi-domain enzymes can be regulated both by inter-domain interactions and structural features intrinsic to the catalytic domain. The tyrosine phosphatase SHP2 is a quintessential example of a multi-domain protein that is regulated by inter-domain interactions. This enzyme has a protein tyrosine phosphatase (PTP) domain and two phosphotyrosine-recognition domains (N-SH2 and C-SH2) that regulate phosphatase activity through autoinhibitory interactions. SHP2 is canonically activated by phosphoprotein binding to the SH2 domains, which causes large interdomain rearrangements, but autoinhibition is also disrupted by disease-associated mutations. Many details of the SHP2 activation are still unclear, the structure of the active state remains elusive, and hundreds of human variants of SHP2 have not been functionally characterized. Here, we perform scanning mutagenesis on both full-length SHP2 and its isolated PTP domain to examine mutational effects on inter-domain regulation and catalytic activity. Our experiments provide a comprehensive map of SHP2 mutational sensitivity, both in the presence and absence of interdomain regulation. Coupled with molecular dynamics simulations, our investigation reveals novel structural features that govern the stability of the autoinhibited and active states of SHP2. Our analysis also identifies key residues beyond the SHP2 active site that control PTP domain dynamics and intrinsic catalytic activity. This work expands our understanding of SHP2 regulation and provides new insights into SHP2 pathogenicity. Methods The molecular dynamics data were generated using the Amber Molecular Dynamics Package, as described in the associated manuscript. Data were processed using the CPPTRAJ program within AmberTools. The AlphaFold2 model for SHP2 was generated using ColabFold with the default settings (https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb). 
Deep sequencing data are the result of high-throughput peptide display screens, conducted as described in the manuscript. Data were generated using an Illumina MiSeq or NextSeq instrument. Data were processed in three steps: (1) FLASh (https://ccb.jhu.edu/software/FLASH/) was used for paired-end read merging, (2) CutAdapt (https://cutadapt.readthedocs.io/en/stable/) was used to trim flanking sequences, and (3) trimmed sequences were translated and counted using in-house Python scripts (https://github.com/nshahlab/2024_Jiang-et-al_SHP2-DMS).
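Step (3) of the processing described above, translating trimmed reads and counting the resulting sequences, might look roughly like the following sketch. It is not the authors' in-house script: the function names are hypothetical, the standard genetic code is assumed, and codons containing ambiguous bases are mapped to "X" as an illustrative choice.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons ordered by T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a DNA string in frame 0; unknown codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )

def count_peptides(trimmed_reads):
    """Count how often each translated sequence occurs among the reads."""
    return Counter(translate(read) for read in trimmed_reads)

# Hypothetical trimmed reads, for illustration only.
counts = count_peptides(["ATGGCC", "atggcc", "ATGTAA"])
```

Counting translated sequences rather than DNA sequences collapses synonymous codons into a single variant; whether the original scripts do the same is not stated in the description.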
McCandlish and Stoltzfus gathered data from deep mutational scanning experiments on 12 proteins, comprising 56641 distinct amino acid replacement mutations. By converting fitnesses to within-study quantiles, they combined results from all studies to draw general conclusions about distributions of fitness effects for the 380 different types of possible amino acid changes in proteins. They found that most replacements are neither conservative nor radical, but barely different from the background distribution. The shapes of these distributions can be approximated by a maximum-entropy model with only 1 parameter. This data package makes it possible to reproduce the main calculations used by Stoltzfus and McCandlish. The data also may be useful to researchers carrying out meta-analyses of mutation-scanning experiments or DFE experiments.
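The conversion of fitnesses to within-study quantiles mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the mid-rank handling of ties and the function names are choices made here, and the data package may define quantiles differently.

```python
from bisect import bisect_left, bisect_right

def within_study_quantiles(scores):
    """Map each fitness score to its quantile among one study's scores.

    Tied scores share the mid-rank quantile (an illustrative assumption).
    """
    ordered = sorted(scores)
    n = len(ordered)
    quantiles = []
    for score in scores:
        below = bisect_left(ordered, score)          # scores strictly lower
        below_or_equal = bisect_right(ordered, score)  # scores lower or tied
        quantiles.append((below + below_or_equal) / (2 * n))
    return quantiles

def pool_studies(studies):
    """Pool quantiles from several studies onto one comparable scale."""
    pooled = []
    for scores in studies.values():
        pooled.extend(within_study_quantiles(scores))
    return pooled

# Hypothetical fitness values from one study, for illustration only.
example = within_study_quantiles([0.1, 0.5, 0.9, 0.5])
```

Because quantiles depend only on rank within a study, scores measured on different scales in different experiments become comparable once pooled.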
https://spdx.org/licenses/CC0-1.0.html
Insertions and deletions (InDels) are essential sources of novelty in protein evolution. In RNA viruses, InDels cause dramatic phenotypic changes that contribute to the emergence of viruses with altered immune profiles and host engagement. This work aims to comprehensively quantify the mutational tolerance of an RNA virus to insertion, deletion, and substitution. Using Enterovirus A71 (EV-A71) as a prototype for the Enterovirus A species (EV-A) of picornaviruses, we engineered approximately 45,000 insertions, 6,000 deletions, and 41,000 AA substitutions across the nearly 2,200 coding positions of the EV-A71 proteome, quantifying their effects on viral fitness. In contrast with AA changes, the vast majority of InDels are lethal to virus growth. Those that are tolerated primarily reside in a few hotspot regions. These tolerant sites highlight structurally flexible and mutationally plastic regions of EV-A71 proteins that avoid core structural and functional elements but often overlap with key sites of host- and immune recognition, suggesting a complex evolutionary role for InDels and substitutions at these sites. Phylogenetic analysis examining EV-A species isolated from diverse mammalian hosts reveals that many of the experimentally identified hotspots also correspond to sites of natural InDel diversity, suggesting these hotspots of mutational tolerance in EV-A genomes may have contributed to past phenotypic diversification of EV-A. Insights from this and future mutational scanning studies mapping viral evolutionary potential will inform better epidemiological monitoring and Enterovirus vaccine development. Methods These data were collected by sequencing the input and output libraries from deep mutational, insertional, and deletional scanning experiments. Data was processed by next-gen sequencing pipelines, in-house scripts, and published software to interpret the fitness effects of mutations engineered in the EV-A71 genome. 
Data was visualized using R packages, including ggplot2. All scripts for the analysis and generation are included here and are also available through GitHub (see links in the Related Works section).
This dataset provides allele counts and raw fastqs for deep mutational scanning of the HIV-1 genes tat and rev when not overlapped with one another (placed in the nef locus), as described in Fernandes et al., "Functional segregation of overlapping genes in HIV", Cell 2016 (in revision). Preselection (input) and post-selection (replicate 1/2) files for every possible point mutant of these two HIV proteins from the NL4-3 background are given. Tab-delimited files including codon counts across the amplicons are also included and are probably the most useful thing to most researchers. The data here were used to generate Figures 3, 4, and 7 and might be of general use for people interested in deep mutational scanning, looking for signatures of epistasis in rev or tat, or reanalyzing and mining the data.
FAQ:
Why do the ends of each amplicon have such variation? In order to increase diversity across the flowcell, I pooled standard primers with N, NN, and NNN extensions to throw amplicons out of phase. When aligning you should trim the ends or ignore them. This means that the overlap between PE's can vary by 3 nt.
Why are the filenames not easy to deal with? The filenames are tied to separate MiSeq runs. I hope to clean up the nomenclature and update this entry in the future while preserving the run information. You can get a sense of that as different residues will vary in Q-score, and that is mostly tied to the run they were pooled on and not any interesting biology. While this makes it a little harder to follow, I think it's good to get a sense that doing this kind of analysis in high-throughput fashion leads to a reasonable amount of failure (i.e. RNA isolation, RT, fail) that led to repetition until we had good data for every position.
Can you help me deal with this dataset? Yes. Please email me at jferna10@ucsc.edu, or contact me on twitter @jdf_ev. For reagents please contact Alan Frankel at frankel@cgl.ucsf.edu.
Do you have the analysis scripts you used to process the data? Yes, they are on GitHub: https://github.com/nbstrauli/allele_frequency_trajectory_sim
ATGL is the key enzyme in intracellular lipolysis, playing a critical role in metabolic and cardiovascular diseases. ATGL is tightly regulated through a known set of protein-protein interaction partners with activating or inhibiting functions in control of lipolysis. However, the binding mode and protein interaction sites of ATGL and its partners are unknown. Using deep mutational protein interaction perturbation scanning, we generated comprehensive profiles of single amino acid variants affecting the interactions of ATGL with its regulatory partners: CGI-58, G0S2, PLIN1, PLIN5 and CIDEC. Twenty-three ATGL variants gave a specific interaction perturbation pattern when validated in co-immunoprecipitation experiments in mammalian cells. We identified and characterized eleven highly selective ATGL “switch” mutations, which affect the interaction of one of the five partners without affecting the others. Switch mutations thus provided distinct interaction determinants for ATGL’s key regulatory proteins at an amino acid resolution. When tested for triglyceride hydrolase activity in vitro and lipolysis in cells, the activity patterns of the ATGL switch variants traced to their protein interaction profile. In the context of structural data, the integration of variant binding and activity profiles provided important insights into lipolysis regulation and the impact of mutations in human disease.
Hepatitis B virus (HBV) is a small double-stranded DNA virus that chronically infects 296 million people. Over half of its compact genome encodes protein in two overlapping reading frames, and during evolution, multiple selective pressures can act on shared nucleotides. This study combines an RNA-based HBV cell culture system with deep mutational scanning to uncouple cis- and trans-acting sequence requirements in the HBV genome. The results support a leaky ribosome scanning model for polymerase translation, provide a fitness map of the HBV polymerase at single nucleotide resolution, and identify conserved prolines adjacent to the HBV polymerase termination codon that stall ribosomes. Further experiments indicated that stalled ribosomes tether the nascent polymerase to its template RNA, ensuring cis-preferential RNA packaging and reverse transcription of the HBV genome.
# Data from: Deep mutational scanning of HBV reveals a mechanism for cis preferential reverse transcription
The purpose of this dataset is to provide the analysis software, the raw pre-processed experimental data files, and the processed result files used in the paper “Deep mutational scanning of HBV reveals a mechanism for cis preferential reverse transcription”.
For detailed information on the experimental design, please refer to the paper. Briefly, mutants of the hepatitis B virus (HBV) were generated (input population) and transfected into cell cultures. In cell culture, HBV mutants were either depleted or enriched based on the effects of their mutations (output population). Afterwards, both input and output populations were sequenced to quantify the enrichment or depletion of each HBV mutant. From these sequencing results, so called codoncounts files were generated using barcoded-subamplicon sequencing softw...
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Protein-Protein, Genetic, and Chemical Interactions for Peterson BG (2023): Deep mutational scanning highlights a new role for cytosolic regions in Hrd1 function. Curated by BioGRID (https://thebiogrid.org). ABSTRACT: Misfolded endoplasmic reticulum proteins are degraded through a process called endoplasmic reticulum associated degradation (ERAD). Soluble, lumenal ERAD targets are recognized, retrotranslocated across the ER membrane, ubiquitinated, extracted from the membrane, and degraded by the proteasome using an ERAD pathway containing a ubiquitin ligase called Hrd1. To determine how Hrd1 mediates these processes, we developed a deep mutational scanning approach to identify residues involved in Hrd1 function, including those exclusively required for lumenal degradation. We identified several regions required for different Hrd1 functions. Most surprisingly, we found two cytosolic regions of Hrd1 required for lumenal ERAD substrate degradation. Using in vivo and in vitro approaches, we defined roles for disordered regions between structural elements that were required for Hrd1's ability to autoubiquitinate and interact with substrate. Our results demonstrate that disordered cytosolic regions promote substrate retrotranslocation by controlling Hrd1 activation and establishing directionality of retrotranslocation for lumenal substrate across the endoplasmic reticulum membrane.
Chemokine receptors CXCR4 and CCR5 regulate white blood cell trafficking, and are engaged by the HIV-1 envelope glycoprotein gp120 during infection. We combine directed evolution of CXCR4 and CCR5 libraries comprising nearly all ~7,000 single amino acid substitutions with deep sequencing to define sequence-fitness landscapes for surface expression and ligand interactions. Functional interaction sites are mapped based on conservation; for example, extracellular residues are conserved for binding HIV-1-blocking antibodies, as expected. Chemokine CXCL12 interacts with residues extending asymmetrically into the CXCR4 ligand-binding cavity, and distal mutations within allosteric and G protein coupling sites are identified that enhance chemokine binding. CCR5 residues conserved for gp120 interactions partially overlap with the chemokine-binding site, and gp120 binding is increased by acidic substitutions in the CCR5 N-terminus and extracellular loops. Furthermore, general features are apparent from sequence patterns, including membrane regions that are intolerant to polar mutations, and deleterious cysteine substitutions within extracellular loops. Single-site saturation mutagenesis libraries were constructed of human CXCR4 and CCR5, and expressed in human Expi293F cells (a HEK293 derivative). Cells were evolved by FACS for surface expression and binding to protein ligands. Frequencies of variants in the sorted population (measured from RNA transcripts) were compared to frequencies in the DNA libraries to calculate log(base2) enrichment ratios for all amino acid substitutions. All evolution experiments were in duplicate.
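The log(base2) enrichment ratio described above compares each variant's frequency in the sorted population with its frequency in the starting library. A minimal sketch follows; the function name, the dictionary input format, and the pseudocount (added to avoid taking the logarithm of zero) are assumptions for illustration, not details taken from the dataset.

```python
import math

def log2_enrichment(sorted_counts, library_counts, pseudocount=1.0):
    """Return log2(frequency after sorting / frequency in library) per variant.

    The pseudocount is a hypothetical safeguard for variants with zero reads.
    """
    variants = set(sorted_counts) | set(library_counts)
    n = len(variants)
    sorted_total = sum(sorted_counts.values()) + pseudocount * n
    library_total = sum(library_counts.values()) + pseudocount * n
    ratios = {}
    for variant in variants:
        f_sorted = (sorted_counts.get(variant, 0) + pseudocount) / sorted_total
        f_library = (library_counts.get(variant, 0) + pseudocount) / library_total
        ratios[variant] = math.log2(f_sorted / f_library)
    return ratios

# Hypothetical read counts for two variants, for illustration only.
ratios = log2_enrichment(
    {"A10D": 30, "A10K": 10},
    {"A10D": 20, "A10K": 20},
    pseudocount=0.0,
)
```

Positive values indicate variants enriched by the sort and negative values indicate depleted variants; with duplicate experiments, the ratios from each replicate could be computed separately and then compared.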