Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Phosphorylation-driven cell signaling governs most biological functions and is widely studied using mass-spectrometry-based phosphoproteomics. Identifying the peptides and localizing the phosphorylation sites within them from the raw data is challenging and can be performed by several algorithms that return scores that are not directly comparable. This increases the heterogeneity among published phosphoproteomics data sets and prevents their direct integration. Here we compare 22 pipelines implemented in the main software tools used for bottom-up phosphoproteomics analysis (MaxQuant, Proteome Discoverer, PeptideShaker). We test six search engines (Andromeda, Comet, Mascot, MS Amanda, SequestHT, and X!Tandem) in combination with several localization scoring algorithms (delta score, D-score, PTM-score, phosphoRS, and Ascore). We show that these follow very different score distributions, which can lead to different false localization rates for the same threshold. We provide a strategy to discriminate correctly from incorrectly localized phosphorylation sites in a consistent manner across the tested pipelines. The results presented here can help users choose the most appropriate pipeline and cutoffs for their phosphoproteomics analysis.
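The point that one threshold can give different false localization rates in different pipelines can be illustrated with a minimal sketch. It assumes a benchmark in which the true site is known for every scored localization (for example, synthetic phosphopeptides). The function names, the toy scores, and the 5% target are illustrative assumptions and are not taken from the study.

```python
from typing import List, Optional, Tuple

# Each entry is (localization score, whether the localized site is correct).
ScoredSite = Tuple[float, bool]


def false_localization_rate(sites: List[ScoredSite], threshold: float) -> float:
    """Fraction of incorrect localizations among sites scoring >= threshold."""
    passing = [correct for score, correct in sites if score >= threshold]
    if not passing:
        return 0.0
    return passing.count(False) / len(passing)


def lowest_cutoff_for_flr(sites: List[ScoredSite], target_flr: float) -> Optional[float]:
    """Smallest observed score whose passing set has an FLR <= target_flr."""
    for cutoff in sorted({score for score, _ in sites}):
        if false_localization_rate(sites, cutoff) <= target_flr:
            return cutoff
    return None


# Toy data (hypothetical): two pipelines scoring on the same nominal 0-1 scale.
pipeline_a = [(0.99, True), (0.95, True), (0.90, True),
              (0.80, False), (0.75, True), (0.60, False)]
pipeline_b = [(0.99, True), (0.95, False), (0.90, True),
              (0.80, True), (0.75, False), (0.60, False)]

flr_a = false_localization_rate(pipeline_a, 0.75)
flr_b = false_localization_rate(pipeline_b, 0.75)
cutoff_a = lowest_cutoff_for_flr(pipeline_a, 0.05)
cutoff_b = lowest_cutoff_for_flr(pipeline_b, 0.05)
```

In this toy case the same 0.75 threshold gives a different rate for each pipeline, so a per-pipeline cutoff chosen against a common target rate is the more comparable quantity.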
This project contains raw data, intermediate files and results used to create the PRIDE human phosphoproteome map. The map is based on joint reanalysis of 110 publicly available human datasets. All relevant datasets were retrieved from the PRIDE database, and after manual curation, only assays that employed dedicated phospho-enrichment sample preparation strategies (e.g. metal oxide affinity chromatography, anti-P-Tyr antibodies, etc.) were included. Raw files were jointly processed with the MaxQuant computational platform using standard settings (see Data Processing Protocol). In total, the joint analysis allowed identification of 252,189 phosphosites at 1% peptide-spectrum match false discovery rate (PSM FDR) (MQ search results available in ‘txt-100PTM’ folder), of which 121,896 passed the additional 1% site localization FDR threshold (MQ search results available in ‘txt-001PTM’ folder).
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Phosphoproteomics routinely quantifies changes in the levels of thousands of phosphorylation sites, but functional analysis of such data remains a major challenge. While databases like PhosphoSitePlus contain information about many phosphorylation sites, the vast majority of known sites is not assigned to any protein kinase. Assigning changes in the phosphoproteome to the activity of individual kinases therefore remains a key challenge. A recent large-scale study systematically identified in vitro substrates for most human protein kinases. Here, we reprocessed and filtered these data to generate an in vitro Kinase-to-Phosphosite database (iKiP-DB). We show that iKiP-DB can accurately predict changes in kinase activity in published phosphoproteomic data sets for both well-studied and poorly characterized kinases. We apply iKiP-DB to a newly generated phosphoproteomic analysis of SARS-CoV-2 infected human lung epithelial cells and provide evidence for coronavirus-induced changes in host cell kinase activity. In summary, we show that iKiP-DB is widely applicable to facilitate the functional analysis of phosphoproteomic data sets.
Achieving sufficient coverage of regulatory phosphorylation sites by mass spectrometry (MS)-based phosphoproteomics for signaling pathway reconstitution is challenging when analyzing tiny sample amounts. We present a novel hybrid data-independent acquisition (DIA) strategy (hybrid-DIA) that combines targeted and discovery proteomics through an Application Programming Interface (API) to dynamically intercalate DIA scans with accurate triggering of multiplexed tandem MS scans of predefined (phospho)peptide targets. By spiking in heavy stable isotope labeled phosphopeptide standards covering seven major signaling pathways, we benchmarked hybrid-DIA against state-of-the-art targeted MS methods (i.e. SureQuant) using EGF-stimulated HeLa cells and found the quantitative accuracy and sensitivity to be comparable while hybrid-DIA also profiled the global phosphoproteome. To demonstrate the robustness, sensitivity and potential of hybrid-DIA, we profiled chemotherapeutic agents in single colon carcinoma multicellular spheroids and evaluated the difference of cancer cells in 2D vs 3D culture. Altogether, we showed that hybrid-DIA is the method of choice for highly sensitive phosphoproteomics experiments.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Phosphoproteomics dataset containing all phosphoproteins detected (Quant tab), mass spectrometry data for all phosphopeptides detected (pep_quant tab), and phosphoproteins with differential phosphorylation patterns (phosphorylation change of 1.5-fold increase or decrease) between unfed and post-blood meal mosquitoes (remaining tabs). (XLSX)
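The 1.5-fold criterion can be written as a small check. This is a sketch under assumptions: the ratio is taken as post-blood meal over unfed abundance, and a "1.5-fold decrease" is read as a ratio of 1/1.5 or lower. The dataset description does not state how the ratios were actually computed, and the function name is hypothetical.

```python
def classify_fold_change(unfed: float, post_blood_meal: float, fold: float = 1.5) -> str:
    """Label a phosphoprotein by its abundance ratio (post-blood meal / unfed)."""
    if unfed <= 0 or post_blood_meal <= 0:
        raise ValueError("abundances must be positive to form a ratio")
    ratio = post_blood_meal / unfed
    if ratio >= fold:
        return "increased"
    if ratio <= 1.0 / fold:
        return "decreased"
    return "unchanged"


# Toy abundances (hypothetical values, arbitrary units).
labels = [
    classify_fold_change(100.0, 160.0),
    classify_fold_change(160.0, 100.0),
    classify_fold_change(100.0, 120.0),
]
```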
Protein (de)phosphorylation plays an important role in plants. To provide a robust foundation for subcellular phosphorylation signaling network analysis and kinase-substrate relationships, we performed a meta-analysis of 27 published and unpublished in-house mass spectrometry–based phospho-proteome data sets for Arabidopsis thaliana covering a range of processes, (non)photosynthetic tissue types, and cell cultures. This resulted in an assembly of 60,366 phospho-peptides matching to 8141 nonredundant proteins. Filtering the data for quality and consistency generated a set of medium and a set of high confidence phospho-proteins and their assigned phospho-sites. The relation between single and multiphosphorylated peptides is discussed. The distribution of p-proteins across cellular functions and subcellular compartments was determined and showed overrepresentation of protein kinases. Extensive differences in frequency of pY were found between individual studies due to proteomics and mass s...
Haematococcus pluvialis is a green microalga of commercial interest due to its ability to produce a high-value ketocarotenoid, astaxanthin. As a non-model species that lacks a well-annotated genome, omics analyses such as transcriptomics and proteomics analysis have often been used together with physiological and biochemical analysis to explore pathways of interest. However, interpretation of these datasets remains challenging. In this work, TMT-based proteomics and phosphoproteomics analyses were conducted on Haematococcus cells grown under favorable conditions (green stage biomass) and high-light stress conditions (red stage biomass). Phosphoproteins were enriched using titanium dioxide before LC-MS/MS analysis. Our proteomics and phosphoproteomics analyses identified 1394 proteins and 569 phosphosites on 366 phosphoproteins, respectively. Of these, 1315 proteins and 396 phosphosites on 314 phosphoproteins were quantifiable, among which 370 proteins and 121 phosphosites on 94 phosphoproteins were differentially expressed. Using an improved analysis pipeline that combines Blast2GO, KEGG, and DAVID to analyze differentially expressed proteins and phosphoproteins, total identified proteins increased from 255 to 322 and total identified phosphoproteins increased from 59 to 70, which were 26.28% and 18.64%, respectively, higher than with the UniProt analysis alone. Using this pipeline, a previously uncharacterized protein and phosphoprotein were identified as an ATPase subunit B and a phosphofructokinase, respectively, and further confirmed with translated genomic and transcriptomic data. This work provides the first example of phosphoproteomics analysis in H. pluvialis, while the proteomics and phosphoproteomics analysis pipelines described here may be useful to analyze omics data from other non-model algal species.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of high-throughput omics data is one of the most important approaches for obtaining information regarding interactions between proteins/genes. Time-series omics data are a series of omics data points indexed in time order and normally contain more abundant information about the interactions between biological macromolecules than static omics data. In addition, phosphorylation is a key posttranslational modification (PTM) that is indicative of possible protein function changes in cellular processes. Analysis of time-series phosphoproteomic data should provide more meaningful information about protein interactions. However, although many algorithms, databases, and websites have been developed to analyze omics data, the tools dedicated to discovering molecular interactions from time-series omics data, especially from time-series phosphoproteomic data, are still scarce. Moreover, most reported tools ignore the lag between functional alterations and the corresponding changes in protein synthesis/PTM and are highly dependent on previous knowledge, resulting in high false-positive rates and difficulties in finding newly discovered protein–protein interactions (PPIs). Therefore, in the present study, we developed a new method to discover protein–protein interactions with the delayed comparison and Apriori algorithm (DCAA) to address the aforementioned problems. DCAA is based on the idea that there is a lag between functional alterations and the corresponding changes in protein synthesis/PTM. The Apriori algorithm was used to mine association rules from the relationships between items in a dataset and find PPIs based on time-series phosphoproteomic data. The advantage of DCAA is that it does not rely on previous knowledge and the PPI database. The analysis of actual time-series phosphoproteomic data showed that more than 68% of the protein interactions/regulatory relationships predicted by DCAA were accurate. 
As an analytical tool for PPIs that does not rely on a priori knowledge, DCAA should be useful to predict PPIs from time-series omics data, and this approach is not limited to phosphoproteomic data.
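The idea of pairing a delayed comparison with Apriori-style rule mining can be sketched as follows. This is not the DCAA implementation, whose details are not given here. It is a minimal two-level Apriori pass over lag-shifted transactions, and the discretization into "up"/"down" events, the fixed lag, the thresholds, and all names are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def change_events(series, min_delta):
    """Label each interval of a time series as 'up', 'down', or None (flat)."""
    events = []
    for before, after in zip(series, series[1:]):
        delta = after - before
        if delta >= min_delta:
            events.append("up")
        elif delta <= -min_delta:
            events.append("down")
        else:
            events.append(None)
    return events


def lagged_transactions(profiles, lag, min_delta):
    """One transaction per interval t: events at t ('now') and at t + lag ('later')."""
    events = {name: change_events(s, min_delta) for name, s in profiles.items()}
    n_intervals = min(len(e) for e in events.values())
    transactions = []
    for t in range(n_intervals - lag):
        items = set()
        for name, ev in events.items():
            if ev[t] is not None:
                items.add((name, ev[t], "now"))
            if ev[t + lag] is not None:
                items.add((name, ev[t + lag], "later"))
        transactions.append(items)
    return transactions


def mine_lagged_rules(transactions, min_support, min_confidence):
    """Return rules (lhs 'now' item, rhs 'later' item, support, confidence)."""
    n = len(transactions)
    item_counts = Counter(item for tx in transactions for item in tx)
    # Apriori level 1: only frequent single items may form candidate pairs.
    frequent = {i for i, c in item_counts.items() if c / n >= min_support}
    pair_counts = Counter()
    for tx in transactions:
        for a, b in combinations(sorted(tx & frequent), 2):
            pair_counts[(a, b)] += 1
    rules = []
    for (a, b), count in pair_counts.items():
        if count / n < min_support:  # Apriori level 2: frequent pairs only
            continue
        for lhs, rhs in ((a, b), (b, a)):
            if lhs[2] == "now" and rhs[2] == "later" and lhs[0] != rhs[0]:
                confidence = count / item_counts[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, rhs, count / n, confidence))
    return rules


# Toy profiles (hypothetical): protein_y follows protein_x one interval later.
profiles = {
    "protein_x": [1, 2, 3, 4, 4, 5],
    "protein_y": [1, 1, 2, 3, 4, 4],
}
rules = mine_lagged_rules(lagged_transactions(profiles, lag=1, min_delta=1),
                          min_support=0.5, min_confidence=0.9)
```

With these toy profiles the only rule that survives both thresholds is that an increase in protein_x is followed, one interval later, by an increase in protein_y. No prior interaction database is consulted, which mirrors the stated advantage of the approach.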
Mass spectrometry has revolutionized cell signaling research by vastly simplifying the analysis of many thousands of phosphorylation sites in the human proteome. Defining the cellular response to perturbations in space and time is crucial for further illuminating functionality of the phosphoproteome. Here we describe µPhos (‘microPhos’), an accessible phosphoproteomics platform that permits phosphopeptide enrichment from 96-well cell culture experiments in <8 hours total processing time. By minimizing transfer steps and reducing liquid volumes to <100 µL, we demonstrate increased sensitivity, >90% selectivity, and excellent quantitative reproducibility. Employing highly sensitive trapped ion mobility mass spectrometry, we quantify >30,000 unique phosphopeptides in a human cancer cell line using 20 µg starting material, and confidently localize >10,000 phosphosites from 1 µg. This depth covers key signaling pathways, rendering sample-limited applications and perturbation experiments with hundreds of samples viable as we demonstrate by profiling the time-resolved response of leukemia cells to tyrosine kinase inhibitors.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Descriptions of protein function were compiled from UniProt [80] and/or the GeneCards Human Gene Database (https://www.genecards.org) [81]. (XLSX)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of CLUE with alternative approaches on the two phosphoproteomics datasets. ns, not significant; -, not applicable.
Mass spectrometry-based phosphoproteomics has transformed our ability to profile phosphorylation-based signalling in tissues and cells on a global scale. To infer the action of kinases and signalling pathways in phosphoproteomic experiments, we present PhosR, a set of tools and methodologies implemented in a suite of R packages for the comprehensive analysis of phosphoproteomic data. By applying it to both published and new phosphoproteomic datasets, we illustrate PhosR in data imputation and normalisation using a novel set of ‘stably phosphorylated sites’ and in functional analysis for inferring kinase activities and signalling pathways. In particular, we introduce a ‘signalome’ construction method for identifying a collection of signalling modules that allow us to summarise and visualise the interaction of kinases and their collective action on signal transduction. Together, our data and findings demonstrate the utility of PhosR in processing and generating novel biological knowledge from MS-based phosphoproteomic data.
Quantitative proteomics generates large datasets with increasing depth and quantitative information. Even after data processing and statistical analysis, interpreting the results and relating their significance back to the system of study remains challenging. Often, this process is performed by scientists with expertise in their field, but limited experience in proteomic or phosphoproteomic analysis. We developed a set of tools for simple, interactive exploration of phosphoproteomics data so that the data can be readily translated into biological knowledge. These tools are designed to expedite the processes of reviewing raw data from statistical output, identifying and verifying enriched sequence motifs, and viewing the data from the perspective of functional pathways. Here, we present the workflow and demonstrate its functionality by analyzing a phosphoproteomic data set from two lymphoma cell lines treated with kinase inhibitors.
Following the kinase assay linked phosphoproteomics strategy (KALIP) (Xue, L et al., 2012), we used extracellular signal-regulated kinase 1 (ERK1) to phosphorylate the HEK293 cell lysate under in vitro kinase assay conditions. The phosphorylated proteins were then isolated and identified by mass spectrometry. The in vitro phosphorylated proteins with new phosphates were further overlapped with reported in vivo ERK1-dependent phosphoproteomics data for the identification of bona fide direct substrates of ERK1. In total, we identified 27 direct substrates of ERK1. Data analysis procedure: Raw MS files from the LTQ-Orbitrap-Velos were analyzed by Proteome Discoverer 1.3. MS/MS spectra were searched against the IPI-human database (version 3.83) containing both forward and reverse protein sequences by the SEQUEST search engine. The false discovery rate (FDR) was set to 0.01 on the peptide level. Ingenuity Pathway Analysis (IPA) was applied for the functional annotation.
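Searching a database that holds both forward and reverse sequences allows the FDR to be estimated from the reverse (decoy) hits. The sketch below shows one common form of that estimate, decoys divided by targets among matches above a score cutoff. It is not a description of what Proteome Discoverer 1.3 does internally; the function name and toy scores are assumptions, and scores are assumed to be distinct.

```python
from typing import Iterable, Optional, Tuple


def score_cutoff_at_fdr(matches: Iterable[Tuple[float, bool]],
                        max_fdr: float = 0.01) -> Optional[float]:
    """Lowest score cutoff whose estimated FDR (decoys / targets) is <= max_fdr.

    `matches` holds (score, is_decoy) pairs, where is_decoy marks a hit to a
    reversed sequence. Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among all matches scoring at least `score`.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Toy peptide matches (hypothetical scores); True marks a reverse-database hit.
toy_matches = [(9.0, False), (8.0, False), (7.0, False), (6.0, False),
               (5.0, True), (4.0, False), (3.0, True), (2.0, True)]
strict_cutoff = score_cutoff_at_fdr(toy_matches, max_fdr=0.01)
loose_cutoff = score_cutoff_at_fdr(toy_matches, max_fdr=0.25)
```

Peptides scoring at or above the returned cutoff would be accepted. A real data set needs far more matches than this toy list for a 0.01 limit to be meaningful.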
Alzheimer’s disease (AD) is a form of dementia characterized by amyloid-β plaques and Tau neurofibrillary tangles that progressively disrupt neural circuits in the brain. Using mass spectrometry, we performed a combined analysis of the tyrosine, serine, and threonine phosphoproteome, and proteome of post-mortem brain tissue from AD patients and age-matched controls. We used a data-centric approach to identify co-correlated signaling networks associated with cellular and pathological changes. We identified two independent pathology clusters that were associated with Tau and oligodendrocyte pathologies. We observed phosphorylation sites on known Tau-kinases as well as other novel signaling factors that were associated with these clusters. Together, these results build a map of pathology-associated phosphorylation signaling activation events occurring in AD.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Protein phosphatase 2A (PP2A) is a family of conserved serine/threonine phosphatases involved in several essential aspects of cell growth and proliferation. PP2ACdc55 phosphatase has been extensively related to cell cycle events in budding yeast; however, few PP2ACdc55 substrates have been identified. Here, we used a quantitative mass spectrometry approach to reveal new substrates of PP2ACdc55 phosphatase and new PP2A-related processes in mitotically arrested cells. We identified 62 statistically significant PP2ACdc55 substrates involved mainly in actin cytoskeleton organization. In addition, we validated new PP2ACdc55 substrates such as Slk19 and Lte1, involved in early and late anaphase pathways, and Zeo1, a component of the cell wall integrity pathway. Finally, we constructed docking models of Cdc55 and its substrate Mob1. We found that the predominant interface on Cdc55 is mediated by a protruding loop consisting of residues 84-90, thus highlighting the relevance of these amino acids for substrate interaction. We used phosphoproteomics of Cdc55-deficient cells to uncover new PP2ACdc55 substrates and functions in mitosis. As expected, several hyperphosphorylated proteins corresponded to Cdk1-dependent substrates, although other kinases' consensus motifs were also enriched in our dataset, suggesting that PP2ACdc55 counteracts and regulates other kinases distinct from Cdk1. Indeed, Pkc1 emerged as a novel node of PP2ACdc55 regulation, highlighting a major role of PP2ACdc55 in actin cytoskeleton and cytokinesis, gene ontology terms significantly enriched in the PP2ACdc55-dependent phosphoproteome.
Mass spectrometry has transformed the field of cell signalling by enabling global studies of dynamic protein phosphorylation (‘phosphoproteomics’). Recent developments are enabling increasingly sophisticated phosphoproteomics studies, but practical challenges remain. The EasyPhos workflow addresses these, and is sufficiently streamlined to enable analysis of hundreds of phosphoproteomes at a depth of >10,000 quantified phosphorylation sites. Here we present a detailed and updated protocol that further ensures high performance in sample-limited conditions, while also reducing sample preparation time. By eliminating protein precipitation steps and performing the entire protocol including digestion in a single 96-well plate, we now greatly minimize opportunities for sample loss and variability. This results in very high reproducibility and low sample requirements of 200 μg of protein starting material or less. After cell culture or tissue collection, the protocol takes 1 d, whereas mass spectrometry measurements require 1 h per sample. Applied to glioblastoma cells acutely treated with EGF, EasyPhos quantified 20,132 distinct phosphopeptides from 200 μg protein in less than one day of measurement time, revealing thousands of EGF-regulated phosphorylation events.
Signals that control response to stimuli and cellular function are transmitted through dynamic phosphorylation of thousands of proteins by protein kinases. Many techniques have been developed to study phosphorylation dynamics, including several mass spectrometry (MS)-based methods. Over the last few decades, substantial developments have been made in MS techniques for the large-scale identification of proteins and their post-translational modifications. Nevertheless, all of the current MS-based techniques for quantifying protein phosphorylation dynamics rely on the measurement of changes in peptide abundance levels and many methods suffer from low confidence in phosphopeptide identification due to poor fragmentation. Here we have optimized an approach for the Stable Isotope Labeling of Amino acids by Phosphate (SILAP) using [γ-¹⁸O₄]ATP in nucleo to determine global site-specific phosphorylation rates. The advantages of this metabolic labeling technique are: increased confidence in phosphorylated peptide identification, direct labeling of phosphorylation sites, measurement of phosphorylation rates, and the identification of actively phosphorylated sites in a cell-like environment. In this study we calculated approximate rate constants for over 500 phosphorylation sites based on labeling progress curves. We measured a wide range of phosphorylation rate constants from 0.34 min⁻¹ to 0.001 min⁻¹. Finally, we applied SILAP to determine sites that have different phosphorylation kinetics during G1/S and M phase. We found that most sites have very similar phosphorylation rates under both conditions; however, a small subset of sites on proteins involved in the mitotic spindle were more actively phosphorylated during M phase, while proteins involved in DNA replication and transcription were more actively phosphorylated during G1/S phase.
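A rate constant can be read off a labeling progress curve once a kinetic model is chosen. The sketch below assumes simple first-order labeling, where the labeled fraction follows f(t) = 1 - exp(-k t), and fits k by least squares on the linearized form -ln(1 - f) = k t. The model, the function name, and the simulated time points are assumptions for illustration; the study's own fitting procedure is not described here.

```python
import math
from typing import Sequence


def fit_rate_constant(times_min: Sequence[float],
                      labeled_fraction: Sequence[float]) -> float:
    """Least-squares slope through the origin of -ln(1 - f) against t (per minute)."""
    numerator = 0.0
    denominator = 0.0
    for t, f in zip(times_min, labeled_fraction):
        if not 0.0 <= f < 1.0:
            raise ValueError("labeled fractions must lie in [0, 1)")
        y = -math.log(1.0 - f)  # linearized first-order progress
        numerator += t * y
        denominator += t * t
    if denominator == 0.0:
        raise ValueError("need at least one nonzero time point")
    return numerator / denominator


# Simulated progress curve (hypothetical time points) for the fastest reported
# rate constant, 0.34 per minute.
times = [1.0, 2.0, 5.0, 10.0]
simulated = [1.0 - math.exp(-0.34 * t) for t in times]
k_fit = fit_rate_constant(times, simulated)
half_time_min = math.log(2.0) / k_fit  # time to reach 50% labeling
```

Under this model, the reported range of 0.34 min⁻¹ to 0.001 min⁻¹ corresponds to half-times of roughly 2 minutes to about 11.5 hours.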
An increasing number of studies report the relevance of protein Ser/Thr/Tyr phosphorylation in bacterial physiology, yet the analysis of this type of modification in bacteria still presents a considerable challenge. Unlike in eukaryotes, where tens of thousands of phosphorylation events likely occupy more than two-thirds of the proteome, the abundance of protein phosphorylation is much lower in bacteria. Even the state-of-the-art phosphopeptide enrichment protocols fail to remove the high background of abundant unmodified peptides, leading to low signal intensity and undersampling of phosphopeptide precursor ions in consecutive data-dependent MS runs. Consequently, large-scale bacterial phosphoproteomic datasets often suffer from poor reproducibility and a high number of missing values. Here we explore the application of parallel reaction monitoring (PRM) on a Q Exactive mass spectrometer in bacterial phosphoproteome analysis, focusing especially on run-to-run sampling reproducibility. In multiple measurements of identical phosphopeptide-enriched samples, we show that PRM outperforms data-dependent acquisition (DDA) in terms of detection frequency, reaching almost complete sampling efficiency, compared to 20% in DDA. We observe a similar trend over multiple rounds of (heterogeneous) phosphopeptide-enriched samples and conclude that PRM is a method of choice in bacterial phosphoproteomics projects where reproducible detection and quantification of a relatively small set of phosphopeptides is desired.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Liquid chromatography tandem mass spectrometry (LC–MS/MS) has been the most widely used technology for phosphoproteomics studies. As an alternative to database searching and probability-based phosphorylation site localization approaches, spectral library searching has proven effective in the identification of phosphopeptides. However, the incompleteness of experimental spectral libraries limits the identification capability. Herein, we utilize MS/MS spectrum prediction coupled with spectral matching for site localization of phosphopeptides. In silico MS/MS spectra are generated from peptide sequences by deep learning/machine learning models trained with nonphosphopeptides. Then, mass shift according to phosphorylation sites, phosphoric acid neutral loss, and a “budding” strategy are adopted to adjust the in silico mass spectra. In silico MS/MS spectra can also be generated in one step for phosphopeptides using models trained with phosphopeptides. The method is benchmarked on data sets of synthetic phosphopeptides and is used to process real biological samples. It is demonstrated to be a method requiring only computational resources that supplements the probability-based approaches for phosphorylation site localization of singly and multiply phosphorylated peptides.
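The mass-shift step can be illustrated with a deliberately reduced sketch. It does not predict intensities, model the phosphoric acid neutral loss, or implement the "budding" strategy. It only shifts singly charged b-ion m/z values by the phosphate mass (+79.96633 Da, HPO3) from the candidate site onward and picks the candidate whose ions match the most observed peaks. The residue table is limited to the residues used in the example, and all function names, the tolerance, and the test peptide are assumptions.

```python
from typing import List, Sequence

# Monoisotopic residue masses in Da (subset sufficient for the example peptide).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111, "Y": 163.06333,
}
PHOSPHO = 79.96633   # mass added by phosphorylation (HPO3)
PROTON = 1.007276    # charge carrier for singly charged ions


def b_ions(peptide: str, phosphosite: int) -> List[float]:
    """Singly charged b-ion m/z values (b1 .. b(n-1)) for a 0-based phosphosite."""
    ions = []
    running = PROTON
    for index, residue in enumerate(peptide[:-1]):
        running += RESIDUE_MASS[residue]
        if index == phosphosite:
            running += PHOSPHO  # every b-ion from this residue on is shifted
        ions.append(running)
    return ions


def count_matches(theoretical: Sequence[float], observed: Sequence[float],
                  tol: float = 0.02) -> int:
    """Number of theoretical ions with an observed peak within +/- tol."""
    return sum(any(abs(t - o) <= tol for o in observed) for t in theoretical)


def best_site(peptide: str, observed: Sequence[float], tol: float = 0.02) -> int:
    """Candidate S/T/Y index whose shifted b-ions match the most observed peaks."""
    candidates = [i for i, residue in enumerate(peptide) if residue in "STY"]
    if not candidates:
        raise ValueError("peptide has no S, T, or Y residue")
    return max(candidates,
               key=lambda i: count_matches(b_ions(peptide, i), observed, tol))


# Hypothetical peptide with two candidate sites (S at index 1, T at index 3);
# the "observed" peaks are simulated from the T-phosphorylated form.
peptide = "ASLTPK"
observed_peaks = b_ions(peptide, 3)
localized = best_site(peptide, observed_peaks)
```

Only the b2 and b3 ions separate the two candidates in this example, which is why site-determining fragments carry most of the weight in any localization score.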