Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Mass spectrometry-based proteomics coupled to liquid chromatography has matured into an automatized, high-throughput technology, producing data on the scale of multiple gigabytes per instrument per day. Consequently, an automated quality control (QC) and quality analysis (QA) capable of detecting measurement bias, verifying consistency, and avoiding propagation of error is paramount for instrument operators and scientists in charge of downstream analysis. We have developed an R-based QC pipeline called Proteomics Quality Control (PTXQC) for bottom-up LC–MS data generated by the MaxQuant software pipeline. PTXQC creates a QC report containing a comprehensive and powerful set of QC metrics, augmented with automated scoring functions. The automated scores are collated to create an overview heatmap at the beginning of the report, giving valuable guidance also to nonspecialists. Our software supports a wide range of experimental designs, including stable isotope labeling by amino acids in cell culture (SILAC), tandem mass tags (TMT), and label-free data. Furthermore, we introduce new metrics to score MaxQuant’s Match-between-runs (MBR) functionality by which peptide identifications can be transferred across Raw files based on accurate retention time and m/z. Last but not least, PTXQC is easy to install and use and represents the first QC software capable of processing MaxQuant result tables. PTXQC is freely available at https://github.com/cbielow/PTXQC.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mass-spectrometry data, MaxQuant ProteinGroups output.
Scientific services in the area of omics research are becoming increasingly popular. In general, omics research can produce a massive amount of data that can pose a challenge for computing infrastructure. While many applications in genomics can run on Linux nodes, the situation in proteomics is different: many applications are optimized to run on Windows computers only. As a scientific service provider, a core facility needs reliable, reproducible, and easy-to-use integrated solutions. Liquid chromatography–mass spectrometry intensity-based label-free quantification using data-dependent acquisition is a popular approach in proteomics to perform relative quantification of proteins in complex samples. MaxQuant is a widely used software for this type of analysis; it has a complex graphical user interface and provides information-rich outputs. We run it within a workflow that also includes Scaffold for search result validation and visualization and R-based quality control report generation. Data analysis workflows consist of several components: a workflow engine, compute hosts, and archives. In particular, applications run on compute hosts while the data are kept on an archive server. Therefore, the input needs to be staged to the compute host and the results need to be staged back to the archive. This complexity can be overwhelming for most users. These different components have all been integrated into a robust and user-friendly application to process standardized label-free quantification experiments. We integrated MaxQuant as an in-house Software as a Service application so it can be used by any workflow engine in a platform-independent manner. In this manuscript, we provide a technical description of how MaxQuant as a software service has been integrated into our heterogeneous compute environment for reproducible and automatic large-scale, high-throughput data processing of label-free quantification experiments.
In this PRIDE dataset we provide four raw files along with the full MaxQuant results, the Scaffold file, and the QC PDF report to give a concrete idea of the potential of our workflow. These data were generated in the FGCZ course in Nov. 2016 (for further information see: http://www.fgcz.ch/education/genomics-courses01.html).
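The staging pattern described above (copy inputs from the archive to a compute host, run the application, copy results back) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `run_staged`, the scratch-directory layout, and the `job` callback standing in for the actual MaxQuant run are all assumptions.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def run_staged(inputs: Iterable[Path], archive_out: Path,
               job: Callable[[Path, Path], None]) -> list[Path]:
    """Stage inputs to a scratch directory, run the job, stage results back.

    `job` is a hypothetical placeholder for the compute step; it receives
    the scratch input and output directories.
    """
    archive_out.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as scratch:
        in_dir = Path(scratch) / "input"
        out_dir = Path(scratch) / "output"
        in_dir.mkdir()
        out_dir.mkdir()

        # Stage in: copy every input file from the archive to the scratch area.
        for src in inputs:
            shutil.copy2(src, in_dir / Path(src).name)

        # Compute step on the (simulated) compute host.
        job(in_dir, out_dir)

        # Stage out: copy result files back to the archive.
        staged = []
        for result in sorted(out_dir.iterdir()):
            if result.is_file():
                staged.append(Path(shutil.copy2(result, archive_out / result.name)))
        return staged
```

The scratch directory is removed automatically once the results have been copied back, so the archive remains the only persistent copy of inputs and outputs.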
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Table S3A: Raw MaxQuant data related to all identified and quantified proteins in partially purified CTAP-ERβ nuclear complexes before and after RNase treatment.
Table S3B: Statistical analysis of protein levels in CTAP-ERβ nuclear complexes before and after RNase treatment.
Table S3C: Differentially represented proteins in CTAP-ERβ nuclear complexes before and after RNase treatment (statistically significant).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
List of identified proteins and their counts; using MaxQuant software for selected MOA proteomics pool identifiers.
MaxDIA is a universal platform for analyzing data-independent acquisition proteomics data within the MaxQuant software environment. Using spectral libraries, MaxDIA achieves cutting-edge proteome coverage with significantly better coefficients of variation in protein quantification than other software. MaxDIA is equipped with accurate false discovery rate estimates on both library-to-DIA match and protein levels, also when using whole-proteome predicted spectral libraries. This is the foundation of discovery DIA, a framework for the hypothesis-free analysis of DIA samples without library and with reliable FDR control. MaxDIA performs three- or four-dimensional feature detection of fragment data, and scoring of matches is augmented by machine learning on the features of an identification. MaxDIA’s novel bootstrap-DIA workflow performs multiple rounds of matching with increasing quality of recalibration and stringency of matching to the library. Combining MaxDIA with either of two new technologies, BoxCar acquisition and trapped ion mobility spectrometry, leads to deep and accurate proteome quantification.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Table S2, derived from the MaxQuant output table Peptides, contains all identified peptide sequences belonging to the proteins of Table S1 in alphabetic order starting with the first amino acid. It also contains other relevant data such as scores, charge states, and mass accuracy. (XLSX 2472 kb)
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Isobaric labeling has the promise of combining high sample multiplexing with precise quantification. However, normalization issues and the missing value problem of complete n-plexes hamper quantification across more than one n-plex. Here, we introduce two novel algorithms implemented in MaxQuant that substantially improve the data analysis with multiple n-plexes. First, isobaric matching between runs makes use of the three-dimensional MS1 features to transfer identifications from identified to unidentified MS/MS spectra between liquid chromatography–mass spectrometry runs in order to utilize reporter ion intensities in unidentified spectra for quantification. On typical datasets, we observe a significant gain in MS/MS spectra that can be used for quantification. Second, we introduce a novel PSM-level normalization, applicable to data with and without the common reference channel. It is a weighted median-based method, in which the weights reflect the number of ions that were used for fragmentation. On a typical dataset, we observe complete removal of batch effects and dominance of the biological sample grouping after normalization. Furthermore, we provide many novel processing and normalization options in Perseus, the companion software for the downstream analysis of quantitative proteomics results. All novel tools and algorithms are available with the regular MaxQuant and Perseus releases, which are downloadable at http://maxquant.org.
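The weighted-median idea behind the PSM-level normalization can be illustrated with a small sketch. This is not MaxQuant's implementation: the function names, the choice of log intensities as input, the use of the lower weighted median, and the centering of each channel at zero are assumptions made purely for illustration.

```python
from typing import List, Sequence


def weighted_median(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return the lower weighted median: the smallest value at which the
    cumulative weight reaches at least half of the total weight."""
    if len(values) != len(weights) or len(values) == 0:
        raise ValueError("values and weights must be non-empty and equally long")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    cumulative = 0.0
    pairs = sorted(zip(values, weights))
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= total / 2.0:
            return value
    return pairs[-1][0]


def normalize_channel(log_intensities: Sequence[float],
                      weights: Sequence[float]) -> List[float]:
    """Shift one channel's log intensities so their weighted median is zero.

    The weights are assumed to be a per-PSM proxy for the number of ions
    used for fragmentation (hypothetical input).
    """
    center = weighted_median(log_intensities, weights)
    return [x - center for x in log_intensities]
```

With weights of 1, 1, and 5 on the values 1.0, 2.0, and 10.0, the heavily weighted PSM determines the center, so PSMs backed by more ions have more influence on the normalization.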
Mass spectrometry (MS)-based proteomics is generally performed in a shotgun format, in which as many peptide precursors as possible are selected from full or MS1 scans so that their fragment spectra can be recorded in MS2 scans. While achieving great proteome depths, shotgun proteomics cannot guarantee that each precursor will be measured in each run. In contrast, targeted proteomics aims to reproducibly and sensitively fragment a restricted number of precursors in each run, based on pre-scheduled mass-to-charge and retention time windows. Here we set out to merge these two concepts by a global targeting approach in which an arbitrary number of previously measured precursors is detected in real-time, followed by standard fragmentation or advanced peptide-specific analyses. We made use of a fast application programming interface to a quadrupole Orbitrap instrument and recalibration in mass, retention time and intensity dimensions to predict peptide identity. MaxQuant.Live is freely available and has a graphical user interface to specify many pre-defined data acquisition strategies. Controlling the acquisition with MaxQuant.Live rather than the vendor software, we observed no decline in acquisition speed. The power of our approach is demonstrated with the acquisition of breakdown curves for thousands of precursors of interest. It is also possible to uncover precursors that are not even visible in MS1 scans, using elution time prediction based on co-eluting isotope standards or the auto-adjusted, predicted retention time alone. Finally, we demonstrate that more than 25,000 precursors can be successfully recognized and targeted in single LC-MS runs. We conclude that global targeting combines the advantages of two classical approaches in MS-based proteomics, while expanding the analytical toolbox with many new possibilities.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The files serve as input and intermediate results for a MaxQuant and MSstats training on skin cancer tissues (https://doi.org/10.1016/j.matbio.2017.11.004) in the Galaxy training network (https://training.galaxyproject.org).
Input files: human FASTA database for MaxQuant; annotation file and comparison matrix file for MSstats.
Intermediate result files: MaxQuant protein groups, evidence and PTXQC.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw protein lists after MaxQuant identification. Input files for quantitative and qualitative data processing.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The files serve as input and intermediate results for a MaxQuant and MSstatsTMT training on lysine methyltransferase 9 knockdown and control cell proteomics (https://doi.org/10.1186/s12935-020-1141-2) in the Galaxy training network (https://training.galaxyproject.org).
Input files: human FASTA protein database for MaxQuant, MaxQuant experimental design template, MSstatsTMT annotation file.
Intermediate result files: MaxQuant protein groups and evidence
Cumulative malaria parasite exposure in endemic regions often results in the acquisition of partial immunity and asymptomatic infections. There is limited information on how host-parasite interactions mediate maintenance of chronic symptomless infections that sustain malaria transmission. Here, we have determined the gene expression profiles of the parasite population and the corresponding host peripheral blood mononuclear cells (PBMCs) from 21 children (<15 years). We compared children who were defined as uninfected, asymptomatic and those with febrile malaria. Children with asymptomatic infections had a parasite transcriptional profile characterized by a bias toward trophozoite stage (~12 hours post-invasion) parasites and low parasite levels, while earlier ring stage parasites were characteristic of febrile malaria. The host response of asymptomatic children was characterized by downregulated transcription of genes associated with inflammatory responses, compared to children with ...
Proteins were extracted from PBMCs by resuspending the pellet with 5 µl of 6 M urea (Thermo Scientific). The protein samples were then adjusted with 50 mM triethylammonium bicarbonate (TEAB, Sigma-Aldrich) to 100 µl and the protein concentration determined using the bicinchoninic acid (BCA) protein assay (Thermo Scientific). The protein samples were then reduced with 40 mM dithiothreitol, alkylated with 80 mM iodoacetamide in the dark, and quenched with 80 mM iodoacetamide at room temperature, followed by digestion with 1 µg/µl of trypsin (57). Nine pools, each containing 9 samples and 1 control for batch correction, were prepared by combining 1 µl aliquots from each sample. The samples were pooled using a custom randomization R script. The pooled samples were then individually labelled using the Tandem Mass Tag (TMT) 10-plex kit (Thermo Scientific) according to the manufacturer’s instructions.
One isobaric tag was used solely for the pooled samples and combined with peptide samples labelled with ...
The files can be opened using MaxQuant software; version 2.0.3.0 was used for analysis. Differential protein abundance analysis of MaxQuant output was done using PERSEUS version 2.05.0 software. Protein-protein interaction and Gene Ontology analyses were performed using STRING database version 11.5 (https://string-db.org/).
# Proteome of peripheral mononuclear cells (PBMCs) from asymptomatic malaria and uninfected individuals and the ensuing febrile malaria episodes
Proteins were extracted from peripheral mononuclear cells (PBMCs), pooled using Tandem Mass Tags (TMT) (10-plex) and injected into the LC-MS/MS for proteomics analysis. The output raw files were loaded into MaxQuant software v2.0.3.0 for protein quantification. The output from MaxQuant was then read using PERSEUS software v2.05.0 and differential protein abundance analysis performed. The Proteomics_metadata file contains the metadata that links each sample to the raw data files and the treatment group (condition).
The RAW data files provided contain the output data from the LC-MS/MS for each pool. The pools serve as the input data for MaxQuant software.
The Proteomics_metadata contains the metadata information that links each sample to the condition/treatment group (i.e. asymptomatic, uninfecte...
Reanalysis of submission PXD005573 using MaxQuant-DIA and MaxQuant. First complete submission in mzTab for data-independent acquisition.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This table is derived from the MaxQuant output table ProteinGroups and contains the complete list of identified proteins/protein groups, including the ones that we did not accept after application of the criteria described in Materials and Methods. This table also contains additional data such as the complete set of accession numbers forming one group, the distribution of peptides among the 20 PAGE sections analyzed separately, the calculated molecular weight of each entry, the iBAQ intensity and the percentages calculated from it. (XLSX 265 kb)
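One plausible reading of "percentages calculated from" the iBAQ intensity is each entry's share of the summed iBAQ intensity. The sketch below shows that calculation under this assumption; the function name and the protein labels are hypothetical, and the table's authors may have used a different denominator.

```python
from typing import Dict


def ibaq_percentages(ibaq: Dict[str, float]) -> Dict[str, float]:
    """Express each protein group's iBAQ intensity as a percentage of the
    summed iBAQ intensity of all listed protein groups."""
    total = sum(ibaq.values())
    if total <= 0:
        raise ValueError("summed iBAQ intensity must be positive")
    return {name: 100.0 * value / total for name, value in ibaq.items()}


# Hypothetical protein groups with made-up intensities.
shares = ibaq_percentages({"groupA": 3.0, "groupB": 1.0})
```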
The GASH/Sal hamster (Genetic audiogenic seizure, Salamanca) is a model of audiogenic seizures with the epileptogenic focus localized in the inferior colliculus (IC). The sound-induced seizures exhibit a short latency (7-9 seconds), which implies innate protein disturbances in the IC as a basis for seizure susceptibility and generation. Here, we aim to study the protein profile in the GASH/Sal IC in comparison to controls. Protein samples from the IC were processed for enzymatic digestion and then analyzed by mass spectrometry in data-independent acquisition mode. After identifying the proteins using the UniProt database, we selected those with differential expression. We identified 5254 proteins, of which 184 were differentially expressed: 126 upregulated and 58 downregulated. Moreover, a small number of proteins were uniquely found in the GASH/Sal or the control. The results indicated a protein profile alteration in the epileptogenic nucleus that might underlie the innately occurring audiogenic seizures in the GASH/Sal model.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Computational analysis of shotgun proteomics data can now be performed in a completely automated and statistically rigorous way, as exemplified by the freely available MaxQuant environment. The sophisticated algorithms involved and the sheer amount of data translate into very high computational demands. Here we describe parallelization and memory optimization of the MaxQuant software with the aim of executing it on a large computer cluster. We analyze and mitigate bottlenecks in overall performance and find that the most time-consuming algorithms are those detecting peptide features in the MS1 data as well as the fragment spectrum search. These tasks scale with the number of raw files and can readily be distributed over many CPUs as long as memory access is properly managed. Here we compared the performance of a parallelized version of MaxQuant running on a standard desktop, an I/O performance optimized desktop computer (“game computer”), and a cluster environment. The modified gaming computer and the cluster vastly outperformed a standard desktop computer when analyzing more than 1000 raw files. We apply our high performance platform to investigate incremental coverage of the human proteome by high resolution MS data originating from in-depth cell line and cancer tissue proteome measurements.
NW-NN-CW-CN (24 h): rice roots that grew at 25°C and 4°C for 24 h after treatment with water and nystose. NW-NN-CW-CN (48 h): rice roots that grew at 25°C and 4°C for 48 h after treatment with water and nystose. NW-NN-CW-CN (recovery): rice roots that grew at 25°C for 7 d and 4°C for 2 d and then at 25°C for 5 d after treatment with water and nystose. (XLSX)
Small ubiquitin-like modifiers (SUMOs) and ubiquitin are frequent post-translational modifications of proteins that play pivotal roles in all cellular processes. We previously reported mass spectrometry-based proteomics methods that enable profiling of lysines modified by endogenous SUMO or ubiquitin in an unbiased manner, without requiring genetic engineering. Here we investigated the applicability of precursor mass filtering enabled by MaxQuant.Live (MQL) to our SUMO and ubiquitin proteomics workflows, which efficiently avoided sequencing of precursors too small to be modified but otherwise indistinguishable by mass-to-charge ratio. Using peptide mass filtering, we achieved much higher precursor selectivity, ultimately resulting in up to 30% more SUMO and ubiquitin sites identified from replicate samples. Real-time ‘untargeting’ of unmodified peptides by MQL resulted in 90% SUMO-modified precursor selectivity from a 25% pure sample, demonstrating great applicability for digging deeper into ubiquitin-like modificomes. We adapted the mass filtering strategy to the new Exploris 480 mass spectrometer, achieving comparable gains in SUMO precursor selectivity and identification rates. Collectively, mass filtering via MQL significantly increased identification rates of SUMO- and ubiquitin-modified peptides from the exact same samples, without the requirement for prior knowledge or spectral libraries.
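The reason mass filtering can separate precursors that share a mass-to-charge ratio is that the neutral mass depends on the charge state: M = z · (m/z − m_proton). The sketch below illustrates this relationship only; it is not the MaxQuant.Live implementation, and the function names and the 2000 Da threshold are hypothetical values chosen for the example.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton (rounded)


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral (uncharged) precursor mass from m/z and a positive charge state."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS)


def passes_mass_filter(mz: float, charge: int, min_mass: float) -> bool:
    """True if the precursor is heavy enough to be selected for sequencing."""
    return neutral_mass(mz, charge) >= min_mass


# Two precursors at the same m/z but with different charge states.
MIN_MASS = 2000.0  # Da, hypothetical cutoff for illustration only
light = passes_mass_filter(600.0, 2, MIN_MASS)  # about 1198 Da
heavy = passes_mass_filter(600.0, 4, MIN_MASS)  # about 2396 Da
```

Both precursors would look identical to a filter based on m/z alone; only the charge-aware mass calculation rejects the lighter one.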
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Quantitative mass spectrometry-based proteomics has become a high-throughput technology for the identification and quantification of thousands of proteins in complex biological samples. Two frequently used tools, MaxQuant and MSstats, allow for the analysis of raw data and finding proteins with differential abundance between conditions of interest. To enable accessible and reproducible quantitative proteomics analyses in a cloud environment, we have integrated MaxQuant (including TMTpro 16/18plex), Proteomics Quality Control (PTXQC), MSstats, and MSstatsTMT into the open-source Galaxy framework. This enables the web-based analysis of label-free and isobaric labeling proteomics experiments via Galaxy’s graphical user interface on public clouds. MaxQuant and MSstats in Galaxy can be applied in conjunction with thousands of existing Galaxy tools and integrated into standardized, sharable workflows. Galaxy tracks all metadata and intermediate results in analysis histories, which can be shared privately for collaborations or publicly, allowing full reproducibility and transparency of published analysis. To further increase accessibility, we provide detailed hands-on training materials. The integration of MaxQuant and MSstats into the Galaxy framework enables their usage in a reproducible way on accessible large computational infrastructures, hence realizing the foundation for high-throughput proteomics data science for everyone.