While data-dependent acquisition (DDA) is the method of choice for mass spectrometry-based proteomics discovery experiments, data-independent acquisition (DIA) is steadily becoming more important. One of the most important requirements for a DIA analysis is the availability of spectral libraries for peptide identification and quantification. Several studies have already addressed the creation of spectral libraries from DDA analyses and their use for obtaining identifications in DIA measurements, but so far only a few experiments have estimated the effect of these libraries on the quantitative level. In this work, we created a spike-in gold standard dataset with known contents and ratios of proteins in a complex sample matrix. With this dataset, we first created spectral libraries using different sample preparation approaches, with and without sample prefractionation on the peptide and protein level. Two different search engines were used for protein identification. In total, five different spike-in states were compared with DIA analyses, comparing eight different spectral libraries generated by varying approaches and one library-free method, as well as one default DDA analysis. We inspected not only the number of identifications on the peptide and protein level in the spectral libraries and the corresponding analyses, but also the number of expected and identified significant quantifications and their ratios. We found that, while libraries from prefractionated samples are generally larger, the identifications actually obtained are not increased compared to repeated non-fractionated measurements. Furthermore, we show that the accuracy of the quantifications is also highly dependent on the applied spectral library and on whether the peptide or protein level is analysed. Overall, the reproducibility and accuracy of DIA are superior to DDA in all analysed approaches.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Analysis of DDA data searched with MaxQuant default values except ≥2 unique peptides.
State-of-the-art proteomics-grade mass spectrometers can measure peptide precursors and their fragments with ppm mass accuracy at sequencing speeds of tens of peptides per second with attomolar sensitivity. Here we describe a compact and robust quadrupole-orbitrap mass spectrometer equipped with a front-end High Field Asymmetric Waveform Ion Mobility Spectrometry (FAIMS) Interface. The performance of the Orbitrap Exploris 480 mass spectrometer is evaluated in data-dependent acquisition (DDA) and data-independent acquisition (DIA) modes in combination with FAIMS. We demonstrate that different compensation voltages (CVs) for FAIMS are optimal for DDA and DIA, respectively. Combining DIA with FAIMS using single CVs, the instrument surpasses 2500 peptides identified per minute. This enables quantification of >5000 proteins with short online LC gradients delivered by the Evosep One LC system, allowing acquisition of 60 samples per day. The raw sensitivity of the instrument is evaluated by analyzing 5 ng of a HeLa digest, from which >1000 proteins were reproducibly identified with 5-minute LC gradients using DIA-FAIMS. To demonstrate the versatility of the instrument, we recorded an organ-wide map of proteome expression across 12 rat tissues quantified by tandem mass tags and label-free quantification using DIA with FAIMS to a depth of >10,000 proteins.
Mass spectrometry (MS)-based proteomics aims to characterize comprehensive proteomes in a fast and reproducible manner. Here, we present an ultra-fast scanning data-independent acquisition (DIA) strategy consisting of 2-Th precursor isolation windows, dissolving the differences between data-dependent and data-independent methods. This is achieved by pairing a Quadrupole Orbitrap mass spectrometer with the asymmetric track lossless (Astral) analyzer, which provides >200 Hz MS/MS scanning speed, high resolving power and sensitivity, as well as low-ppm mass accuracy. Narrow-window DIA enables profiling of up to 100 full yeast proteomes per day, or ~10,000 human proteins in half an hour. Moreover, multi-shot acquisition of fractionated samples allows comprehensive coverage of human proteomes in ~3 h, showing comparable depth to next-generation RNA sequencing and 10x higher throughput compared to current state-of-the-art MS. High quantitative precision and accuracy are demonstrated with high peptide coverage in a 3-species proteome mixture, quantifying 14,000+ proteins in a single run in half an hour.
Shotgun proteomics is one of the key "omics" methods, and its methodology is developing rapidly. Mass spectrometers of the TimsToF series (Bruker) with ion mobility are one of the young technological platforms for shotgun proteomics, on which both data-dependent acquisition (DDA) and data-independent acquisition (DIA) proteomics methods can be performed. However, only a few comparisons of the effectiveness of DDA and DIA proteomics on TimsToF have been published, carried out mainly on test samples. On the other hand, the peculiarities of osteogenic differentiation of human valve interstitial cells (VICs) are a promising therapeutic target for calcific aortic valve disease (CAVD) treatment. Still, there are no data on whether pathological osteogenic differentiation of VICs is similar to normal ossification. Combining these technical and biological tasks, we performed a comparative proteomics analysis of osteogenic differentiation of human VICs and osteoblasts by DIA- and DDA-PASEF proteomics on TimsToF Pro.
This dataset consists of 44 raw MS files: 27 DIA (SWATH) and 15 DDA runs acquired on a TripleTOF 5600, and two raw mass spectrometry files acquired on a Q Exactive. The composition of the dataset is described in the manuscript by Tsou et al., titled "DIA-Umpire: comprehensive computational framework for data independent acquisition proteomics", Nature Methods, in press. Raw files are deposited here in ProteomeXchange and are associated with the DIA-Umpire processed data. All DIA-Umpire processed results for each sample, together with DDA results, are deposited in separate folders. Also see the "DataSampleID.xlsx" associated with this Readme file. Internal reference from the Gingras lab ProHits implementation: Project 94, Export version VS2 (Tsou_DIA-Umpire).
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Analysis of DDA data searched with MaxQuant default values.
MaxDIA is a universal platform for analyzing data-independent acquisition proteomics data within the MaxQuant software environment. Using spectral libraries, MaxDIA achieves cutting-edge proteome coverage with significantly better coefficients of variation in protein quantification than other software. MaxDIA is equipped with accurate false discovery rate estimates on both library-to-DIA match and protein levels, also when using whole-proteome predicted spectral libraries. This is the foundation of discovery DIA, a framework for the hypothesis-free analysis of DIA samples without a library and with reliable FDR control. MaxDIA performs three- or four-dimensional feature detection of fragment data, and scoring of matches is augmented by machine learning on the features of an identification. MaxDIA's novel bootstrap-DIA workflow performs multiple rounds of matching with increasing quality of recalibration and stringency of matching to the library. Combining MaxDIA with each of two new technologies, BoxCar acquisition and trapped ion mobility spectrometry, leads to deep and accurate proteome quantification.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Analysis of DIA data from 2000 most abundant DDA-identified proteins.
To examine the different mass spectrometry approaches to monitoring kinases after enrichment with desthiobiotinylating probes for activity-based protein profiling (ABPP), two experiments were performed with H1993 lung cancer cells. First, cell lysates were pre-treated with DMSO vehicle, dasatinib, or erlotinib prior to addition of the ATP probe for ABPP to compare the differences in kinase labeling associated with examples of kinase inhibitors that vary in target selectivity. Then, to examine changes in cellular signaling, H1993 cells were treated with vehicle controls, BEX-235 (PI3K inhibitor), or Crizotinib. LC-MS/MS using data dependent acquisition, data-independent acquisition (pSMART), parallel reaction monitoring, and selected reaction monitoring (or multiple reaction monitoring) mass spectrometry were used to detect and relatively quantify the desthiobiotinylated kinase peptides.
Quality control (QC) in mass spectrometry (MS)-based proteomics is mainly based on data-dependent acquisition (DDA) analysis of standard samples. Here, we collected 2638 files acquired by data-independent acquisition (DIA) and paired DDA files from mouse liver digests using 21 mass spectrometers across nine laboratories over 31 months. Our data showed that the DIA-based LC-MS/MS-related consensus QC metric is more sensitive than DDA-based QC in detecting MS status changes. We then optimized 15 DIA-QC metrics and had the quality of the 2638 DIA files generated by the 21 mass spectrometers manually assessed based on each metric. Based on the annotation results, we developed an AI model for DIA-based QC on a training set of 2059 DIA files, and predicted the liquid chromatography (LC) performance with an AUC of 0.91 and the MS performance with an AUC of 0.97 in an independent validation dataset (n = 523). Finally, we developed offline software called iDIA-QC for convenient adoption of this methodology for LC-MS QC.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Data-independent acquisition (DIA) is a promising technique for the proteomic analysis of complex protein samples. A number of studies have claimed that DIA experiments are more reproducible than data-dependent acquisition (DDA), but these claims are unsubstantiated because different data analysis methods are used for the two acquisition methods. Data analysis in most DIA workflows depends on spectral library searches, whereas DDA typically employs sequence database searches. In this study, we examined the reproducibility of the DIA and DDA results using both sequence database and spectral library searches. The comparison was first performed using a cell lysate and then extended to an interactome study. Protein overlap among the technical replicates in both DDA and DIA experiments was 30% higher with library-based identifications than with sequence database identifications. The reproducibility of quantification was also improved with library search compared to database search, with the mean coefficient of variation decreasing by more than 30% and the number of missing values reduced by more than 35%. Our results show that, regardless of the acquisition method, higher identification and quantification reproducibility is observed when library search is used.
In dendritic cells (DCs), the MHC II eluted immunopeptidome reflects the antigenic composition of the microenvironment. Proteins are transported and processed into peptides in endosomal MHC II compartments through autophagy or phagocytosis; extracellular peptides can also directly bind MHC II proteins at the cell surface. Altogether, these mechanisms allow DCs to sample both the intracellular and extracellular environment. With the increase in mass spectrometry sensitivity and accuracy, we can now finally tackle important questions on the nature and plasticity of the MHC-II immunopeptidome in health and disease. Presented epitopes, neoepitopes, and PTM-modified epitopes can be quantitatively and qualitatively analyzed to provide a comprehensive picture of the role of DCs in immunosurveillance. To determine whether redox metabolic conditions induce an altered spectrum of presented peptides, we eluted immunoaffinity-purified I-Ab from conventional dendritic cells isolated from control B6 or obese Ob/Ob mice, and analyzed MHC-II-associated peptides by LC/MS/MS using combined data-dependent acquisition (DDA) and data-independent acquisition (DIA) approaches. We analyzed the DIA data by employing a reference spectral library consisting of all peptides identified by database matching in the pool of spectra from the combined DDA dataset, thus allowing direct label-free quantitation of relative abundances between the two sample categories. The quantitative analysis of the I-Ab-eluted immunopeptidomes pinpoints important differences in peptide presentation and epitope selection in obese mice.
We generated two comprehensive large-scale proteomics datasets with deliberate batch effects using the latest parallel accumulation-serial fragmentation in both Data-Dependent and Data-Independent Acquisition modes. These datasets contain a balanced two-class design (cell lines: A549 vs K562), allowing investigation of mixed effects from class, batch and acquisition method. Investigators can also compare and integrate DDA and DIA platforms, delve into the various patterns and mechanisms of missing values, benchmark batch effect correction algorithms and assess confounding between different technical issues.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The promise of data-independent acquisition (DIA) strategies is a comprehensive and reproducible digital qualitative and quantitative record of the proteins present in a sample. We developed a fast and robust DIA method for comprehensive mapping of the urinary proteome that enables large-scale urine proteomics studies. Compared to data-dependent acquisition (DDA) experiments, our DIA assay doubled the number of identified peptides and proteins per sample at half the coefficients of variation observed for DDA data (DIA = ∼8%; DDA = ∼16%). We also tested different spectral libraries and their effects on overall protein and peptide identifications and their reproducibility, which provided clear evidence that sample type-specific spectral libraries are preferred for reliable data analysis. To show applicability for biomarker discovery experiments, we analyzed a set of 87 urine samples from children seen in the emergency department with abdominal pain. The whole set was analyzed with high proteome coverage (∼1300 proteins/sample) in less than 4 days. The data set revealed excellent biomarker candidates for ovarian cyst and urinary tract infection. The improved throughput and quantitative performance of our optimized DIA workflow allow for the efficient simultaneous discovery and verification of biomarker candidates without the requirement for an early bias toward selected proteins.
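The coefficients of variation quoted above (about 8% for DIA versus about 16% for DDA) are the kind of statistic that is typically computed per protein across replicate runs and then summarized. The following is a minimal sketch of that calculation, not the authors' pipeline: the function names, the table layout (protein mapped to a list of replicate intensities, with `None` for missing values) and the use of the median as the summary are all assumptions made for illustration.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(intensities):
    """Return the CV in percent for one protein's replicate intensities.

    Missing values (None) are ignored. Returns None when fewer than two
    values remain or the mean is zero, because the CV is undefined there.
    """
    values = [v for v in intensities if v is not None]
    if len(values) < 2:
        return None
    m = mean(values)
    if m == 0:
        return None
    # Sample standard deviation divided by the mean, expressed in percent.
    return 100.0 * stdev(values) / m


def median_cv(table):
    """Summarize a {protein: [replicate intensities]} table by its median CV."""
    cvs = [coefficient_of_variation(row) for row in table.values()]
    cvs = [cv for cv in cvs if cv is not None]
    return median(cvs) if cvs else None


# Hypothetical toy data: three proteins, three replicate runs each.
example = {
    "P1": [90.0, 100.0, 110.0],
    "P2": [100.0, 100.0, 100.0],
    "P3": [50.0, None, None],  # too few values, excluded from the summary
}
summary = median_cv(example)
```

Running the same summary separately on DIA and DDA quantification tables would give two numbers that can be compared in the way the abstract does. Whether a study reports the mean or the median CV, and on raw or log-transformed intensities, varies and should be checked against its methods section.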
Source code and example dataset for LipidMS v3.0.3: a commercially available pooled human serum sample was analyzed in positive and negative detection modes using MS1, DIA and DDA approaches. The obtained datasets were processed using LipidMS v3.0, MS-DIAL v4.80 or a combination of data pre-processing in XCMS v3.16 and lipid annotation in LipidMS v3.0. This repository contains:
- Raw data for positive and negative polarities using MS scan, DIA and DDA acquisition modes.
- R scripts for processing with LipidMS v3.0.3 and XCMS v3.16.1, and parameters used for processing with MS-DIAL v4.80.
- Source code for LipidMS v3.0.3.
- Results obtained for the three software tools employed.
- Tutorials for the LipidMS R package and online application.
- Human pooled serum analysis:
  - Raw data for positive and negative polarities using MS scan, DIA and DDA acquisition modes for a human pooled serum sample with or without the addition of 68 lipid standards.
  - Results for the data processing and annotation of the lipid standards using LipidMS 3.0, XCMS 3.16 and MS-DIAL 4.80.
  - Results for the manual curation of the total lipid annotations provided by both LipidMS 3.0 and MS-DIAL 4.80.
The integration of proteomic datasets generated by non-cooperating laboratories using different LC-MS/MS setups can overcome limitations in statistically underpowered sample cohorts, but has not been demonstrated to this day. In proteomics, differences in sample preservation and preparation strategies, chromatography and mass spectrometry approaches, and the quantification strategy used distort protein abundance distributions in integrated datasets. The removal of these technical batch effects requires setup-specific normalization and strategies that can deal with missing at random (MAR) and missing not at random (MNAR) type values at the same time. Algorithms for batch effect removal that are commonly used for other omics types, such as the ComBat algorithm, disregard proteins with MNAR missing values and significantly reduce the informational yield and the effect size for combined datasets. Here, we present a strategy for data harmonization across different tissue preservation techniques, LC-MS/MS instrumentation setups and quantification approaches. To enable batch effect removal without the need for data reduction or error-prone imputation, we developed an extension to the ComBat algorithm, ComBat HarmonizR, that performs data harmonization with appropriate handling of MAR and MNAR missing values by matrix dissection. The ComBat HarmonizR-based strategy enables the combined analysis of independently generated proteomic datasets for the first time. Furthermore, we found ComBat HarmonizR to be superior for removing batch effects between different Tandem Mass Tag (TMT)-plexes compared to the commonly used internal reference scaling (iRS). Because the matrix dissection approach does not require data imputation, the HarmonizR algorithm can be applied to any type of -omics data while ensuring minimal data loss.
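The matrix dissection idea mentioned above can be pictured as splitting the protein-by-sample matrix into sub-matrices whose proteins share the same set of batches with usable data, so that each sub-matrix can be batch-corrected without imputing the batches where a protein is absent. The sketch below illustrates only that grouping step under stated assumptions; it is not the HarmonizR implementation. The function name, the nested-dictionary data layout and the `min_values` threshold of two observed values per batch are hypothetical choices for illustration.

```python
from collections import defaultdict


def dissect_by_batch_presence(matrix, batch_of_sample, min_values=2):
    """Group proteins by the set of batches in which they are sufficiently observed.

    matrix: {protein: {sample: value or None}}
    batch_of_sample: {sample: batch label}
    min_values: assumed minimum number of observed values a protein needs
        in a batch for that batch to count as present.

    Returns {frozenset(batch labels): [proteins]}. Each group corresponds to
    one sub-matrix that could be passed to a batch correction method using
    only the batches in its key.
    """
    groups = defaultdict(list)
    for protein, row in matrix.items():
        observed = defaultdict(int)
        for sample, value in row.items():
            if value is not None:
                observed[batch_of_sample[sample]] += 1
        present = frozenset(b for b, n in observed.items() if n >= min_values)
        groups[present].append(protein)
    return dict(groups)


# Hypothetical toy data: two batches (A, B) with two samples each.
batches = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}
toy = {
    "P1": {"s1": 1.0, "s2": 1.1, "s3": 2.0, "s4": 2.1},    # present in A and B
    "P2": {"s1": 3.0, "s2": 3.2, "s3": None, "s4": None},  # present in A only
    "P3": {"s1": 4.0, "s2": None, "s3": 5.0, "s4": 5.1},   # too sparse in A
}
groups = dissect_by_batch_presence(toy, batches)
```

Proteins whose key contains at least two batches are the ones a batch correction step could act on; proteins seen in a single batch have no cross-batch effect to estimate and would be carried through unchanged in this sketch.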
Opportunistic infections of the respiratory tract often succeed under a weakened immune response caused by an underlying illness or hospitalization. The human fungal pathogen Cryptococcus neoformans and the bacterial pathogen Klebsiella pneumoniae are both well-characterized microbes that cause severe infections in immunocompromised individuals. In this study, we simulate a concentration-dependent pulmonary co-infection with a bacterial and a fungal pathogen, and profile the proteomic changes by DDA vs. DIA. Dual-perspective profiling provides new insights into host defense regulation of infection and pathogenic mechanisms of invasion.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
mProphet features for model describing DIA peptide data for 2000 most abundant DDA-identified proteins.