While data-dependent acquisition (DDA) is the method of choice for mass spectrometry-based proteomics discovery experiments, data-independent acquisition (DIA) is steadily becoming more important. One of the most important requirements for a DIA analysis is the availability of spectral libraries for peptide identification and quantification. Several studies have already addressed the creation of spectral libraries from DDA analyses and their use for obtaining identifications in DIA measurements, but so far only a few experiments have been conducted to estimate the effect of these libraries on the quantitative level. In this work, we created a spike-in gold standard dataset with known contents and ratios of proteins in a complex sample matrix. With this dataset, we first created spectral libraries using different sample preparation approaches, with and without sample prefractionation on the peptide and protein level. Two different search engines were used for protein identification. In total, five different spike-in states were compared with DIA analyses, comparing eight different spectral libraries generated by varying approaches and one library-free method, as well as one default DDA analysis. We inspected not only the number of identifications on the peptide and protein level in the spectral libraries and the corresponding analyses, but also the number of expected and identified significant quantifications and their ratios. We found that, while libraries of prefractionated samples are generally larger, the identifications actually obtained are not increased compared to repeated non-fractionated measurements. Furthermore, we show that the accuracy of the quantifications is highly dependent on the applied spectral library and on whether the peptide or protein level is analysed. Overall, the reproducibility and accuracy of DIA are superior to DDA in all analysed approaches.
Public Domain Mark 1.0 https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
Data for DIA-MS analyses referred to in the manuscript, "Longitudinal Analysis of the Lung Proteome Reveals Persistent Repair Months after Mild to Moderate COVID-19".
Proteomic workflows generate vastly complex peptide mixtures that are analyzed by liquid chromatography–tandem mass spectrometry, creating thousands of spectra, most of which are chimeric and contain fragment ions from more than one peptide. Because of differences in data acquisition strategies such as data-dependent, data-independent or parallel reaction monitoring, separate software packages employing different analysis concepts are used for peptide identification and quantification, even though the underlying information is principally the same. Here, we introduce CHIMERYS, a spectrum-centric search algorithm designed for the deconvolution of chimeric spectra that unifies proteomic data analysis. Using accurate predictions of peptide retention time, fragment ion intensities and applying regularized linear regression, it explains as much fragment ion intensity as possible with as few peptides as possible. Together with rigorous false discovery rate control, CHIMERYS accurately identifies and quantifies multiple peptides per tandem mass spectrum in data-dependent, data-independent or parallel reaction monitoring experiments.
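The idea of explaining as much fragment ion intensity as possible with as few peptides as possible can be illustrated with a small non-negative, L1-regularized regression. The following is only a minimal sketch under stated assumptions, not the CHIMERYS implementation: the function name `deconvolve`, the penalty `lam`, the coordinate-descent solver and all intensity values are hypothetical and chosen for illustration.

```python
# Sketch: deconvolve a chimeric spectrum as a non-negative lasso problem.
# Minimizes 0.5 * ||observed - sum_j coef_j * template_j||^2 + lam * sum_j coef_j
# subject to coef_j >= 0, using plain coordinate descent.
# All names and numbers are illustrative assumptions.

def deconvolve(observed, templates, lam=0.05, n_iter=100):
    coefs = [0.0] * len(templates)
    residual = list(observed)  # observed minus the current fitted spectrum
    for _ in range(n_iter):
        for j, tpl in enumerate(templates):
            norm_sq = sum(t * t for t in tpl)
            if norm_sq == 0.0:
                continue
            # Correlation of template j with the residual, with its own
            # current contribution added back in.
            rho = sum(t * (r + coefs[j] * t) for t, r in zip(tpl, residual))
            # Soft-thresholding with a non-negativity constraint.
            new_coef = max(0.0, (rho - lam) / norm_sq)
            delta = new_coef - coefs[j]
            if delta != 0.0:
                residual = [r - delta * t for r, t in zip(residual, tpl)]
                coefs[j] = new_coef
    return coefs

# Predicted fragment intensities for three candidate peptides (made-up values),
# each given over the same four fragment m/z bins.
pep_a = [1.0, 0.5, 0.0, 0.0]
pep_b = [0.0, 0.0, 1.0, 0.8]
pep_c = [0.0, 0.5, 0.5, 0.0]  # candidate that is not actually present

# Simulated chimeric spectrum: 2 x pep_a + 1 x pep_b.
observed = [2 * a + b for a, b in zip(pep_a, pep_b)]

coefs = deconvolve(observed, [pep_a, pep_b, pep_c])
print(coefs)
```

In this toy case the penalty shrinks the two true contributions slightly and sets the coefficient of the absent candidate to exactly zero, which is the sparsity behaviour the abstract describes.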
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Proteomics research today no longer simply seeks exhaustive protein identification; increasingly, it is also desirable to obtain robust, large-scale quantitative information. To accomplish this, data-independent acquisition (DIA) has emerged as a promising strategy largely owing to developments in advanced mass spectrometers and sophisticated data analysis methods. Nevertheless, the highly complex multiplexed MS/MS spectra produced by DIA remain challenging to interpret. Here, we present a novel strategy to analyze DIA data, based on unambiguous precursor mass assignment through the mPE-MMR (multiplexed post-experimental monoisotopic mass refinement) procedure and combined with complementary multistage database searching. Compared to conventional spectral library searching, the accuracy and sensitivity of peptide identification were significantly increased by incorporating precise precursor masses in DIA data. We demonstrate identification of additional peptides absent from spectral libraries, including sample-specific mutated peptides and post-translationally modified peptides using MS-GF+ and MODa/MODi multistage database searching. This first use of unambiguously determined precursor masses to mine DIA data demonstrates considerable potential for further exploitation of this type of experimental data.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Data-independent acquisition (DIA) with shortened data acquisition times is becoming a method of choice for quantitative proteomic applications requiring high-throughput analysis of large cohorts of samples. With the advent of the combination of high-resolution mass spectrometry with an asymmetric track lossless analyzer, these DIA capabilities were further extended with the recent demonstration of quantitative analyses at a speed of up to hundreds of samples per day. In particular, the proteomic data for the brain samples related to multiple system atrophy disease were acquired using 7 and 28 min chromatography gradients (Guzman et al., Nat. Biotech. 2024). In this work, we applied the recently introduced DirectMS1 method to reanalyze these data using only MS1 spectra. Both DirectMS1 and DIA results were matched against long-gradient DDA analysis from the earlier study of the same sample cohort. While the quantitation efficiency of DirectMS1 was comparable with DIA on the same data sets, we found an additional five proteins of biological significance relevant to the analyzed tissue samples. Among the findings, DirectMS1 was able to detect decreased caspase activity for Vimentin protein in the multiple system atrophy samples, which was missed by the MS/MS-based quantitation methods. Our study suggests that DirectMS1 can be an efficient MS1-only addition to the analysis of DIA data in high-throughput quantitative proteomic studies.
Proteome analysis by data-independent acquisition (DIA) has become a powerful approach to obtain deep proteome coverage, and has gained recent traction for label-free analysis of single cells. However, optimal experimental design for DIA-based single-cell proteomics has not been fully explored, and performance metrics of subsequent data analysis tools remain to be evaluated. Therefore, we here present DIA-ME, a data analysis strategy that exploits the co-analysis of low-input samples with a so-called matching enhancer (ME) of higher input, to increase sensitivity, proteome coverage, and data completeness. We evaluate the matching specificity of DIA-ME by a two-proteome model, and demonstrate that false discovery and false transfers are maintained at low levels when using DIA-NN software, while preserving quantification accuracy. We apply DIA-ME to investigate the proteome response of U-2 OS cells to interferon gamma (IFN-γ) in single cells, and recapitulate the time-resolved induction of IFN-γ response proteins as observed in bulk material. Moreover, we observe co- and anti-correlating patterns of protein expression within the same cell, indicating mutually exclusive protein modules and the co-existence of different cell states. Collectively our data show that DIA-ME is a powerful, scalable, and easy-to-implement strategy for single-cell proteomics.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The uploaded files serve as a concise but meaningful training data set in the Galaxy training network (https://galaxyproject.github.io/training-material/).
HEK and E. coli cell pellets were lysed with 5 % SDS, 50 mM triethylammonium bicarbonate (TEAB), pH 7.55. The obtained protein extracts were reduced by adding TCEP to a final concentration of 5 mM and alkylated by adding iodoacetamide to a final concentration of 10 mM. Protein digestion and purification were performed on S-Trap columns. To ensure protein binding to the S-Trap columns, samples were acidified to a final concentration of 1.2 % phosphoric acid (~ pH 2). Six times the sample volume of S-Trap buffer (90 % aqueous methanol containing a final concentration of 100 mM TEAB, pH 7.1) was added to the samples, which were then loaded on the columns and washed with S-Trap buffer. Protein digestion was performed with trypsin and LysC for one hour at 47 °C. Peptides were eluted in three steps with (1) 50 mM TEAB, (2) 0.2 % aqueous formic acid and (3) 50 % acetonitrile containing 0.2 % formic acid. Eluted peptides of HEK and E. coli were mixed in two different ratios, and four replicates of each spike-in ratio were measured and analysed using OpenSwathWorkflow in Galaxy. Results were exported using PyProphet and can be used for the statistical analysis and detection of the two different spike-in ratios. The spike-in ratios were the following:
Sample      HEK   E. coli
Spike_in_1  2.5   0.15
Spike_in_2  2.5   0.80
Besides the two PyProphet export files, we uploaded a sample annotation file as well as a comparison matrix file. Additionally, we uploaded the Galaxy MSstats training result files: MSstats_ComparisonResult_export_tabular and MSstats_ComparisonResult_msstats_input.
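The spike-in amounts listed above imply the fold changes that a statistical comparison of the two conditions should recover. The following minimal sketch computes them; it assumes that the amounts within each column share the same (unstated) unit, and the helper name `expected_log2fc` is hypothetical.

```python
import math

# Spike-in amounts as given in the table above (units are not stated in the
# source; only the ratio between the two conditions is used here).
spike_in = {
    "Spike_in_1": {"HEK": 2.5, "E. coli": 0.15},
    "Spike_in_2": {"HEK": 2.5, "E. coli": 0.80},
}

def expected_log2fc(numerator, denominator):
    """Expected log2 fold change per species for numerator vs. denominator."""
    return {
        species: math.log2(numerator[species] / denominator[species])
        for species in numerator
    }

expected = expected_log2fc(spike_in["Spike_in_2"], spike_in["Spike_in_1"])
for species, log2fc in expected.items():
    print(f"{species}: expected log2FC = {log2fc:.3f}")
```

HEK proteins are expected to be unchanged (log2 fold change of 0), whereas E. coli proteins are expected to differ by a factor of 0.80 / 0.15, which is roughly 5.3-fold or a log2 fold change of about 2.4.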
Untargeted data independent acquisition proteomics analysis using Orbitrap mass spectrometers and DIA-Umpire
Liquid chromatography coupled to tandem mass spectrometry has become the main method for high-throughput identification and quantification of peptides and the inferred proteins. Discovery proteomics commonly employs data-dependent acquisition in combination with spectrum-centric analysis. The accumulation of data generated from thousands of samples by this method has approached saturation coverage of different proteomes. Recently, as a result of technological advances, methods based on data acquisition strategies compatible with peptide-centric scoring have also reached similar proteome coverage in individual runs, and scalability. This is exemplified by SWATH-MS, which combines data-independent acquisition (DIA) with targeted data extraction of groups of transitions uniquely detecting a peptide. As the data matrices generated by these experiments continue to grow with respect to both the number of peptides identified per sample and the number of samples analyzed per study, challenges for error rate control have emerged. Here, we discuss the adaptation of statistical concepts developed for discovery proteomics based on spectrum-centric scoring to large-scale DIA experiments analyzed with peptide-centric scoring strategies, and provide some guidance on their application. We propose that, in order to increase the quality and reproducibility of published proteomic results, well-established confidence criteria should be reported at each level as we progress from spectral evidence to identified or detected peptides and inferred proteins. These confidence criteria should equally be applied to proteomic analyses based on spectrum- and peptide-centric scoring strategies.
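As an illustration of what a reported confidence criterion can look like at a single level, the sketch below estimates q-values from target and decoy scores. The abstract does not prescribe this procedure; the target-decoy strategy, the function name `q_values` and the example scores are assumptions chosen for illustration, and ties between scores are not treated specially.

```python
# Sketch: q-value estimation with a simple target-decoy approach.
# Higher scores are assumed to indicate better matches.

def q_values(scores, is_decoy):
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fdr_at_rank = []
    targets = decoys = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR when accepting everything down to this score.
        fdr_at_rank.append(decoys / max(targets, 1))
    # The q-value is the lowest FDR at which an item would still be accepted,
    # i.e. the running minimum taken from the worst score upwards.
    q = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdr_at_rank[rank])
        q[order[rank]] = min(running_min, 1.0)
    return q

# Made-up example: six scored peak groups, two of which are decoys.
scores = [9.1, 8.7, 8.2, 7.9, 7.5, 7.0]
is_decoy = [False, False, False, True, False, True]
qvals = q_values(scores, is_decoy)
print(qvals)
```

The same calculation could be repeated separately on peak-group, peptide and protein scores, in line with the recommendation to report confidence at each level.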
Comprehensive, reproducible and precise analysis of large sample cohorts is one of the key objectives of quantitative proteomics. Here, we present an implementation of data-independent acquisition using its parallel acquisition nature that surpasses the limitation of serial MS2 acquisition of data-dependent acquisition on a quadrupole ultra-high field Orbitrap mass spectrometer. In deep single shot data-independent acquisition, we identified and quantified 6,383 proteins in human cell lines using 2-or-more peptides/protein and over 7,100 proteins when including the 717 proteins that were identified on the basis of a single peptide sequence. 7,739 proteins were identified in mouse tissues using 2-or-more peptides/protein and 8,121 when including the 382 proteins that were identified on the basis of a single peptide sequence. Missing values for proteins ranged from 0.3 to 2.1%, and median coefficients of variation ranged from 4.7 to 6.2% among technical triplicates. In very complex mixtures, we could quantify 10,780 proteins and 12,192 proteins when including the 1,412 proteins that were identified on the basis of a single peptide sequence. Using this optimized DIA, we investigated large-protein networks before and after the critical period for whisker experience-induced synaptic strength in the murine somatosensory cortex 1 barrel field. This work shows that parallel mass spectrometry enables proteome profiling for discovery with high coverage, reproducibility, precision and scalability.
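The two reproducibility metrics quoted above, the percentage of missing values and the median coefficient of variation (CV) across technical triplicates, can be computed as in the following minimal sketch. The intensity values are made up, the function names are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not state how the CV was calculated.

```python
import statistics

def cv_percent(values):
    """Coefficient of variation in percent, ignoring missing values (None)."""
    observed = [v for v in values if v is not None]
    if len(observed) < 2:
        return None
    return 100.0 * statistics.stdev(observed) / statistics.mean(observed)

def missing_percent(intensities):
    """Percentage of missing entries over all proteins and replicates."""
    cells = [v for values in intensities.values() for v in values]
    return 100.0 * sum(v is None for v in cells) / len(cells)

# Made-up protein intensities for three technical replicates.
intensities = {
    "P1": [100.0, 110.0, 90.0],
    "P2": [50.0, None, 52.0],
    "P3": [200.0, 200.0, 200.0],
}

cvs = [cv for cv in map(cv_percent, intensities.values()) if cv is not None]
median_cv = statistics.median(cvs)
missing = missing_percent(intensities)
print(f"median CV = {median_cv:.2f} %, missing values = {missing:.2f} %")
```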
In this study, we investigated the benefits of adding high-field asymmetric ion mobility spectrometry (FAIMS) separation prior to data-dependent acquisition (DDA) and gas-phase fractionation (GPF) prior to data-independent acquisition (DIA) LC-MS/MS analysis. Native digestion followed by LC-MS/MS with FAIMS allowed the identification of 221 host cell proteins (HCPs), of which 158 were reliably quantified, for a global amount of 880 ng/mg of NIST mAb Reference Material. Our methods have also been applied to commercial drug products (DPs) and demonstrate their ability to dig deeper into the HCP landscape, with the identification of 60 and 67 HCPs and accurate quantification of 29 and 31 of these impurities in nivolumab and trastuzumab, respectively, with sensitivity down to the sub-ng/mg of mAb level.
Rheumatoid arthritis (RA) is a systemic autoimmune and inflammatory disease. Plasma biomarkers are critical for understanding disease mechanisms, treatment effects, and diagnosis. Mass spectrometry-based proteomics is a powerful tool for unbiased biomarker discovery. However, plasma proteomics is significantly hampered by signal interference from high-abundance proteins, low overall protein coverage, and high levels of missing data from data-dependent acquisition (DDA). To achieve quantitative proteomic analysis for plasma samples with a balance of throughput, performance, and cost, we developed a workflow incorporating plate-based high abundance protein depletion and sample preparation, comprehensive peptide spectral library building, and data-independent acquisition (DIA) SWATH mass spectrometry-based methodology. In this study, we analyzed plasma samples from both RA patients and healthy donors. The results showed that the new workflow performance exceeded that of the current state-of-the-art depletion-based plasma proteomic platforms in terms of both data quality and proteome coverage. Proteins from biological processes related to the activation of systemic inflammation, suppression of platelet function, and loss of muscle mass were enriched and differentially expressed in RA. Some plasma proteins, particularly acute phase reactant proteins, showed great power to distinguish between RA patients and healthy donors. Moreover, protein isoforms in the plasma were also analyzed, providing even deeper proteome coverage. This workflow can serve as a basis for further application in discovering plasma biomarkers of other diseases.
Initial discovery-based analysis using data independent acquisition (DIA) can obtain deep proteome coverage with high data completeness; however, the development of targeted PRM assays based on subsequent bioinformatic predictions can be tedious and time-consuming because of the complexity of the output. We address this limitation with a Python script that rapidly generates a PRM method for the TIMS-QTOF platform using DIA data and a user-defined target list.
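The general idea of deriving PRM targets from DIA results and a user-defined target list can be sketched as follows. This is not the script described above: the column names, the retention time window, the function name `build_prm_targets` and all values are hypothetical, and the output is a generic list of records rather than any vendor-specific method format.

```python
import csv
import io

# Made-up stand-in for a DIA precursor report. Every column name and value
# here is hypothetical and not taken from any specific software export.
DIA_REPORT = """protein,peptide,charge,mz,rt_min
PROT_A,PEPTIDEK,2,500.25,20.0
PROT_A,SAMPLER,2,410.70,35.0
PROT_B,ANOTHERK,3,333.50,12.0
PROT_C,IGNOREDK,2,450.10,48.0
"""

def build_prm_targets(report_text, target_proteins, rt_window_min=2.0):
    """Keep precursors of the requested proteins and attach an RT window."""
    half = rt_window_min / 2.0
    targets = []
    for row in csv.DictReader(io.StringIO(report_text)):
        if row["protein"] not in target_proteins:
            continue
        rt = float(row["rt_min"])
        targets.append({
            "protein": row["protein"],
            "peptide": row["peptide"],
            "charge": int(row["charge"]),
            "mz": float(row["mz"]),
            "rt_start_min": rt - half,
            "rt_end_min": rt + half,
        })
    # Sort by scheduled start time so the list follows the gradient.
    targets.sort(key=lambda t: t["rt_start_min"])
    return targets

prm_targets = build_prm_targets(DIA_REPORT, {"PROT_A", "PROT_B"})
for target in prm_targets:
    print(target)
```

A real method would additionally need platform-specific fields, such as ion mobility ranges on a TIMS-QTOF instrument, which are left out here because their format is not described in the source.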
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Targeted analysis of data-independent acquisition (DIA) mass spectrometry data requires elegant software tools and strict statistical control. OpenSWATH-PyProphet-TRIC is a widely used DIA data analysis workflow that is typically executed from the command line. Here, we present QuantPipe, a graphical user interface software tool based on the OpenSWATH-PyProphet-TRIC workflow. In addition to the OpenSWATH-PyProphet-TRIC functions, QuantPipe can convert a spectral library to an assay library and output peptide and protein intensities. We demonstrated that QuantPipe can be used to analyze SWATH-MS data from TripleTOF 5600 and TripleTOF 6600 instruments, phospho-SWATH-MS data, DIA data from an Orbitrap instrument, and diaPASEF data from a timsTOF Pro instrument. The executable files, user manual, and source code of QuantPipe are freely available at https://github.com/tachengxmu/QuantPipe/releases.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Data-independent acquisition mass spectrometry (DIA-MS) coupled with liquid chromatography is a promising approach for rapid, automatic sampling of MS/MS data in untargeted metabolomics. However, wide isolation windows in DIA-MS generate MS/MS spectra containing a mixed population of fragment ions together with their precursor ions. This precursor-fragment ion map in a comprehensive MS/MS spectral library is crucial for relative quantification of fragment ions uniquely representative of each precursor ion. However, existing reference libraries are not sufficient for this purpose, since the fragmentation patterns of small molecules can vary between instrument setups. Here we developed a bioinformatics workflow called MetaboDIA to build customized MS/MS spectral libraries using a user's own data-dependent acquisition (DDA) data and to perform MS/MS-based quantification with DIA data, thus complementing conventional MS1-based quantification. MetaboDIA also allows users to build a spectral library directly from DIA data in studies with a large sample size. Using a marine algae data set, we show that quantification of fragment ions extracted with a customized MS/MS library can provide quantitative data as reliable as the direct quantification of precursor ions based on MS1 data. To test its applicability in complex samples, we applied MetaboDIA to a clinical serum metabolomics data set, where we built a DDA-based spectral library containing consensus spectra for 1829 compounds. Using this library, we performed fragment ion quantification on DIA data, yielding sensitive differential expression analysis.
Alzheimer’s disease (AD) is the most common neurodegenerative disorder in the human population, for which there is currently no cure. The cause of AD is unknown; however, the toxic effects of amyloid-β (Aβ) are believed to play a role in its onset. To investigate this, we examined changes in global protein levels in a hippocampal synaptosome fraction of the APP/PS1 mouse model of AD at 6 and 12 months of age (moa). Data-independent acquisition (DIA), or SWATH, was used for a quantitative label-free proteomics analysis. We first assessed the usefulness of a recently improved directDIA workflow as an alternative to conventional DIA data analysis using a project-specific spectral library. Subsequently, we applied directDIA to the 6- and 12-moa APP/PS1 datasets and applied the Mass Spectrometry Downstream Analysis Pipeline (MS-DAP) for differential expression analysis and candidate discovery. We observed most regulation at 12 moa, in particular of proteins involved in Aβ homeostasis and microglial-dependent synaptic pruning and/or immune response, such as APOE, CLU and C1QA-C.
Data-independent acquisition (DIA) is a mass spectrometry-based method to reliably identify and reproducibly quantify large fractions of a target proteome. The peptide-centric data analysis strategy employed in DIA requires a priori generated spectral assay libraries. Such assay libraries allow quantitative data to be extracted in a targeted approach and have been generated for human, mouse, zebrafish, E. coli and a few other organisms. However, a spectral assay library for the extreme halophilic archaeon Halobacterium salinarum NRC-1, a model organism that contributed to several notable discoveries, is not yet publicly available. Here, we report a comprehensive spectral assay library to measure 2,563 of 2,646 annotated H. salinarum NRC-1 proteins. We demonstrate the utility of this library by measuring global protein abundances over time under standard growth conditions. The H. salinarum NRC-1 library includes 21,074 distinct peptides representing 97% of the predicted proteome and provides a new, valuable resource to confidently measure and quantify any protein of this archaeon.
Raw data from the nLC-MS/MS analysis of microproteomic samples on rat brain cerebellum sagittal section. Raw data from the nLC-MS/MS analysis of microproteomic samples on rat brain horizontal section.
https://coinunited.io/terms
Detailed price prediction analysis for DIA on Jul 22, 2025, including bearish case ($0.384), base case ($0.409), and bullish case ($0.436) scenarios with Buy trading signal based on technical analysis and market sentiment indicators.