A data repository for proteomic data sets. The ProteomeXchange consortium, as a whole, aims to provide coordinated submission of MS proteomics data to the main existing proteomics repositories and to encourage optimal data dissemination. ProteomeXchange provides access to a number of public databases, and users can access and submit data sets to the consortium's PRIDE database and PASSEL/PeptideAtlas.
ProteomeXchange provides a single point of submission of mass spectrometry (MS) proteomics data for the main existing proteomics repositories and encourages data exchange between them for optimal data dissemination.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The cerebrospinal fluid (CSF) proteome data set presented herein was obtained after immunodepletion of abundant proteins and off-gel electrophoresis fractionation of a commercial pool of normal human CSF; liquid chromatography tandem mass spectrometry analysis was performed with a linear ion trap-Orbitrap Elite. We report the identification of 12 344 peptides mapping on 2281 proteins. In the context of the Chromosome-centric Human Proteome Project (C-HPP), the existence of seven missing proteins is proposed to be validated. This data set is available to the ProteomeXchange Consortium (http://www.proteomexchange.org/) with the data set identifier PXD008029.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
For the C-HPP consortium, dark proteins include not only uPE1, but also missing proteins (MPs, PE2–4), smORFs, proteins from lncRNAs, and products from uncharacterized transcripts. Here, we investigated the expression of dark proteins in the human testis by combining public mRNA and protein expression data for several tissues and performing LC–MS/MS analysis of testis protein extracts. Most uncharacterized proteins are highly expressed in the testis. Thirty could be identified in our data set, of which two were selected for further analyses: (1) A0AOU1RQG5, a putative cancer/testis antigen specifically expressed in the testis, where it accumulates in the cytoplasm of elongated spermatids; and (2) PNMA6E, which is enriched in the testis, where it is found in the germ cell nuclei during most stages of spermatogenesis. Both proteins are coded on Chromosome X. Finally, we studied the expression of other dark proteins, uPE1 and MPs, in a series of human tissues. Most were highly expressed in the testis at both the mRNA and protein levels. The testis appears to be a relevant organ to study the dark proteome, which may have a function related to spermatogenesis and germ cell differentiation. The mass spectrometry proteomics data have been deposited with the ProteomeXchange Consortium under the data set identifier PXD009598.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Background: Distinct hippocampal subfields are known to be affected during aging, psychiatric disorders, and various neurological and neurodegenerative conditions. To understand the biological processes associated with each subfield, it is important to understand its heterogeneity at the molecular level. To address this lacuna, we performed a proteomic analysis of hippocampal subfields, namely the cornu ammonis sectors (CA1, CA2, CA3, CA4) and the dentate gyrus (DG), from healthy adult human cohorts. Findings: Microdissection of hippocampal subfields from archived formalin-fixed paraffin-embedded tissue sections followed by TMT-based multiplexed proteomic analysis resulted in the identification of 5,593 proteins. Of these, 890 proteins were found to be differentially abundant among the subfields. Further bioinformatics analysis suggested that proteins related to gene splicing, transportation, myelination, structural activity, and learning processes are differentially abundant in DG, CA4, CA3, CA2, and CA1, respectively. A subset of proteins was selected for immunohistochemistry-based validation in an independent set of hippocampal samples. Conclusions: We believe that our findings will pave the way for further analysis of the hippocampal subdivisions and provide awareness of their subfield-specific association with various neurofunctional anomalies in the future. The mass spectrometry data have been deposited and made publicly available through the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD029697.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data to accompany figure S2
Proteomics of FRC-derived matrices
In vitro FRC cell line-derived matrices generated after 5 days in culture were subjected to proteomic analysis by mass spectrometry. Summary data in xls spreadsheet. SWATH dataset uploaded to PRIDE repository ID=PXD015816 http://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD015816
Abstract
Lymph nodes (LNs) work as filtering organs, constantly sampling peripheral cues. This is facilitated by the conduit network, a parenchymal tubular-like structure formed of bundles of aligned extracellular matrix (ECM) fibrils ensheathed by fibroblastic reticular cells (FRCs). LNs undergo 5-fold expansion with every adaptive immune response and yet these ECM-rich structures are not permanently damaged. Whether conduit integrity and filtering functions are affected during cycles of LN expansion and resolution is not known. Here we show that the conduit structure is disrupted during acute LN expansion but FRC-FRC contacts remain intact. In homeostasis, polarised FRCs adhere to the underlying substrate to deposit ECM basolaterally. ECM production by FRCs is regulated by the C-type lectin CLEC-2, expressed by dendritic cells (DCs), at transcriptional and secretory levels. Inflamed LNs maintain conduit size-exclusion, but flow becomes leaky, which allows soluble antigens to reach more antigen-presenting cells. We show how dynamic communication between peripheral tissues and LNs changes during immune responses, and describe a mechanism that enables LNs to prevent inflammation-induced fibrosis.
Highlights
FRCs use polarized microtubule networks to guide matrix deposition
CLEC-2/PDPN controls matrix production at transcriptional and post-transcriptional levels
FRCs halt matrix production and decouple from conduits during acute LN expansion
Conduits leak soluble antigen during acute LN expansion
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Milk-derived exosomes have been reported to be involved in many biological processes. Exosomes derived from mammary glands have not yet been described, and their relationship with mammary gland lactation and the origin of milk-derived exosomes are largely unclear. The present study aimed to investigate the proteome of exosomes derived from bovine mammary epithelial cells (BMECs) and compare them with milk-derived exosomes in the database. BMEC-derived exosomes were successfully separated from the culture supernatant of BMECs by a combined ultracentrifugation approach, and the purity of the exosomes was confirmed by western blot analysis. Liquid chromatography with tandem mass spectrometry identified 638 proteins in BMEC-derived exosomes. The MS data were deposited into the public repository ProteomeXchange, dataset identifier(s): https://www.iprox.org/page/PSV023.html;?url=1590961453176tKpa. Gene Ontology annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed that these proteins were associated with specific biological processes and molecular functions of metabolism. Cross comparison of these proteins with the protein database of milk exosomes showed that 77 common expressed proteins (CEPs) were present in both BMEC- and milk-derived exosomes. The KEGG pathway analysis for these CEPs showed that they were mainly involved in signaling pathways associated with milk biosynthesis in BMECs. Among these CEPs, six proteins have been previously reported to be associated with the lactation function. Western blot analysis showed that expression of these six proteins in BMEC-derived exosomes increased after stimulation of BMECs with methionine and β-estradiol. In summary, the proteome of BMEC-derived exosomes reveals that they are associated with milk biosynthesis in BMECs and might be a source of milk-derived exosomes.
This article contains consolidated proteomic data obtained from xylem sap collected from tomato plants grown in Fe- and Mn-sufficient control, as well as Fe-deficient and Mn-deficient conditions. Data presented here cover proteins identified and quantified by shotgun proteomics and Progenesis LC-MS analyses: proteins identified with at least two peptides and showing changes between treatments that are statistically significant (ANOVA; p ≤ 0.05) and above a selected, biologically relevant threshold (fold ≥ 2) are listed. The comparison between Fe-deficient, Mn-deficient and control xylem sap samples using a multivariate statistical data analysis (Principal Component Analysis, PCA) is also included. Data included in this article are discussed in depth in "Effects of Fe and Mn deficiencies on the protein profiles of tomato (Solanum lycopersicum) xylem sap as revealed by shotgun analyses", Ceballos-Laita et al., J. Proteomics, 2018. This dataset is made available to support the cited study as well as to extend analyses at a later stage. Resources in this dataset: Resource Title: ProteomeExchange submission PXD007517. Xylem sap shotgun proteomics from Fe- and Mn-deficient and Mn-toxic tomato plants. File Name: Web Page, url: http://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD007517 The MS proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD007517. Also includes FTP location. Files available at https://www.ebi.ac.uk/pride/archive/projects/PXD007517 via HTML, FTP, or Fast (Aspera) download: 1 SEARCH.xml file, 1 Peak file, 24 RAW files, 1 Mascot information.xlsx file. Supplementary data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.dib.2018.01.034
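The selection criteria quoted above (at least two peptides, ANOVA p ≤ 0.05, fold ≥ 2) can be illustrated with a minimal Python sketch. The `ProteinResult` record, its field names, and the choice to treat a 2-fold decrease the same as a 2-fold increase are assumptions made for illustration; they are not taken from the Progenesis export or from the cited study.

```python
from dataclasses import dataclass


@dataclass
class ProteinResult:
    """Hypothetical record for one quantified protein (field names are assumed)."""
    accession: str
    n_peptides: int      # number of peptides supporting the identification
    anova_p: float       # ANOVA p-value across treatments
    fold_change: float   # treatment / control abundance ratio, must be > 0


def passes_thresholds(p, min_peptides=2, max_p=0.05, min_fold=2.0):
    """Return True if the protein meets all three criteria from the description."""
    if p.fold_change <= 0:
        return False  # a ratio must be positive to be interpretable
    # Assumption: a 2-fold decrease (ratio <= 0.5) counts like a 2-fold increase.
    magnitude = max(p.fold_change, 1.0 / p.fold_change)
    return (
        p.n_peptides >= min_peptides
        and p.anova_p <= max_p
        and magnitude >= min_fold
    )


def select_changed(proteins):
    """Keep only proteins that pass every threshold."""
    return [p for p in proteins if passes_thresholds(p)]


example = [
    ProteinResult("P1", n_peptides=3, anova_p=0.01, fold_change=2.5),   # kept
    ProteinResult("P2", n_peptides=1, anova_p=0.01, fold_change=3.0),   # too few peptides
    ProteinResult("P3", n_peptides=4, anova_p=0.20, fold_change=4.0),   # not significant
    ProteinResult("P4", n_peptides=2, anova_p=0.05, fold_change=0.4),   # kept (2.5-fold decrease)
    ProteinResult("P5", n_peptides=5, anova_p=0.001, fold_change=1.5),  # change too small
]
selected = [p.accession for p in select_changed(example)]
```

The accessions `P1` to `P5` are made-up placeholders, not entries from the deposited data.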
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
DATA for "A learned score function improves the power of mass spectrometry database search"
These data files are associated with the following publication:
Varun Ananth, Justin Sanders, Melih Yilmaz, Sewoong Oh and William Stafford Noble. "A learned score function improves the power of mass spectrometry database search". Bioinformatics (Proceedings of the ISMB). 2024.
For the benchmarking data, we used a dataset that is publicly available on ProteomeXchange (PXD028735). The paper that introduced this dataset is:
Van Puyvelde, B., Daled, S., Willems, S., Gabriels, R., Gonzalez de Peredo, A., Chaoui, K., Mouton-Barbosa, E., Bouyssié, D., Boonen, K., Hughes, C. J., Gethings, L. A., Perez-Riverol, Y., Bloomfield, N., Tate, S., Schiltz, O., Martens, L., Deforce, D., & Dhaenens, M. (2022). A comprehensive LFQ benchmark dataset on modern day acquisition strategies in proteomics. In Scientific Data (Vol. 9, Issue 1). Springer Science and Business Media LLC. https://doi.org/10.1038/s41597-022-01216-6
More specifically, the following .raw files were downloaded:
LFQ_Orbitrap_DDA_Ecoli_01.raw
LFQ_Orbitrap_DDA_Human_01.raw
LFQ_Orbitrap_DDA_Yeast_01.raw
Those files can be accessed via FTP here.
We upload here the annotated .mgf files created from these .raw files, as described in our paper.
The human, yeast, and E. coli .fasta files used in all database searches were downloaded from UniProt on 11/6/23, 4:30 PM.
Bateman, A., Martin, M.-J., Orchard, S., Magrane, M., Ahmad, S., Alpi, E., Bowler-Barnett, E. H., Britto, R., Bye-A-Jee, H., Cukura, A., Denny, P., Dogan, T., Ebenezer, T., Fan, J., Garmiri, P., da Costa Gonzales, L. J., Hatton-Ellis, E., Hussein, A., … Zhang, J. (2022). UniProt: the Universal Protein Knowledgebase in 2023. In Nucleic Acids Research (Vol. 51, Issue D1, pp. D523–D531). Oxford University Press (OUP). https://doi.org/10.1093/nar/gkac1052
We include these files here, with only minor modifications to replace U amino acids with X so that all amino acids fall into Casanovo-DB's vocabulary.
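The U-to-X substitution described above could be done along the following lines. This is only a sketch of one way to do it, not the authors' script: the function name is hypothetical, and it assumes a standard FASTA layout in which header lines start with `>` and must be left untouched.

```python
def replace_residue_in_fasta(fasta_text: str, old: str = "U", new: str = "X") -> str:
    """Replace one residue letter in sequence lines only, leaving headers as-is."""
    out_lines = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Header line: may legitimately contain the letter U, so keep it verbatim.
            out_lines.append(line)
        else:
            out_lines.append(line.replace(old, new))
    result = "\n".join(out_lines)
    # Preserve a trailing newline if the input had one.
    if fasta_text.endswith("\n"):
        result += "\n"
    return result


# Made-up two-record example; the header deliberately contains a 'U'.
example_fasta = ">sp|X00001|EXAMPLE_HUMAN Made-Up protein\nMKUAGU\nLLU\n>sp|X00002|OTHER\nMAAA\n"
cleaned = replace_residue_in_fasta(example_fasta)
```

Processing line by line keeps the headers intact, which matters because UniProt headers often contain the letter U in entry names or descriptions.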
This dataset is a label-free quantitation of proteins in milk and dry secretions from the end of lactation through day 21 of the dry period using liquid chromatography with tandem mass spectrometry (LC-MS/MS). The data supplied in this article support the accompanying publication entitled “Characterization of bovine mammary gland dry secretions and their proteome from the end of lactation through day 21 of the dry period”. The Thermo mass spectrometry raw files and MaxQuant files have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset number PXD017837. Resources in this dataset: Resource Title: Characterization of Bovine Dry Secretions and their Proteome from the End of Lactation Through Day 21 of the Dry Period - ProteomeXchange Consortium via the PRIDE partner repository, Project PXD017837. File Name: Web Page, url: https://www.ebi.ac.uk/pride/archive/projects/PXD017837 Thermo raw file code for PRIDE raw files and supplemental Excel files. The 3 technical replicates are denoted by the letters A, B and C. The number following is the cow identification number for the 11 cows used. The final two-digit number after the underscore is the day sampled, where _01 = day 1, _03 = day 3, _10 = day 10 and _21 = day 21 of the dry period. For example, A1313_01 is technical replicate A for cow 1313 collected on day 1. B1313_03 is technical replicate B for cow 1313 collected on day 3. Details of sample and data processing protocols are provided.
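The file naming scheme described above (replicate letter, cow ID, underscore, two-digit day) can be decoded with a short Python sketch. The function name and the returned tuple layout are assumptions for illustration; only the code structure itself (for example `A1313_01`) comes from the description.

```python
import re

# Replicate letter A/B/C, cow identification number, underscore, two-digit day.
SAMPLE_CODE = re.compile(r"^([ABC])(\d+)_(\d{2})$")


def parse_sample_code(code: str):
    """Split a code such as 'A1313_01' into (replicate, cow_id, day)."""
    match = SAMPLE_CODE.match(code)
    if match is None:
        raise ValueError(f"Unrecognised sample code: {code!r}")
    replicate, cow_id, day = match.groups()
    return replicate, int(cow_id), int(day)


first = parse_sample_code("A1313_01")   # technical replicate A, cow 1313, day 1
second = parse_sample_code("B1313_03")  # technical replicate B, cow 1313, day 3
```

If the raw files carry an extension or a prefix, the code would need to be stripped to this core pattern first; that detail is not specified in the description.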
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Improved identification and quantification of peptides in mass spectrometry data via chemical and random additive noise elimination (CRANE)
Availability and implementation
The software is available on GitHub (https://github.com/CMRI-ProCan/CRANE). The datasets were obtained from ProteomeXchange (identifiers PXD002952 and PXD008651). Preliminary data and intermediate files are available via ProteomeXchange (identifiers PXD020529 and PXD025103).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mass spectrometry analysis (data-independent acquisition) derived intensities are reported here for all breast tumor samples (n = 75). RAW data files for these samples are accessible via ProteomeXchange with the dataset identifiers PXD032266 (S samples) and PXD037428 (V samples). Protein intensities were Log2 transformed and scaled (samples and proteins).
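The description does not say how the intensities were scaled, so the following Python sketch is only one plausible reading: log2 transformation followed by z-scoring of each sample (column) and then each protein (row). The z-score choice, the order of the two scaling steps, and the function names are all assumptions; the sketch also assumes strictly positive intensities with no missing values.

```python
import math
from statistics import mean, pstdev


def zscore(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        return [0.0 for _ in values]  # constant vector: nothing to scale
    return [(v - m) / s for v in values]


def log2_and_scale(matrix):
    """matrix: list of rows (proteins), each a list of sample intensities (> 0)."""
    logged = [[math.log2(v) for v in row] for row in matrix]
    # Assumed step 1: scale each sample, i.e. each column.
    scaled_columns = [zscore(list(col)) for col in zip(*logged)]
    by_sample = [list(row) for row in zip(*scaled_columns)]
    # Assumed step 2: scale each protein, i.e. each row.
    return [zscore(row) for row in by_sample]


# Made-up intensities for three proteins in three samples.
toy = [
    [1.0, 2.0, 4.0],
    [8.0, 8.0, 16.0],
    [32.0, 128.0, 64.0],
]
scaled = log2_and_scale(toy)
```

Note that after the second step only the rows are exactly standardized; other pipelines may scale in the opposite order or use median centering instead.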
This dataset was used for Figure 5 in the following manuscript: "Proteogenomics decodes the evolution of human ipsilateral breast cancer". De Marchi T, Pyl PT, Sjöström M, Reinsbach SE, DiLorenzo S, Nystedt B, Tran L, Pekar G, Wärnberg F, Fredriksson I, Malmström P, Fernö M, Malmström L, Malmström J, Nimèus E. accepted for publication
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Tandem mass tags (TMT) are widely used in proteomics to simultaneously quantify multiple samples in a single experiment. The tags can be easily added to the primary amines of peptides/proteins through chemical reactions. In addition to amines, TMT reagents also partially react with the hydroxyl groups of serine, threonine, and tyrosine residues under alkaline conditions, which significantly compromises the analytical sensitivity and precision. Under alkaline conditions, reducing the TMT molar excess can partially mitigate overlabeling of histidine-free peptides, but has a limited effect on peptides containing histidine and hydroxyl groups. Here, we present a method under acidic conditions to suppress overlabeling while efficiently labeling amines, using only one-fifth of the TMT amount recommended by the manufacturer. In a deep-scale analysis of a yeast/human two-proteome sample, we systematically evaluated our method against the manufacturer’s method and a previously reported TMT-reduced method. Our method reduced overlabeled peptides by 9-fold and 6-fold, respectively, resulting in the substantial enhancement in peptide/protein identification rates. More importantly, the quantitative accuracy and precision were improved as overlabeling was reduced, endowing our method with greater statistical power to detect 42% and 12% more statistically significant yeast proteins compared to the standard and TMT-reduced methods, respectively. Mass spectrometric data have been deposited in the ProteomeXchange Consortium via the iProX partner repository with the data set identifier PXD047052.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The repository contains three mzML and four imzML mass spectrometry datasets.
The mzML data are compiled in a single directory 'mzML' and zipped:
The imzML mass spectrometry imaging data are zipped individually:
All these datasets are publicly available from different repositories; however, if you reuse them, please attribute the original authors.
Dataset contains raw and preprocessed data for fluorescence and proteomic studies, respectively. In each case, protein foldedness was probed using thiol reactivity. The raw mass spectrometry proteomics data have also been deposited to the ProteomeXchange Consortium via the PRIDE partner repository, with the dataset identifier PXD033152.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Protein-Protein, Genetic, and Chemical Interactions for Wolfgeher D (2015): The dynamic interactome of human Aha1 upon Y223 phosphorylation. Curated by BioGRID (https://thebiogrid.org); ABSTRACT: Heat Shock Protein 90 (Hsp90) is an essential chaperone that supports the function of a wide range of signaling molecules. Hsp90 binds to a suite of co-chaperone proteins that regulate Hsp90 function through alteration of intrinsic ATPase activity. Several studies have determined Aha1 to be an important co-chaperone whose binding to Hsp90 is modulated by phosphorylation, acetylation and SUMOylation of Hsp90 [1], [2]. In this study, we applied quantitative affinity-purification mass spectrometry (AP-MS) proteomics to understand how phosphorylation of hAha1 at Y223 altered global client/co-chaperone interaction [3]. Specifically, we characterized and compared the interactomes of Aha1-Y223F (phospho-mutant form) and Aha1-Y223E (phospho-mimic form). We identified 99 statistically significant interactors of hAha1, a high proportion of which (84%) demonstrated preferential binding to the phospho-mimic form of hAha1. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository [4] with the dataset identifier PXD001737.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset includes total spectral counts for proteins and peptides. Four files and a link to raw data at the domain repository are included:
1) Sample metadata file with station locations, depth, time of collection and sample IDs described by this BCO-DMO page. (original file name: DeepDOM_sample_metadata_for_OPP.csv)
2) Raw mass spectral data files are available on PRIDE and ProteomeXchange:
ProteomeXchange title: Microbial Metaproteome from the Western Atlantic Ocean DeepDOM KN210-04 Expedition
ProteomeXchange accession: PXD034035
Project Webpage: http://www.ebi.ac.uk/pride/archive/projects/PXD034035
FTP Download: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2022/05/PXD034035
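The access links listed in these records follow recognisable patterns, which the Python sketch below reproduces for a given accession. The function names are hypothetical, and the URL templates are inferred only from the links that appear in this listing; they are not an official API. The FTP path includes a year and month that cannot be derived from the accession, so they are passed in explicitly. Datasets hosted by other partner repositories (for example iProX) will not have a PRIDE page.

```python
import re

# ProteomeXchange accessions in this listing have the form PXD plus six digits.
PXD_PATTERN = re.compile(r"^PXD\d{6}$")


def _check(accession: str) -> str:
    if not PXD_PATTERN.match(accession):
        raise ValueError(f"Not a PXD accession: {accession!r}")
    return accession


def proteomecentral_url(accession: str) -> str:
    """ProteomeCentral landing page, following the pattern seen in the listing."""
    return f"http://proteomecentral.proteomexchange.org/cgi/GetDataset?ID={_check(accession)}"


def pride_project_url(accession: str) -> str:
    """PRIDE Archive project page (only meaningful for PRIDE-hosted datasets)."""
    return f"https://www.ebi.ac.uk/pride/archive/projects/{_check(accession)}"


def pride_ftp_url(accession: str, year: int, month: int) -> str:
    """PRIDE FTP directory; year and month must be supplied by the caller."""
    return (
        "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/"
        f"{year:04d}/{month:02d}/{_check(accession)}"
    )


ftp = pride_ftp_url("PXD034035", 2022, 5)
page = pride_project_url("PXD034035")
```

Validating the accession before building a URL catches typos early, such as a missing digit or a lowercase prefix.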
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The MS proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD007517. Also includes FTP location. Files available at https://www.ebi.ac.uk/pride/archive/projects/PXD007517 via HTML, FTP, or Fast (Aspera) download: 1 SEARCH.xml file, 1 Peak file, 24 RAW files, 1 Mascot information.xlsx file.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Precursor intensity-based label-free quantification software tools for proteomic and multi-omic analysis within the Galaxy Platform.
ABRF: Data were generated through the collaborative work of the ABRF Proteomics Research Group (https://abrf.org/research-group/proteomics-research-group-prg). See reference for details: Van Riper, S. et al., ‘An ABRF-PRG study: Identification of low abundance proteins in a highly complex protein sample’, presented at the 64th Annual Conference of American Society of Mass Spectrometry and Allied Topics, San Antonio, TX.
UPS: MaxLFQ Cox J, Hein MY, Luber CA, Paron I, Nagaraj N, Mann M. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol Cell Proteomics. 2014 Sep;13(9):2513-26. doi: 10.1074/mcp.M113.031591. Epub 2014 Jun 17. PubMed PMID: 24942700; PubMed Central PMCID: PMC4159666;
PRIDE #5412; ProteomeXchange repository PXD000279: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2014/09/PXD000279
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Giardia duodenalis is a species complex of gastrointestinal protists, with Assemblages A and B infective to humans. To date, post-genomic proteomics are largely derived from Assemblage A, biasing understanding of parasite biology. To address this gap, we quantitatively analysed the proteomes of trophozoites from the genome reference and two clinical Assemblage B isolates, revealing lower spectrum-to-peptide matches in non-reference isolates, resulting in significant losses in peptide and protein identifications, and indicating significant intra-assemblage variation. We also explored differential protein expression between in vitro cultured subpopulations putatively enriched for dividing and feeding cells, respectively. These data are an important proteomic baseline for Assemblage B, highlighting proteomic differences between physiological states, and unique differences relative to Assemblage A. The complete raw files and search results can be accessed via the ProteomeXchange Consortium (Vizcaino et al., 2013) via the PRIDE partner repository with the dataset identifier PXD007943.