Mass spectrometry imaging (MSI) experiments result in complex multi-dimensional datasets, which require specialist data analysis tools. Here we have developed massPix, an R package for analysing and interpreting data from MSI of lipids in tissue. MassPix is an open-source tool for the analysis and statistical interpretation of MSI data, and is particularly useful for lipidomics applications. MassPix produces single ion images, performs multivariate statistics and provides putative lipid annotations based on accurate mass matching against generated lipid libraries. Classification of tissue regions with high spectral similarity can be carried out by principal components analysis (PCA) or k-means clustering. Mouse cerebellum was analysed using matrix-assisted laser desorption ionisation (MALDI) MSI. The resulting MSI dataset forms the test data for massPix.
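The accurate mass matching described above can be illustrated with a minimal Python sketch. It is not massPix code (massPix is an R package whose internals are not given here); the function names, the 5 ppm tolerance, and the library entries `lipid_A` and `lipid_B` with their m/z values are hypothetical placeholders chosen for illustration.

```python
from bisect import bisect_left, bisect_right


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of an observed m/z relative to a theoretical m/z, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def annotate_peaks(observed_mzs, library, tolerance_ppm=5.0):
    """Return putative annotations for each observed m/z.

    library: list of (theoretical_mz, name) tuples (hypothetical entries).
    The result maps each observed m/z to a list of (name, ppm_error) hits.
    """
    entries = sorted(library)
    masses = [mass for mass, _ in entries]
    tol = tolerance_ppm * 1e-6
    annotations = {}
    for mz in observed_mzs:
        # Theoretical masses t satisfying |mz - t| / t <= tol lie in this window.
        lo = bisect_left(masses, mz / (1 + tol))
        hi = bisect_right(masses, mz / (1 - tol))
        annotations[mz] = [
            (name, round(ppm_error(mz, mass), 2)) for mass, name in entries[lo:hi]
        ]
    return annotations


if __name__ == "__main__":
    # Placeholder library; real libraries would be generated per lipid class and adduct.
    demo_library = [(760.5851, "lipid_A"), (734.5694, "lipid_B")]
    print(annotate_peaks([760.5880, 734.6000], demo_library))
```

Sorting the library once and using binary search keeps each lookup cheap, which matters when every pixel spectrum contributes many peaks.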
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Replication data for the publication: "rSIREM: an R package for MALDI spectral deconvolution" by Del Castillo Pérez et al. The deposited data are SALDI-MSI data of three consecutive thin tissue sections from mouse cerebellum measured at different mass resolutions on the same instrument (MALDI-MSI: Spectroglyph Injector - Orbitrap Exploris). The paper describes a new R package (rSIREM) to computationally improve the mass resolution of an MSI dataset post-measurement. The developed R package (https://github.com/EdelCastillo/rSirem) applies a statistical treatment to the concentration of the spatial images obtained by considering each m/z value separately over all pixels. A representative scalar is associated with each image, obtained by applying to it a new measure (SIREM) derived from Shannon's entropy. Perturbations of this measure across a sequence of consecutive images reveal peak overlap where it is present. This information serves as a seed to initialize the EM algorithm in the Gaussian Mixture Model context. The efficiency of the method has been verified using three independent procedures.
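To make the idea of "one scalar per ion image" concrete, the following Python sketch computes the plain Shannon entropy of an image's normalized intensities. This is an assumption-laden illustration only: the SIREM measure is described as derived from Shannon's entropy, but its exact definition is not given here, so the function below is not SIREM and not rSIREM code.

```python
import math


def image_entropy(intensities):
    """Shannon entropy (in bits) of an ion image given as a flat list of pixel intensities.

    Intensities are normalized to sum to 1 and treated as a probability
    distribution over pixels; zero-intensity pixels contribute nothing.
    """
    total = sum(intensities)
    if total <= 0:
        return 0.0
    entropy = 0.0
    for value in intensities:
        if value > 0:
            p = value / total
            entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    # Signal spread evenly over four pixels versus concentrated in one pixel.
    print(image_entropy([1, 1, 1, 1]))
    print(image_entropy([0, 0, 5, 0]))
```

Computing such a scalar for the image at each consecutive m/z bin gives a one-dimensional profile whose irregularities could then be inspected, which is the general pattern the description outlines.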
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Chemical contamination is one of the major obstacles for mechanical recycling of plastics. In this article, we built and open-sourced an in-house MS/MS library containing more than 500 plastic-related chemicals and developed mspcompiler, an R package, for the compilation of various libraries. We then proposed a workflow to process untargeted screening data acquired by liquid chromatography high-resolution mass spectrometry. These tools were subsequently applied to data originating from recycled high-density polyethylene (rHDPE) obtained from milk bottles. A total of 83 compounds were identified, with 66 easily annotated by making use of our in-house MS/MS libraries and the mspcompiler R package. In silico fragmentation combined with data obtained from gas chromatography–mass spectrometry and lists of chemicals related to plastics were used to identify the remaining unknowns. A pseudo-multiple reaction monitoring method was also applied to sensitively target and screen the identified chemicals in the samples. Quantification results demonstrated that good sorting of postconsumer materials and better recycling technology may be necessary for food contact applications. Removal or reduction of non-volatile substances, such as octocrylene and 2-ethylhexyl-4-methoxycinnamate, is still challenging but vital for the safe use of rHDPE as food contact materials.
https://www.nist.gov/open/license
The NIST DART-MS Forensics Database is an evaluated collection of in-source collisionally-induced dissociation (is-CID) mass spectra of compounds of interest to the forensics community (e.g. seized drugs, cutting agents, etc.). The is-CID mass spectra were collected using Direct Analysis in Real-Time (DART) Mass Spectrometry (MS), either by NIST scientists or by contributing agencies noted per compound. The database is provided as a general-purpose structure data file (.SDF). For users on Windows operating systems, the .SDF format library can be converted to NIST MS Search format using Lib2NIST and then explored using NIST MS Search v2.4 for general mass spectral analysis. These software tools can be downloaded at https://chemdata.nist.gov. The database is now (09-28-2021) also provided in R data format (.RDS) for use with the R programming language. This database, also commonly referred to as a library, is one in a series of high-quality mass spectral libraries/databases produced by NIST (see NIST SRD 1a, https://dx.doi.org/10.18434/T4H594).
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
A system for creating a library of tandem mass spectra annotated with corresponding peptide sequences was described. This system was based on the annotated spectra currently available in the Global Proteome Machine Database (GPMDB). The library spectra were created by averaging together spectra that were annotated with the same peptide sequence, sequence modifications, and parent ion charge. The library was constructed so that experimental peptide tandem mass spectra could be compared with those in the library, resulting in a peptide sequence identification based on scoring the similarity of the experimental spectrum with the contents of the library. A software implementation that performs this type of library search was constructed and successfully used to obtain sequence identifications. The annotated tandem mass spectrum libraries for the Homo sapiens, Mus musculus, and Saccharomyces cerevisiae proteomes and search software were made available for download and use by other groups. Keywords: peptide spectrum library • X! Hunter • GPM • GPMDB • protein identification
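The two steps described above, averaging spectra that share an annotation into one library spectrum and scoring an experimental spectrum against the library, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: unit-width m/z binning, unit-length normalization before averaging, and cosine similarity as the score are illustrative choices, not the scoring used by X! Hunter or GPMDB, which is not specified here.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum (mz, intensity) peaks into fixed-width m/z bins keyed by bin index."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return dict(binned)


def normalise(binned):
    """Scale a binned spectrum to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in binned.values()))
    if norm == 0:
        return dict(binned)
    return {k: v / norm for k, v in binned.items()}


def consensus_spectrum(spectra, bin_width=1.0):
    """Average several spectra annotated with the same peptide, modifications and charge."""
    total = defaultdict(float)
    for peaks in spectra:
        for bin_index, value in normalise(bin_spectrum(peaks, bin_width)).items():
            total[bin_index] += value
    return {k: v / len(spectra) for k, v in total.items()}


def cosine_similarity(a, b):
    """Cosine similarity between two binned spectra (1.0 means identical shape)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    replicates = [[(100.2, 10.0), (200.7, 5.0)], [(100.4, 20.0), (200.1, 10.0)]]
    library_entry = consensus_spectrum(replicates)
    query = bin_spectrum([(100.3, 4.0), (200.5, 2.0)])
    print(cosine_similarity(query, library_entry))
```

Normalizing each replicate before averaging keeps a single intense acquisition from dominating the library spectrum.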
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
surficialDL: A geomorphology deep learning dataset of alluvium and thick glacial till derived from 1:24,000 scale surficial geology data for the western portion of Massachusetts, USA
scripts.zip
arcgisTools.atbx: terrainDerivatives: make terrain derivatives from digital terrain model (Band 1 = TPI (50 m radius circle), Band 2 = square root of slope, Band 3 = TPI (annulus), Band 4 = hillshade, Band 5 = multidirectional hillshades, Band 6 = slopeshade). rasterizeFeatures: convert vector polygons to raster masks (1 = feature, 0 = background).
makeChips.R: R function to break terrain derivatives and chips into image chips of a defined size. makeTerrainDerivatives.R: R function to generate 6-band terrain derivatives from digital terrain data (same as ArcGIS Pro tool). merge_logs.R: R script to merge training logs into a single file. predictToExtents.ipynb: Python notebook to use trained model to predict to new data. trainExperiments.ipynb: Python notebook used to train semantic segmentation models using PyTorch and the Segmentation Models package. assessmentExperiments.ipynb: Python code to generate assessment metrics using PyTorch and the torchmetrics library. graphs_results.R: R code to make graphs with ggplot2 to summarize results. makeChipsList.R: R code to generate lists of chips in a directory. makeMasks.R: R function to make raster masks from vector data (same as rasterizeFeatures ArcGIS Pro tool).
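Breaking a raster into image chips of a defined size amounts to sliding a fixed window over the grid. The Python sketch below shows that idea on a plain nested list. It is not a port of makeChips.R, whose behavior is not documented here; the function names, the 256-pixel default, the optional stride, and the choice to drop partial chips at the edges are all assumptions.

```python
def chip_windows(n_rows, n_cols, chip_size=256, stride=None):
    """List (row_start, col_start, row_end, col_end) windows of chip_size x chip_size.

    stride defaults to chip_size (non-overlapping chips). Windows that would
    extend past the raster edge are skipped rather than padded.
    """
    stride = stride or chip_size
    windows = []
    for row in range(0, n_rows - chip_size + 1, stride):
        for col in range(0, n_cols - chip_size + 1, stride):
            windows.append((row, col, row + chip_size, col + chip_size))
    return windows


def extract_chip(grid, window):
    """Cut one chip out of a raster stored as a list of rows."""
    row_start, col_start, row_end, col_end = window
    return [row[col_start:col_end] for row in grid[row_start:row_end]]


if __name__ == "__main__":
    # A 600 x 520 raster yields four complete 256 x 256 chips.
    print(chip_windows(600, 520, chip_size=256))
```

Applying the same windows to the terrain derivative raster and to its mask keeps each image chip aligned with its label chip.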
surficialDL
The digital terrain model associated with these data/project is available here: https://s3.us-east-1.amazonaws.com/download.massgis.digital.mass.gov/lidar/LIDAR_DEM_32BIT_FP.gdb.zip.
alluvDL: polygons (vectors folder) and extents (extents folder) for alluvium features separated into training, validation, and testing partitions. These data were derived from the 1:24,000 scale Massachusetts Surficial Geology dataset: https://www.mass.gov/info-details/massgis-data-usgs-124000-surficial-geology.
tillDL: polygons (vector folder) and extents (extents folder) for thick till features separated into training, validation, and testing partitions. These data were derived from the 1:24,000 scale Massachusetts Surficial Geology dataset: https://www.mass.gov/info-details/massgis-data-usgs-124000-surficial-geology.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Data-independent acquisition-mass spectrometry (DIA-MS) is the method of choice for deep, consistent, and accurate single-shot profiling in bottom-up proteomics. While classic workflows for targeted quantification from DIA-MS data require auxiliary data-dependent acquisition (DDA) MS analysis of subject samples to derive prior-knowledge spectral libraries, library-free approaches based on in silico prediction promise deep DIA-MS profiling with reduced experimental effort and cost. Coverage and sensitivity in such analyses are however limited, in part, by the large library size and persistent deviations from the experimental data. We present MSLibrarian, a new workflow and tool to obtain optimized predicted spectral libraries by the integrated usage of spectrum-centric DIA data interpretation via the DIA-Umpire approach to inform and calibrate the in silico predicted library and analysis approach. Predicted-vs-observed comparisons enabled optimization of intensity prediction parameters, calibration of retention time prediction for deviating chromatographic setups, and optimization of the library scope and sample representativeness. Benchmarking via a dedicated ground-truth-embedded experiment of species-mixed proteins and quantitative ratio-validation confirmed gains of up to 13% on peptide and 8% on protein level at equivalent FDR control and validation criteria. MSLibrarian is made available as an open-source R software package, including step-by-step user instructions, at https://github.com/MarcIsak/MSLibrarian.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Produces Tables 2 to 4 and Figs 5–8. In addition to the packages listed in S1 File, the following packages are required: MASS [73], sandwich [74], and stats4 (https://cran.r-project.org/package=stats4). Directions and package versions used for publication are in comments. (R)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Only female mosquitoes consume blood, giving them the opportunity to transmit deadly human pathogens. Therefore, it is critical to remove females before conducting releases for genetic biocontrol interventions. Here we describe a robust sex-sorting approach termed SEPARATOR (Sexing Element Produced by Alternative RNA-splicing of A Transgenic Observable Reporter) that exploits sex-specific alternative splicing of an innocuous reporter to ensure exclusive dominant male-specific expression. Using SEPARATOR, we demonstrate reliable sex selection from early larval and pupal stages in Aedes aegypti, and use a Complex Object Parametric Analyzer and Sorter (COPAS) to demonstrate scalable high-throughput sex-selection of first instar larvae. Additionally, we use this approach to sequence the transcriptomes of early larval males and females and find several genes that are sex-specifically expressed. SEPARATOR can simplify mass production of males for release programs, is designed to be cross-species portable, and should be instrumental for genetic biocontrol interventions.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
In mass spectrometry-based lipidomics, complex lipid mixtures undergo chromatographic separation, are ionized, and are detected using tandem MS (MSn) to simultaneously quantify and structurally characterize eluting species. The reported structural granularity of these identified lipids is strongly reliant on the analytical techniques leveraged in a study. For example, lipid identifications from traditional collisionally activated data-dependent acquisition experiments are often reported at either species level or molecular species level. Structural resolution of reported lipid identifications is routinely enhanced by integrating both positive and negative mode analyses, requiring two separate runs or polarity switching during a single analysis. MS3+ can further elucidate lipid structure, but the lengthened MS duty cycle can negatively impact analysis depth. Recently, functionality has been introduced on several Orbitrap Tribrid mass spectrometry platforms to identify eluting molecular species on-the-fly. These real-time identifications can be leveraged to trigger downstream MSn to improve structural characterization with lessened impacts on analysis depth. Here, we describe a novel lipidomics real-time library search (RTLS) approach, which utilizes the lipid class of real-time identifications to trigger class-targeted MSn and to improve the structural characterization of phosphatidylcholines, phosphatidylethanolamines, phosphatidylinositols, phosphatidylglycerols, phosphatidylserines, and sphingomyelins in the positive ion mode. Our class-based RTLS method demonstrates improved selectivity compared to the current methodology of triggering MSn in the presence of characteristic ions or neutral losses.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The test statistics are: Trend, the Cochran-Armitage trend test; Chi-Square, Pearson's chi-square test; LRT, the likelihood ratio test for the proportional odds model computed by using the polr function in the R package MASS; Score, the proposed score statistic computed by using the SNPass.test function in the R package iGasso. Simulated type I error rate in the case of a univariate phenotype under various generating models.
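Of the statistics listed, the Cochran-Armitage trend test is simple enough to sketch directly. The Python function below implements the standard textbook form for a 2 x k table with a normal approximation for the two-sided p-value. It is an illustrative sketch, not the code used for the simulations; the default scores 0, 1, 2, ... and the example counts are assumptions.

```python
import math


def cochran_armitage_trend(cases, totals, scores=None):
    """Cochran-Armitage test for a linear trend in proportions across ordered groups.

    cases:  number of cases in each group
    totals: group sizes (cases plus controls)
    scores: ordinal scores per group; defaults to 0, 1, 2, ... (an assumption)
    Returns (z, two_sided_p_value) using the normal approximation.
    """
    if scores is None:
        scores = list(range(len(totals)))
    n_total = sum(totals)
    p_bar = sum(cases) / n_total  # pooled case proportion under the null
    # Score-weighted sum of observed minus expected case counts.
    t_stat = sum(s * (r - n * p_bar) for s, r, n in zip(scores, cases, totals))
    sum_ns = sum(n * s for s, n in zip(scores, totals))
    sum_ns2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (sum_ns2 - sum_ns ** 2 / n_total)
    if variance <= 0:
        return 0.0, 1.0
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: cases among 100 subjects in each of three genotype groups.
    print(cochran_armitage_trend([10, 20, 30], [100, 100, 100]))
```

A type I error rate can be estimated from such a function by simulating many tables under the null and recording the fraction of p-values below the nominal level.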