https://www.nist.gov/open/license
This data set contains and describes a portable, open-source SQLite database of high-resolution mass spectrometry data (MS1 and MS2) for per- and polyfluorinated alkyl substances (PFAS), together with associated metadata on measurement techniques, quality assurance metrics, and the samples from which the data were produced. These data are stored in a format adhering to the Database Infrastructure for Mass Spectrometry (DIMSpec) project. That project produces and uses databases like this one, providing a complete toolkit for non-targeted analysis. See more information about the full DIMSpec code base - as well as these data for demonstration purposes - at GitHub (https://github.com/usnistgov/dimspec) or view the full User Guide for DIMSpec (https://pages.nist.gov/dimspec/docs). The files of most interest here are the database file itself (dimspec_nist_pfas.sqlite), an entity relationship diagram (ERD.png), and a data dictionary (DIMSpec for PFAS_1.0.1.20230615_data_dictionary.json), which elucidate the database structure and assist in interpretation and use.
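As a first orientation step, the tables in the database file can be listed with Python's standard sqlite3 module. The snippet below is a minimal sketch: the helper name list_tables is hypothetical, and no table names are assumed, since the schema is documented in the ERD and data dictionary rather than in this description.

```python
import sqlite3
from contextlib import closing


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path, sorted."""
    # closing() guarantees the connection is released even if the query fails.
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Assuming the file has been downloaded to the working directory, list_tables("dimspec_nist_pfas.sqlite") should return the table names, which can then be compared against ERD.png.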
NIST peptide libraries are comprehensive, annotated mass spectral reference collections from various organisms and proteins, useful for the rapid matching and identification of acquired MS/MS spectra. Spectra were produced by tandem mass spectrometers using liquid chromatographic separations followed by electrospray ionization. Unlike the NIST small molecule electron ionization library, which contains one spectrum per molecular structure, there are several different modes of fragmentation (ion trap and "beam-type" collision cells are currently the most commonly used fragmentation devices) that result in spectra with different, energy-dependent patterns. These result in multiple spectral libraries, distinguished by ionization mode, each of which may contain several spectra per peptide. Different libraries have also been assembled for iTRAQ-4 derivatized peptides and for phosphorylated peptides. Separating libraries by animal species reduces search time, although investigators may elect to include several species in their searches.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains a database of high-resolution electron ionization (EI) mass spectra recorded under gas chromatography - mass spectrometry (GC-MS) conditions. The vast majority of publicly available GC-MS data sets are obtained using low-resolution mass spectrometry; the few exceptions include the works of E.J. Price (2021) and V. Castro (2022). At the same time, gas chromatography - high-resolution mass spectrometry (GC-HRMS) is used quite often in studies.
This database aims to provide a GC-HRMS data set covering diverse classes of volatile compounds (trimethylsilyl derivatives are not included), using a wide m/z range (starting from m/z = 40). Mass spectra were recorded using an Orbitrap Exploris GC mass detector (Thermo Fisher Scientific, USA). The mass determination error is no more than 0.0006 Da, and the mass spectral resolution value is 30000. All mass spectra were checked manually; the .zip archives contain information on peak annotations. The data.xlsx file contains a list of compounds and spectra IDs. Peaks with intensity less than 1/999 of the most intense peak were discarded.
The data set includes:
130 mass spectra of pure compounds recorded using GC-MS of 10-molecule batches or GC-MS of individual compound solutions.
61 mass spectra of compounds included in the 8270 MegaMix standard compound mixture.
45 mass spectra of volatile compounds included in lavender essential oil.
38 mass spectra of volatile compounds included in mint essential oil.
33 mass spectra of volatile compounds included in lemon essential oil.
22 mass spectra of volatile compounds included in coffee.
These groups of spectra are designated as Pure samples, 8270 MegaMix Standard, Lavender (essential oil), Mint (essential oil), Lemon (essential oil), and Coffee, respectively, in the data.xlsx file and in the "Comments" tag in the MSP files. Please note how each spectrum was obtained. Identification of compounds in essential oils and coffee is quite reliable, but it was still performed without using standard samples.
For convenience, in some cases (for essential oils), SMILES are provided using symbols denoting stereoisomers, but we cannot be sure which stereoisomer is actually present: often, both the retention indices and the mass spectra of stereoisomers are very close.
Detailed information on the experimental conditions under which the spectra were obtained, on the equipment, and on data processing is contained in the info.pdf file. The quality_assessment.xlsx file contains data obtained during quality control of the mass spectra (see the info.pdf file for additional information).
Each file named all_spectra contains all spectra (both those obtained using the sample collection and those obtained from essential oil and coffee samples) in a different file format. Most likely, you need the all_spectra.msp file (NIST-compatible); it contains all the data. The plant_volatiles.msp file contains all mass spectra obtained from essential oils and coffee. The names of the remaining files are self-explanatory. If you need annotations of all peaks or more file formats, look at the .zip archives. JCAMP (.jdx) files are in the .zip archives.
Processing (interpretation) of mass spectra was done using our software, https://github.com/mtshn/gchrmsexplain, versions 0.0.2 and 0.0.3. The settings used are given in the info.pdf file; however, these settings are the defaults for the corresponding versions.
Levels of explanation of each peak in the mass spectrum:
Level 1 - the molecular formula is selected, but some isotopic peaks are not found at all.
Level 2 - isotopic peaks merge with other peaks. For example, the 13C peak of some ion X is superimposed (taking into account the resolution) on the main peak X + H. At not very high resolutions, such peaks may not be resolved. This also includes cases of "incorrect" isotopic peak intensity, differing from the theoretically calculated one.
Level 3 - all main isotopic peaks are observed correctly, up to the accuracy of mass determination.
The minimum number of bonds that must be broken to obtain such a fragment is indicated without taking into account the loss of hydrogens, as well as without some other "trivial" bond breaks: the loss of a halogen atom, the loss of a methyl group, or NO loss from a nitro group. Details are given in the documentation of the software used to process the mass spectra: https://github.com/mtshn/gchrmsexplain.
In files containing abbreviated interpretations of mass spectra (e.g., in CSV_annotated folders in .zip archives), notations like 3-1 are used. The first number denotes the interpretation level (see above), and the second denotes the number of (non-trivial) bond breaks required to obtain such a molecular formula.
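The 3-1 style notation and the 1/999 intensity cutoff described above can be illustrated with a short Python sketch. Both function names are hypothetical, and the exact layout of the annotation columns in the CSV files is not specified here, so the parser assumes only a bare "level-breaks" string.

```python
def parse_annotation(code):
    """Split a code such as '3-1' into (level, bond_breaks) as integers."""
    level_text, breaks_text = code.strip().split("-", 1)
    level, bond_breaks = int(level_text), int(breaks_text)
    # Only levels 1, 2 and 3 are defined in the description above.
    if level not in (1, 2, 3):
        raise ValueError(f"unknown interpretation level: {level}")
    return level, bond_breaks


def drop_weak_peaks(peaks, fraction=1 / 999):
    """Keep (mz, intensity) pairs at or above `fraction` of the base peak."""
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    return [(mz, inten) for mz, inten in peaks if inten >= base * fraction]
```

For example, parse_annotation("3-1") yields level 3 with one non-trivial bond break.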
https://www.nist.gov/open/license
The NIST DART-MS Forensics Database is an evaluated collection of in-source collisionally-induced dissociation (is-CID) mass spectra of compounds of interest to the forensics community (e.g. seized drugs, cutting agents, etc.). The is-CID mass spectra were collected using Direct Analysis in Real-Time (DART) Mass Spectrometry (MS), either by NIST scientists or by contributing agencies noted per compound. The database is provided as a general-purpose structure data file (.SDF). For users on Windows operating systems, the .SDF format library can be converted to NIST MS Search format using Lib2NIST and then explored using NIST MS Search v2.4 for general mass spectral analysis. These software tools can be downloaded at https://chemdata.nist.gov. The database is now (09-28-2021) also provided in R data format (.RDS) for use with the R programming language. This database, also commonly referred to as a library, is one in a series of high-quality mass spectral libraries/databases produced by NIST (see NIST SRD 1a, https://dx.doi.org/10.18434/T4H594).
The METLIN (Metabolite and Tandem Mass Spectrometry) Database is a repository of metabolite information as well as tandem mass spectrometry data, providing public access to its comprehensive MS and MS/MS metabolite data. An annotated list of known metabolites with their mass, chemical formula, and structure is available, with each metabolite linked to external resources for further reference and inquiry.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
INPUT.7z contains the initial data of the original spectra from the OMSLs (Open Mass Spectral Libraries).
OUTPUT.7z contains the filtered and standardized data of these spectra after processing by FragHub, in each output file format.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
(Version 20181130)
Edit #1 (Mar 06, 2023): New database version (v.4.2 - 20230306) - available: https://zenodo.org/records/14562231
Version 3 (20181130) of the RKI’s MALDI-TOF mass spectral database represents the second update of the original database (version 20161027, https://doi.org/10.5281/zenodo.163517). The RKI database v.3 contains altogether 6264 mass spectra from highly pathogenic (i.e. biosafety level 3, BSL-3) bacteria such as Bacillus anthracis, Yersinia pestis, Burkholderia mallei, Burkholderia pseudomallei and Francisella tularensis as well as a selection of spectra from their close and more distant relatives. The database can be used as a reference for the diagnostics of BSL-3 bacteria using proprietary and free software packages for MALDI-TOF MS-based microbial identification. Spectral data are distributed as a 7-zip archive that contains the original mass spectra in their native data format (Bruker Daltonics). Please refer to the pdf file (181130-ZENODO-Metadata.pdf) for information on cultivation conditions, sample preparation and details of spectra acquisition. Do not try to print this document (~1000 pages!).
The pkf-file (181130_ZENODO_Peaklist_30Peaks_1.6.pkf) contains the MS peak list data in a Matlab compatible format. The latter data file can be imported into MicrobeMS, a Matlab-based free-of-charge software solution developed at RKI. MicrobeMS is available from https://wiki-ms.microbe-ms.com.
The RKI mass spectral database will be updated on a regular basis.
The author's grateful thanks are given to the following persons for providing microbial strains and species, or mass spectra. Without their help this work would not be possible.
Public repository of mass spectral data which allows users to search for similar spectra on a peak-to-peak basis, on a neutral loss-to-neutral loss basis, or by m/z value and molecular formula, and to search chemical compounds by substructure or by keyword.
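To illustrate what comparing spectra by neutral loss means, the Python sketch below derives neutral losses as the difference between a precursor m/z and each fragment m/z, then counts losses shared by two spectra within a tolerance. This is an assumption-laden illustration only: the function names, the 0.01 tolerance, and the precursor-minus-fragment definition are chosen for the example and are not taken from the repository's actual search implementation.

```python
def neutral_losses(precursor_mz, fragment_mzs, decimals=4):
    """Return sorted precursor-minus-fragment differences for fragments below the precursor."""
    return sorted(
        round(precursor_mz - mz, decimals)
        for mz in fragment_mzs
        if mz < precursor_mz
    )


def shared_values(a, b, tolerance=0.01):
    """Count values in a that have at least one value in b within tolerance."""
    return sum(1 for x in a if any(abs(x - y) <= tolerance for y in b))
```

Two spectra with different precursor masses can share neutral losses (for instance a loss of about 18 from water) even when none of their fragment peaks coincide, which is why the two search modes are complementary.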
Metadata-centric, auto-curating repository designed for storage and querying of mass spectral records. It contains metabolite mass spectra, metadata and associated compounds.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
(Version 20230306)
Version 4 (20230306) of the RKI MALDI-ToF mass spectra database is the third update of the original database (version 20161027, https://doi.org/10.5281/zenodo.163517). The RKI Database v.4 now contains a total of 11055 MALDI-ToF mass spectra from 1599 microbial strains of highly pathogenic (i.e. biosafety level 3, BSL-3) bacteria such as Bacillus anthracis, Brucella melitensis, Yersinia pestis, Burkholderia mallei / pseudomallei and Francisella tularensis as well as a selection of spectra of their close and distant relatives. The database can be used as a reference for the diagnosis of BSL-3 bacteria using proprietary and free software packages for MALDI-ToF MS-based microbial identification. The spectral data are provided as a zip archive (zenodo db 230306.zip) containing the original mass spectra in their native data format (Bruker Daltonics). Please refer to the pdf file (230306-ZENODO-Metadata.pdf) for information on cultivation conditions, sample preparation and details of the spectra acquisition. Please do not try to print this document (>1600 pages!).
Version 20230306 of the RKI database contains for the first time a file in btmsp format (230306_v4_RKI_DB_BSL3.btmsp). This file was generated using the MALDI Biotyper software (Bruker Daltonics) and contains a total of 1599 main spectra from the BSL-3 database in the proprietary data format of the MALDI Biotyper software. *.btmsp files can be imported and used for identification with this software solution. Note that the btmsp file available in database version 4 is broken and cannot be imported. Please refer to updated database versions (4.1, or 4.2) to download valid btmsp files.
The pkf files (230306_ZENODO_30Peaks_0.75.pkf, 230306_ZENODO_45Peaks_0.75.pkf) represent two versions of the MS peak list data in a Matlab compatible format. The latter data can be imported into MicrobeMS, a free Matlab-based software solution developed at the RKI. MicrobeMS can be used for the identification of microorganisms by MALDI-ToF MS and is available at https://wiki-ms.microbe-ms.com.
The RKI mass spectrometry database is updated regularly.
The author would like to thank the following individuals for providing microbial strains and species or mass spectra thereof. Without their help, this work would not have been possible.
Wolfgang Beyer - University of Hohenheim, Faculty of Agricultural Sciences, Stuttgart, Germany
Guido Werner - Robert Koch-Institute, Nosocomial Pathogens and Antibiotic Resistances (FG13), Wernigerode, Germany
Alejandra Bosch - CINDEFI, CONICET-CCT La Plata, Facultad de Ciencias Exactas, Universidad Nacional de La Plata, La Plata, Buenos Aires, Argentina
Michal Drevinek - National Institute for Nuclear, Biological and Chemical Protection, Milin, Czech Republic
Roland Grunow, Daniela Jacob, Silke Klee, Susann Dupke and Holger Scholz - Robert Koch-Institute, Highly Pathogenic Microorganisms (ZBS2), Berlin, Germany
Jörg Rau - Chemisches und Veterinäruntersuchungsamt Stuttgart, Fellbach, Germany
Jens Jacob - Robert Koch-Institute, Hospital Hygiene, Infection Prevention and Control (FG14), Berlin, Germany
Martin Mielke - Robert Koch-Institute, Department 1 - Infectious Diseases, Berlin, Germany
Monika Ehling-Schulz - Functional Microbiology, Institute of Microbiology, University of Veterinary Medicine, Vienna, Austria
Armand Paauw - Department of Medical Microbiology, CBRN protection, Universitair Medisch Centrum Utrecht, TNO, Rijswijk, The Netherlands
Herbert Tomaso - Friedrich-Löffler-Institut (FLI), Federal Research Institute for Animal Health, Jena, Germany
Gabriel Karner - Karner Düngerproduktion GmbH, Research & Development, Neulengbach, Austria
Rainer Borriss - Institute of Marine Biotechnology e.V. (IMaB), Greifswald, Germany
Le Thi Thanh Tam - Division of Plant Pathology and Phyto-Immunology, Plant Protection Research Institute, Hanoi, Socialist Republic of Vietnam
Xuewen Gao - College of Plant Protection, Nanjing Agricultural University, Key Laboratory of Integrated Management of Crop Diseases and Pests, Nanjing, People’s Republic of China
Golm Metabolome Database (GMD) provides public access to custom mass spectral libraries, metabolite profiling experiments as well as additional information and tools. Analytes are subjected to a gas chromatograph coupled to a mass spectrometer, which records the mass spectrum and the retention time linked to an analyte. This collection references GC-MS spectra.
A library containing spectra of upwards of 200,000 chemical compounds. Spectra include metabolites, peptides, contaminants, and lipids. All spectra and chemical structures are examined by professionals.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
To achieve accurate assignment of peptide sequences to observed fragmentation spectra, a shotgun proteomics database search tool must make good use of the very high-resolution information produced by state-of-the-art mass spectrometers. However, making use of this information while also ensuring that the search engine’s scores are well calibrated, that is, that the score assigned to one spectrum can be meaningfully compared to the score assigned to a different spectrum, has proven to be challenging. Here we describe a database search score function, the “residue evidence” (res-ev) score, that achieves both of these goals simultaneously. We also demonstrate how to combine calibrated res-ev scores with calibrated XCorr scores to produce a “combined p value” score function. We provide a benchmark consisting of four mass spectrometry data sets, which we use to compare the combined p value to the score functions used by several existing search engines. Our results suggest that the combined p value achieves state-of-the-art performance, generally outperforming MS Amanda and Morpheus and performing comparably to MS-GF+. The res-ev and combined p-value score functions are freely available as part of the Tide search engine in the Crux mass spectrometry toolkit (http://crux.ms).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
unique INCHIKEY2D experimental: 248245
unique INCHIKEY2D in silico: 1611201
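Counts like these can be reproduced in a few lines of Python, assuming that INCHIKEY2D refers to the first 14-character block of a standard InChIKey (the part that encodes connectivity and ignores stereochemistry). The function names are hypothetical, and the keys in the usage note are well-known example compounds rather than entries taken from this data set.

```python
def inchikey_2d(inchikey):
    """Return the first (connectivity) block of a standard InChIKey."""
    return inchikey.strip().upper().split("-", 1)[0]


def count_unique_2d(inchikeys):
    """Count distinct first blocks, skipping empty or blank entries."""
    return len({inchikey_2d(key) for key in inchikeys if key and key.strip()})
```

For example, the keys of L-alanine (QNAYBMKLOCPYGJ-REOHZTLVSA-N) and D-alanine (QNAYBMKLOCPYGJ-UWTATZPHSA-N) share the same first block, so together with caffeine (RYYVLZVUVIJVGH-UHFFFAOYSA-N) they count as two unique 2D structures.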
A mass spectral database for organic compounds. The spectra included in the database are: electron impact mass spectrum (EI-MS), Fourier transform infrared spectrum (FT-IR), 1H nuclear magnetic resonance (NMR) spectrum, 13C NMR spectrum, laser Raman spectrum, and electron spin resonance (ESR) spectrum.
A mass spectral database that assists in identifying compounds in life sciences, metabolomics, pharmaceutical research, toxicology, forensic investigations, environmental analysis, food control, and industry.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
(Version 20170523)
Edit #1 (Nov 30, 2018): New database version (v.3 - 20181130) - available: 10.5281/zenodo.1880975
Edit #2 (Mar 06, 2023): New database version (v.4.2 - 20230306) - available: 10.5281/zenodo.7702375
Version 2 (20170523) of the RKI’s MALDI-TOF mass spectral database is an update of the original database (version 20161027, https://doi.org/10.5281/zenodo.163517). The RKI database contains mass spectral entries from highly pathogenic (biosafety level 3, BSL-3) bacteria such as Bacillus anthracis, Yersinia pestis, Burkholderia mallei, Burkholderia pseudomallei and Francisella tularensis as well as a selection of spectra from their close and more distant relatives. The database can be used as a reference for the diagnostics of BSL-3 bacteria using proprietary and free software packages for MALDI-TOF MS-based microbial identification. Spectral data are distributed as a 7-zip archive that contains the original mass spectra in their native data format (Bruker Daltonics). Please refer to the pdf file (170523-ZENODO-Metadata.pdf) for the metadata of the spectra. Do not try to print this document (~1100 pages!).
The pkf-file (170523_ZENODO_Peaklist_30Peaks_1.6.pkf) contains the MS peak list data in a Matlab compatible format. The latter data file can be imported into MicrobeMS, a Matlab-based free-of-charge software solution developed at RKI. MicrobeMS is available from http://www.microbe-ms.com.
The RKI mass spectral database will be updated on a regular basis.
The author gratefully thanks the following persons for providing microbial strains and species. Without their help this work would not have been possible.
Wolfgang Beyer - University of Hohenheim, Faculty of Agricultural Sciences, Stuttgart, Germany
Guido Werner - Robert Koch-Institute, Nosocomial Pathogens and Antibiotic Resistances (FG13), Wernigerode, Germany
Alejandra Bosch - CINDEFI, CONICET-CCT La Plata, Facultad de Ciencias Exactas, Universidad Nacional de La Plata, La Plata, Buenos Aires, Argentina
Michal Drevinek - National Institute for Nuclear, Biological and Chemical Protection, Milin, Czech Republic
Roland Grunow - Robert Koch-Institute, Highly Pathogenic Microorganisms (ZBS2), Berlin, Germany
Daniela Jacob - Robert Koch-Institute, Highly Pathogenic Microorganisms (ZBS2), Berlin, Germany
Silke Klee - Robert Koch-Institute, Highly Pathogenic Microorganisms (ZBS2), Berlin, Germany
Jörg Rau - Chemisches und Veterinäruntersuchungsamt Stuttgart, Fellbach, Germany
Jens Jacob - Robert Koch-Institute, Hospital Hygiene, Infection Prevention and Control (FG14), Berlin, Germany
Martin Mielke - Robert Koch-Institute, Department 1 - Infectious Diseases, Berlin, Germany
Monika Ehling-Schulz - Functional Microbiology, Institute of Microbiology, University of Veterinary Medicine, Vienna, Austria
Armand Paauw - Department of Medical Microbiology, CBRN protection, Universitair Medisch Centrum Utrecht, TNO, Rijswijk, The Netherlands
Publication: Salla et al., 2013. Anal Chim Acta 794, 55, DOI: 10.1016/j.aca.2013.07.014. A mass spectrometry library for shrimp identification was developed to support mass spectral fingerprinting methods for identifying contaminated seafood. Matrix-assisted laser desorption ionization (MALDI) time-of-flight mass spectrometry was used to identify shrimp at the species level using commercial mass spectral fingerprint matching software (Bruker Biotyper). In the first step, a mass spectrum reference database was constructed from the analysis of six commercially important shrimp species: L. setiferus, F. aztecus, S. brevirostris, P. robustus, P. dispar and P. platyceros. In the second step, the reference database was tested using 74 unknown shrimp samples from these six species. Specimens were collected by extractive fishing in the Gulf of Mexico, North Pacific Coast, and North Atlantic Coast and shipped to our location on ice or, for Louisiana shrimp, obtained locally. Correct identification was achieved for 72 of the 74 samples (97%): 72 samples were identified at the species level and 2 samples were identified at the genus level using the manufacturer’s log score specifications. Samples of 1 g of shrimp skeletal muscle were obtained by dissecting a shrimp and then homogenizing at room temperature in 2 mL of nanopure water using a mortar and pestle. The homogenate was then centrifuged at 13,000 rpm for 20 min. The supernatant was removed and further purified using desalting pipette tips. A 4 µL volume of the desalted sample was directly pipetted into 4 µL of 30 mg/mL 2,5-dihydroxybenzoic acid matrix solution in 1:1 (v/v) ethanol/0.1 % TFA. A 1 µL aliquot of the analyte/matrix mixed solution was spotted onto a stainless steel MALDI target and allowed to dry at room temperature.
MALDI-TOF MS measurements were performed on a commercial instrument in positive ion reflectron mode with an accelerating voltage of 25 kV and analyzed in the mass range of 1,000 – 5,000 Da. A minimum of 500 laser shots per sample was used to generate each mass spectrum. MALDI BioTyper 2.0 software (Bruker) was used for the mass spectra fingerprinting [46]. Unknown spectra are identified by comparing their individual peak lists to the mass spectrum database and a matching score based on identified masses and their intensity is used for ranking of the results. The MALDI fingerprinting method for the identification of shrimp species was found to be reproducible and accurate with rapid analysis. Data was collected between October 2011 and May 2012 in the Department of Chemistry at Louisiana State University in Baton Rouge.
The Proteome 2D-PAGE Database system for microbial research is a curated database for storing and investigating proteomics data. Software tools are available; for data submission, please contact the Database Curator. Established at the Max Planck Institute for Infection Biology, this system contains four interconnected databases:
i.) 2D-PAGE Database: Two-dimensional electrophoresis (2-DE) and mass spectrometry of diverse microorganisms and other organisms. This database currently contains 4971 identified spots and 1228 mass peaklists in 44 reference maps representing experiments from 24 different organisms and strains. The data were submitted by 84 submitters from 24 institutes and 12 nations. It also contains various software tools for formatting and analyzing gels and mass peaks:
* TopSpot: Scanning the gel, editing the spots and saving the information
* Fragmentation: Fragmentation of the gel image into sections
* MS-Screener: Perl script to compare the similarity of MALDI-PMF peaklists
* MS-Screener update: MS-Screener can be used to compare mass spectra (MALDI-MS(/MS) as well as ESI-MS/MS spectra) on the basis of their peak lists (.dta, .pkm, .pkt, or .txt files), to recalibrate mass spectra, to determine and eliminate exogenous contaminant peaks, and to create matrices for cluster analyses.
* GelCali: Online calibration of the Mr- and pI-axis of 2-DE gels with mathematical regression methods
ii.) Isotope Coded Affinity Tag (ICAT)-LC/MS database: ICAT-LC/MS data for Mycobacterium tuberculosis strain BCG versus H37Rv.
iii.) FUNC_CLASS database: Functional classification of diverse microorganisms. This database also integrates genomic, proteomic, and metabolic data.
iv.) DIFF database: Presentation of differently regulated proteins obtained by comparative proteomic experiments using computerized gel image analysis.