The data here comprise and describe a portable, open-source SQLite database containing high-resolution mass spectrometry data (MS1 and MS2) for per- and polyfluorinated alkyl substances (PFAS) and associated metadata regarding their measurement techniques, quality assurance metrics, and the samples from which they were produced. These data are stored in a format adhering to the Database Infrastructure for Mass Spectrometry (DIMSpec) project. That project produces and uses databases like this one, providing a complete toolkit for non-targeted analysis. See more information about the full DIMSpec code base - as well as these data for demonstration purposes - at GitHub (https://github.com/usnistgov/dimspec) or view the full User Guide for DIMSpec (https://pages.nist.gov/dimspec/docs). Files of most interest contained here include the database file itself (dimspec_nist_pfas.sqlite) as well as an entity relationship diagram (ERD.png) and data dictionary (DIMSpec for PFAS_1.0.1.20230615_data_dictionary.json) to elucidate the database structure and assist in interpretation and use.
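Because the database is an ordinary SQLite file, its structure can be inspected with any SQLite client before consulting the ERD or data dictionary. The following is a minimal sketch using only Python's standard library; the helper names list_tables and describe_table are hypothetical, and no table or column names of the DIMSpec schema are assumed.

```python
import sqlite3
from pathlib import Path


def _connect_read_only(db_path):
    """Open a SQLite file read-only so that inspection cannot modify it."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def list_tables(db_path):
    """Return the names of all tables and views in the file, sorted by name."""
    con = _connect_read_only(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type IN ('table', 'view') ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]


def describe_table(db_path, table):
    """Return (column name, declared type) pairs for one table."""
    con = _connect_read_only(db_path)
    try:
        # PRAGMA arguments cannot be bound as parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        info = con.execute(f"PRAGMA table_info({quoted})").fetchall()
    finally:
        con.close()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in info]
```

With the file downloaded to the working directory, list_tables("dimspec_nist_pfas.sqlite") would give the table and view names, which can then be compared against ERD.png and the data dictionary.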
A multistage tandem mass spectral database built from a variety of structurally defined glycans. It provides tools for glycomics research that enable users to identify glycans by spectral matching. The database stores MS2, MS3, and MS4 spectra of N- and O-linked glycans and glycolipid glycans, as well as the partial structures of these glycans.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data for MSnLib are divided into several Zenodo records due to size constraints.
raw positive: 10966404
raw negative: 10967081
mzml positive and negative: 10966280
spectral libraries: 11163380
This record includes the automatically generated spectral libraries (MSnLib) within mzmine, acquired using a flow injection method on an Orbitrap ID-X instrument, for all compound libraries. There are multiple files for each compound library containing MS2 only or MSn in two data formats (.mgf or .json) for both polarities.
The MS2 files contain all MS2 spectra as well as all pseudo MS2 spectra (a full MSn tree merged into one spectrum per compound ion). The MSn files additionally contain all individual MSn stages. The first number in each file name is the library building date.
7 Compound Libraries:
Short Name: Full name, Provider (Catalog number), total compounds (not all detected during library building)
MCEBIO: Bioactive Compound Library, MedChemExpress (HY-L001), 10,315 compounds
MCESAF: 5k Scaffold Library, MedChemExpress (HY-L902), 4998 compounds
NIHNP: NIH NPAC ACONN collection of NP, NIH/NCATS, 3988 compounds
OTAVAPEP: Alpha-helix Peptidomimetic Library, OTAVAchemicals (a-helix-Peptido), 1298 compounds
ENAMDISC: Discovery Diversity Set -10, Enamine (DDS-10), 10,240 compounds
ENAMMOL: Carboxylic Acid Fragment Library + Random, Enamine and Molport, 4378 compounds
MCEDRUG: FDA-Approved Drug Library, MedChemExpress (HY-L022), 2610 compounds
Information regarding the SPECTYPE
SINGLE_BEST_SCAN (or no SPECTYPE): best spectrum for each precursor and energy (highest TIC).
SAME_ENERGY: additionally, if a spectrum was acquired multiple times for a precursor at the same energy, these spectra are merged into one spectrum for that energy (the maximum signal height is used for each fragment signal).
ALL_ENERGIES: merged spectrum of all energies used (in our case three for each precursor, using the merged same-energy spectrum if available).
ALL_MSN_TO_PSEUDO_MS2: mzmine merges all MSn spectra into one pseudo MS2 spectrum.
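The selection and merging rules above can be illustrated with a small sketch. This is not mzmine's implementation: the function names, the representation of a spectrum as a list of (m/z, intensity) pairs, and the m/z tolerance used to decide that two signals are the same fragment are all assumptions made for illustration.

```python
def best_scan(spectra):
    """Pick the spectrum with the highest total ion current (sum of intensities)."""
    return max(spectra, key=lambda spectrum: sum(i for _, i in spectrum))


def merge_same_energy(spectra, mz_tol=0.005):
    """Merge repeated spectra, keeping the maximum intensity per fragment signal.

    Peaks from all spectra are pooled and sorted by m/z. Peaks lying within
    mz_tol (an assumed value) of the first peak of a group are treated as the
    same fragment signal, and only the most intense one is kept.
    """
    peaks = sorted(peak for spectrum in spectra for peak in spectrum)
    merged = []
    group_start = None  # m/z of the first peak in the current group
    for mz, intensity in peaks:
        if group_start is not None and mz - group_start <= mz_tol:
            # Same fragment signal: keep whichever peak is more intense.
            if intensity > merged[-1][1]:
                merged[-1] = (mz, intensity)
        else:
            # A new fragment signal starts a new group.
            merged.append((mz, intensity))
            group_start = mz
    return merged
```

An ALL_ENERGIES-style spectrum could be sketched in the same way by passing one (merged) spectrum per energy to merge_same_energy, although the actual merging rule used for that type may differ.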
V5 fixed USIs
A library containing spectra of upwards of 200,000 chemical compounds. Spectra include metabolites, peptides, contaminants, and lipids. All spectra and chemical structures are examined by professionals.
The NIST DART-MS Forensics Database is an evaluated collection of in-source collisionally-induced dissociation (is-CID) mass spectra of compounds of interest to the forensics community (e.g. seized drugs, cutting agents, etc.). The is-CID mass spectra were collected using Direct Analysis in Real-Time (DART) Mass Spectrometry (MS), either by NIST scientists or by contributing agencies noted per compound. The database is provided as a general-purpose structure data file (.SDF). For users on Windows operating systems, the .SDF format library can be converted to NIST MS Search format using Lib2NIST and then explored using NIST MS Search v2.4 for general mass spectral analysis. These software tools can be downloaded at https://chemdata.nist.gov. The database is now (09-28-2021) also provided in R data format (.RDS) for use with the R programming language. This database, also commonly referred to as a library, is one in a series of high-quality mass spectral libraries/databases produced by NIST (see NIST SRD 1a, https://dx.doi.org/10.18434/T4H594).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw LC-MS (liquid chromatography-mass spectrometry) data for photocatalyst 1 (PC1). Data acquired on a Waters Acquity UPLC + Xevo G2-XS (LC-MS/MS). Sample in Water:Acetonitrile 95:5.
A mass spectral database that assists in identifying compounds in life sciences, metabolomics, pharmaceutical research, toxicology, forensic investigations, environmental analysis, food control, and industry.
Public repository of mass spectral data which allows users to search for similar spectra on a peak-to-peak basis, on a neutral loss-to-neutral loss basis, or by m/z value and molecular formula, to search chemical compounds by substructure, and to search chemical compounds by keyword.
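The difference between peak-to-peak and neutral loss-to-neutral loss comparison can be shown with a short sketch. This is not the repository's search algorithm: the function names and the matching tolerance are assumptions, and a real search would also weight matches by intensity.

```python
def neutral_losses(precursor_mz, fragment_mzs):
    """Convert fragment m/z values into neutral losses (precursor minus fragment)."""
    return sorted(precursor_mz - mz for mz in fragment_mzs if mz < precursor_mz)


def count_matches(values_a, values_b, tol=0.01):
    """Count one-to-one pairings between two sorted lists that agree within tol."""
    i = j = matches = 0
    while i < len(values_a) and j < len(values_b):
        diff = values_a[i] - values_b[j]
        if abs(diff) <= tol:
            matches += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1
        else:
            j += 1
    return matches


# Two hypothetical spectra with different precursors but the same losses of 18 and 44.
frags_a, precursor_a = [150.0, 256.0, 282.0], 300.0
frags_b, precursor_b = [100.0, 356.0, 382.0], 400.0

peak_matches = count_matches(sorted(frags_a), sorted(frags_b))
loss_matches = count_matches(
    neutral_losses(precursor_a, frags_a),
    neutral_losses(precursor_b, frags_b),
)
```

In this made-up example no fragment peaks coincide, but two neutral losses do, which is the kind of relationship a neutral-loss search is meant to surface between compounds of different mass.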
https://academictorrents.com/nolicensespecified
GC-MS Database NIST/EPA/NIH MASS SPECTRAL LIBRARY (NIST 08) + update 2010 2.0f Apr 1 2009 x86 [2008, ENG]
This library package contains the NIST 2008 Mass Spectral Library in the following manufacturer formats:
1. Agilent Chemstation (.L) (with structures)
2. NIST MS Search (compatible with most mass spectrometry software brands): Bruker; JEOL; LECO; PerkinElmer TurboMass; Thermo Electron XCalibur; Varian MS Workstation; Waters MassLynx; and other brands
3. PerkinElmer TurboMass (IDB) (with structures)
4. Shimadzu GCMS Solution (QP5000) (SPC) (no structures)
5. Waters MassLynx (IDB) (with structures)
6. Finnigan GCQ/Varian ITS-40
7. Thermo Galactic Spectral ID
Includes:
- Over 220,000 spectra
- Over 190,000 chemical structures
- GC Retention Index Library, MS/MS Library
- License keys
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 2: Table S1. List of strains used in this study.
A public repository of metabolite information as well as tandem mass spectrometry data is provided to facilitate metabolomics experiments. It contains structures and represents a data management system designed to assist in a broad array of metabolite research and metabolite identification. An annotated list of known metabolites and their mass, chemical formula, and structure are available. Each metabolite is linked to outside resources for further reference and inquiry. MS/MS data is also available on many of the metabolites.
Direct Analysis in Real Time Mass Spectrometry (DART-MS) is an analytical chemistry technology that is being increasingly employed in forensic applications. This form of mass spectrometry rapidly yields rich structural information about an analyte with minimal sample preparation. The challenge with DART-MS data, much like other data generated with high throughput technologies, lies in the data interpretation. This is especially true when the analyzed samples are multi-component mixtures like seized drug evidence. The NIST/NIJ DART-MS Data Interpretation Tool (DIT) is a freely available and open-source software tool developed to support the interpretation of in-source collision induced dissociation (is-CID) DART-MS data. The NIST/NIJ DART-MS DIT can be used to view reference mass spectra from DART-MS spectral libraries, search query DART-MS mass spectra of mixtures against reference libraries, using the Inverted Library Search Algorithm, and generate printable reports from search results. Several of the features, including the formatting of generated reports, were iteratively designed with input from local, state, and federal forensic practitioners, ensuring that the program is intuitive and usable for the expected users.
https://www.nist.gov/open/license
A SQLite database containing mass absorption coefficient (both discrete and continuous), atomic sub-shell binding energy, X-ray energy, jump ratio, ground-state occupancy, atomic relaxation rate following core shell ionization and X-ray linewidth data. The data is in the common SQLite format and also available in SQL format. SQLite is an open-source database which is supported on many different platforms. This database represents a compilation of data from other sources. Each datum is labeled with a literature reference which represents the source. The references are listed in the LIT_REFERENCES table with associated BIBTEX reference data. The two exceptions to this rule are the FFAST and FFAST_EXTRA tables which are associated with the Chantler2005 reference.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
(Version 20230306)
Version 4 (20230306) of the RKI MALDI-ToF mass spectra database is the third update of the original database (version 20161027, https://doi.org/10.5281/zenodo.163517). The RKI Database v.4 now contains a total of 11055 MALDI-ToF mass spectra from 1599 microbial strains of highly pathogenic (i.e. biosafety level 3, BSL-3) bacteria such as Bacillus anthracis, Brucella melitensis, Yersinia pestis, Burkholderia mallei / pseudomallei and Francisella tularensis as well as a selection of spectra of their close and distant relatives. The database can be used as a reference for the diagnosis of BSL-3 bacteria using proprietary and free software packages for MALDI-ToF MS-based microbial identification. The spectral data are provided as a zip archive (zenodo db 230306.zip) containing the original mass spectra in their native data format (Bruker Daltonics). Please refer to the pdf file (230306-ZENODO-Metadata.pdf) for information on cultivation conditions, sample preparation and details of the spectra acquisition. Please do not try to print this document (>1600 pages!).
Version 20230306 of the RKI database contains for the first time a file in btmsp format (230306_v4_RKI_DB_BSL3.btmsp). This file was generated using the MALDI Biotyper software (Bruker Daltonics) and contains a total of 1599 main spectra from the BSL-3 database in the proprietary data format of the MALDI Biotyper software. *.btmsp files can be imported and used for identification with this software solution. Note that the btmsp file available in database version 4 is broken and cannot be imported. Please refer to the updated database versions (4.1 or 4.2) to download valid btmsp files.
The pkf files (230306_ZENODO_30Peaks_0.75.pkf, 230306_ZENODO_45Peaks_0.75.pkf) represent two versions of the MS peak list data in a Matlab compatible format. The latter data can be imported into MicrobeMS, a free Matlab-based software solution developed at the RKI. MicrobeMS can be used for the identification of microorganisms by MALDI-ToF MS and is available at https://wiki-ms.microbe-ms.com.
The RKI mass spectrometry database is updated regularly.
The author would like to thank the following individuals for providing microbial strains and species or mass spectra thereof. Without their help, this work would not have been possible.
Wolfgang Beyer - University of Hohenheim, Faculty of Agricultural Sciences, Stuttgart, Germany
Guido Werner - Robert Koch-Institute, Nosocomial Pathogens and Antibiotic Resistances (FG13), Wernigerode, Germany
Alejandra Bosch - CINDEFI, CONICET-CCT La Plata, Facultad de Ciencias Exactas, Universidad Nacional de La Plata, La Plata, Buenos Aires, Argentina
Michal Drevinek - National Institute for Nuclear, Biological and Chemical Protection, Milin, Czech Republic
Roland Grunow, Daniela Jacob, Silke Klee, Susann Dupke and Holger Scholz - Robert Koch-Institute, Highly Pathogenic Microorganisms (ZBS2), Berlin, Germany
Jörg Rau - Chemisches und Veterinäruntersuchungsamt Stuttgart, Fellbach, Germany
Jens Jacob - Robert Koch-Institute, Hospital Hygiene, Infection Prevention and Control (FG14), Berlin, Germany
Martin Mielke - Robert Koch-Institute, Department 1 - Infectious Diseases, Berlin, Germany
Monika Ehling-Schulz - Functional Microbiology, Institute of Microbiology, University of Veterinary Medicine, Vienna, Austria
Armand Paauw - Department of Medical Microbiology, CBRN protection, Universitair Medisch Centrum Utrecht, TNO, Rijswijk, The Netherlands
Herbert Tomaso - Friedrich-Löffler-Institut (FLI), Federal Research Institute for Animal Health, Jena, Germany
Gabriel Karner - Karner Düngerproduktion GmbH, Research & Development, Neulengbach, Austria
Rainer Borriss - Institute of Marine Biotechnology e.V. (IMaB), Greifswald, Germany
Le Thi Thanh Tam - Division of Plant Pathology and Phyto-Immunology, Plant Protection Research Institute, Hanoi, Socialist Republic of Vietnam
Xuewen Gao - College of Plant Protection, Nanjing Agricultural University, Key Laboratory of Integrated Management of Crop Diseases and Pests, Nanjing, People’s Republic of China
(Version 20161027)
Edit #1 (May 23, 2017): New database version (v.2 - 20170523) - available: 10.5281/zenodo.582602
Edit #2 (Nov 30, 2018): New database version (v.3 - 20181130) - available: 10.5281/zenodo.1880975
Edit #3 (Mar 06, 2023): New database version (v.4.2 - 20230306) - available: 10.5281/zenodo.7702375
The Robert Koch-Institute (RKI) database of microbial MALDI-TOF mass spectra contains mass spectral entries from highly pathogenic (biosafety level 3, BSL-3) bacteria such as Bacillus anthracis, Yersinia pestis, Burkholderia mallei, Burkholderia pseudomallei and Francisella tularensis as well as a selection of spectra from their close and more distant relatives. The RKI mass spectral database can be used as a reference for the diagnostics of BSL-3 bacteria using proprietary and free software packages for MALDI-TOF MS-based microbial identification.
The database itself is distributed as a zip archive that contains the original mass spectra in their native data format (Bruker Daltonics). Please refer to the pdf file (161027-ZENODO-Metadata.pdf) to obtain information on the metadata of the spectra. Do not try to print this document (~1000 pages!).
The pkf-file (161027_zenodo_Peaklist_(30Peaks1,6).pkf) contains so-called database spectra in a Matlab compatible format. The latter data file can be imported into MicrobeMS, a Matlab-based free-of-charge software solution developed at the RKI. MicrobeMS is available from http://www.microbe-ms.com. For the future it is intended to update the RKI database of MALDI-TOF mass spectra on a regular basis.
The author's grateful thanks are given to the following persons for providing microbial strains and species. Without their help this work would not be possible.
Wolfgang Beyer - University of Hohenheim, Faculty of Agricultural Sciences, Stuttgart, Germany
Guido Werner - Robert Koch-Institute, Nosocomial Pathogens and Antibiotic Resistances (FG13), Wernigerode, Germany
Alejandra Bosch - CINDEFI, CONICET-CCT La Plata, Facultad de Ciencias Exactas, Universidad Nacional de La Plata, La Plata, Buenos Aires, Argentina
Michal Drevinek - National Institute for Nuclear, Biological and Chemical Protection, Milin, Czech Republic
Roland Grunow - Robert Koch-Institute, Highly Pathogenic Microorganisms (ZBS2), Berlin, Germany
Daniela Jacob - Robert Koch-Institute, Highly Pathogenic Microorganisms (ZBS2), Berlin, Germany
Silke Klee - Robert Koch-Institute, Highly Pathogenic Microorganisms (ZBS2), Berlin, Germany
Jörg Rau - Chemisches und Veterinäruntersuchungsamt Stuttgart, Fellbach, Germany
Jens Jacob - Robert Koch-Institute, Hospital Hygiene, Infection Prevention and Control (FG14), Berlin, Germany
Martin Mielke - Robert Koch-Institute, Department 1 - Infectious Diseases, Berlin, Germany
Monika Ehling-Schulz - Functional Microbiology, Institute of Microbiology, University of Veterinary Medicine, Vienna, Austria
License type for database files (spectra): Creative Commons Attribution Non Commercial 4.0 International (CC-BY-NC): Licensees must credit the original authors by stating their names and the original work's title. Licensees may copy, distribute, display, and perform the work and make derivative works and remixes based on it only for non-commercial purposes.
MMMDB, Mouse Multiple tissue Metabolome DataBase, is a freely available metabolomic database containing a collection of metabolites measured from multiple tissues from single mice. The datasets were collected using a single instrument rather than compiled from the literature, which is useful for capturing a holistic overview of large metabolic pathways. Currently, data from cerebra, cerebella, thymus, spleen, lung, liver, kidney, heart, pancreas, testis, and plasma are provided. Non-targeted analyses were performed by capillary electrophoresis time-of-flight mass spectrometry (CE-TOFMS) and, therefore, both identified metabolites and unknown peaks (without a matched standard) were uploaded to this database. Not only quantified concentrations but also processed raw data such as electropherograms, mass spectra, and annotations (such as isotopes and fragments) are provided.
We have collected seed mass data for almost 13,000 species (angiosperms and gymnosperms) from all around the world. We have constructed a phylogeny for these species, and are using this, plus data on plant growth form, seed dispersal syndrome, vegetation type, net primary productivity, temperature, precipitation and leaf area index to look at factors that have influenced the evolution of seed size.
NIST peptide libraries are comprehensive, annotated mass spectral reference collections from various organisms and proteins useful for the rapid matching and identification of acquired MS/MS spectra. Spectra were produced by tandem mass spectrometers using liquid chromatographic separations followed by electrospray ionization. Unlike the NIST small molecule electron ionization library which contains one spectrum per molecular structure, there are several different modes of fragmentation (ion trap and "beam-type" collision cells are currently the most commonly used fragmentation devices) that result in spectra with different, energy dependent, patterns. These result in multiple spectral libraries, distinguished by ionization mode, each of which may contain several spectra per peptide. Different libraries have also been assembled for iTRAQ-4 derivatized peptides and for phosphorylated peptides. Separating libraries by animal species reduces search time, although investigators may elect to include several species in their searches.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The repository contains three mzML and four imzML mass spectrometry datasets.
The mzML data are compiled in a single directory 'mzML' and zipped:
The imzML mass spectrometry imaging data are zipped individually:
All these datasets are publicly available from different repositories; however, if you reuse them, please attribute the original authors!