100+ datasets found

METLIN
rrid.site
scicrunch.org
+2more
Updated Jun 17, 2025
Cite
(2025). METLIN [Dataset]. http://identifiers.org/RRID:SCR_010500/resolver?q=*&i=rrid
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_010500 https://identifiers.org/RRID:SCR_010500/resolver?q=*&i=rrid
Dataset updated
Jun 17, 2025
Description
A public repository of metabolite information and tandem mass spectrometry data, provided to facilitate metabolomics experiments. It contains structures and serves as a data management system designed to assist in a broad array of metabolite research and metabolite identification. An annotated list of known metabolites with their mass, chemical formula, and structure is available. Each metabolite is linked to outside resources for further reference and inquiry. MS/MS data are also available for many of the metabolites.
Enabling Efficient and Confident Annotation of LC−MS Metabolomics Data...
acs.figshare.com
zip
Updated May 31, 2023
Cite
Corey D. Broeckling; Andrea Ganna; Mark Layer; Kevin Brown; Ben Sutton; Erik Ingelsson; Graham Peers; Jessica E. Prenni (2023). Enabling Efficient and Confident Annotation of LC−MS Metabolomics Data through MS1 Spectrum and Time Prediction [Dataset]. http://doi.org/10.1021/acs.analchem.6b02479.s001
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.1021/acs.analchem.6b02479.s001
Dataset updated
May 31, 2023
Dataset provided by
ACS Publications
Authors
Corey D. Broeckling; Andrea Ganna; Mark Layer; Kevin Brown; Ben Sutton; Erik Ingelsson; Graham Peers; Jessica E. Prenni
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Liquid chromatography coupled to electrospray ionization-mass spectrometry (LC–ESI-MS) is a versatile and robust platform for metabolomic analysis. However, while ESI is a soft ionization technique, in-source phenomena including multimerization, nonproton cation adduction, and in-source fragmentation complicate interpretation of MS data. Here, we report chromatographic and mass spectrometric behavior of 904 authentic standards collected under conditions identical to a typical nontargeted profiling experiment. The data illustrate that the often high level of complexity in MS spectra is likely to result in misinterpretation during the annotation phase of the experiment and a large overestimation of the number of compounds detected. However, our analysis of this MS spectral library data indicates that in-source phenomena are not random but depend at least in part on chemical structure. These nonrandom patterns enabled predictions to be made as to which in-source signals are likely to be observed for a given compound. Using the authentic standard spectra as a training set, we modeled the in-source phenomena for all compounds in the Human Metabolome Database to generate a theoretical in-source spectrum and retention time library. A novel spectral similarity matching platform was developed to facilitate efficient spectral searching for nontargeted profiling applications. Taken together, this collection of experimental spectral data, predictive modeling, and informatic tools enables more efficient, reliable, and transparent metabolite annotation.
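The in-source species named in this description (cation adducts, multimers, in-source fragments) each shift the observed m/z by a predictable amount. The following is a minimal sketch of that arithmetic, not the authors' prediction model: the function name `in_source_mz` and the choice of species are assumptions made for illustration, and the mass shifts are standard monoisotopic values for singly charged positive ions.

```python
# Illustrative only: expected m/z values for a few common positive-mode
# in-source species of a neutral molecule with monoisotopic mass M.
# Each entry is (number of M units, mass shift in Da) for a 1+ ion.
IN_SOURCE_SPECIES = {
    "[M+H]+": (1, 1.007276),        # protonated molecule
    "[M+Na]+": (1, 22.989218),      # sodium adduct (nonproton cation)
    "[M+K]+": (1, 38.963158),       # potassium adduct
    "[2M+H]+": (2, 1.007276),       # protonated dimer (multimerization)
    "[M+H-H2O]+": (1, -17.003289),  # water loss (in-source fragmentation)
}


def in_source_mz(neutral_mass):
    """Return a dict mapping species label to expected m/z (charge 1)."""
    return {
        label: n_units * neutral_mass + shift
        for label, (n_units, shift) in IN_SOURCE_SPECIES.items()
    }


if __name__ == "__main__":
    glucose = 180.06339  # monoisotopic mass of C6H12O6
    for label, mz in in_source_mz(glucose).items():
        print(f"{label:12s} {mz:.4f}")
```

A single compound therefore yields several distinct m/z signals, which is why counting signals overestimates the number of compounds detected.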
Data from: MMMDB - Mouse Multiple tissue Metabolome DataBase
dknet.org
neuinfo.org
Updated Jan 29, 2022
Cite
(2022). MMMDB - Mouse Multiple tissue Metabolome DataBase [Dataset]. http://identifiers.org/RRID:SCR_006064
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_006064
Dataset updated
Jan 29, 2022
Description
MMMDB, the Mouse Multiple tissue Metabolome DataBase, is a freely available metabolomic database containing a collection of metabolites measured from multiple tissues of single mice. The datasets were collected using a single instrument and are not integrated from the literature, which is useful for capturing a holistic overview of large metabolomic pathways. Currently, data from cerebra, cerebella, thymus, spleen, lung, liver, kidney, heart, pancreas, testis, and plasma are provided. Non-targeted analyses were performed by capillary electrophoresis time-of-flight mass spectrometry (CE-TOFMS) and, therefore, both identified metabolites and unknown peaks (without a matched standard) were uploaded to this database. Not only quantified concentrations but also processed raw data such as electropherograms, mass spectra, and annotations (such as isotopes and fragments) are provided.
Metabolite and Tandem Mass Spectrometry Database
bioregistry.io
Updated Nov 16, 2021
Cite
(2021). Metabolite and Tandem Mass Spectrometry Database [Dataset]. http://identifiers.org/re3data:r3d100012311
Explore at:
Unique identifier
https://identifiers.org/re3data:r3d100012311
Dataset updated
Nov 16, 2021
Description
The METLIN (Metabolite and Tandem Mass Spectrometry) Database is a repository of metabolite information as well as tandem mass spectrometry data, providing public access to its comprehensive MS and MS/MS metabolite data. An annotated list of known metabolites with their mass, chemical formula, and structure is available, with each metabolite linked to external resources for further reference and inquiry.
Data from: FiehnLib: Mass Spectral and Retention Index Libraries for...
acs.figshare.com
figshare.com
xls
Updated May 30, 2023
Cite
Tobias Kind; Gert Wohlgemuth; Do Yup Lee; Yun Lu; Mine Palazoglu; Sevini Shahbaz; Oliver Fiehn (2023). FiehnLib: Mass Spectral and Retention Index Libraries for Metabolomics Based on Quadrupole and Time-of-Flight Gas Chromatography/Mass Spectrometry [Dataset]. http://doi.org/10.1021/ac9019522.s001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1021/ac9019522.s001
Dataset updated
May 30, 2023
Dataset provided by
ACS Publications
Authors
Tobias Kind; Gert Wohlgemuth; Do Yup Lee; Yun Lu; Mine Palazoglu; Sevini Shahbaz; Oliver Fiehn
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
At least two independent parameters are necessary for compound identification in metabolomics. We have compiled 2 212 electron impact mass spectra and retention indices for quadrupole and time-of-flight gas chromatography/mass spectrometry (GC/MS) for over 1 000 primary metabolites below 550 Da, covering lipids, amino acids, fatty acids, amines, alcohols, sugars, amino-sugars, sugar alcohols, sugar acids, organic phosphates, hydroxyl acids, aromatics, purines, and sterols as methoximated and trimethylsilylated mass spectra under electron impact ionization. Compounds were selected from different metabolic pathway databases. The structural diversity of the libraries was found to be highly overlapping with metabolites represented in the BioMeta/KEGG pathway database using chemical fingerprints and calculations using Instant-JChem. In total, the FiehnLib libraries comprised 68% more compounds and twice as many spectra with higher spectral diversity than the public Golm Metabolite Database. A range of unique compounds are present in the FiehnLib libraries that are not comprised in the 4 345 trimethylsilylated spectra of the commercial NIST05 mass spectral database. The libraries can be used in conjunction with GC/MS software but also support compound identification in the public BinBase metabolomic database that currently comprises 5 598 unique mass spectra generated from 19 032 samples covering 279 studies of 47 species (plants, animals, and microorganisms).
Mass Spectral Library
dknet.org
scicrunch.org
+1more
Updated Jan 29, 2022
Cite
(2022). Mass Spectral Library [Dataset]. http://identifiers.org/RRID:SCR_014668
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_014668
Dataset updated
Jan 29, 2022
Description
A library containing spectra of upwards of 200,000 chemical compounds. Spectra include metabolites, peptides, contaminants, and lipids. All spectra and chemical structures are examined by professionals.
Golm Metabolome Database
data.wu.ac.at
wsdl
Updated Oct 10, 2013
Cite
Global (2013). Golm Metabolome Database [Dataset]. https://data.wu.ac.at/odso/datahub_io/MTFkZDY2YjYtZmZjMS00YmYxLTk2MmUtZmQ0ODZjNGJjZWI3
Explore at:
Available download formats: wsdl
Dataset updated
Oct 10, 2013
Dataset provided by
Global
Description
The Golm Metabolome Database (GMD) facilitates the search for and dissemination of mass spectra from biologically active metabolites quantified using gas chromatography mass spectrometry (GC-MS). Academic users may download the material offered on the site for their non-commercial use, but all copyright and other proprietary notices contained in the materials are to be retained. Non-academic/commercial, for-profit users may use the GMD website and online database APIs/services, but any other use, in particular the download of any component of the GMD, requires a license agreement.
Data from: A database of high-resolution MS/MS spectra for lichen...
data.niaid.nih.gov
xml
Updated Jul 22, 2019
Cite
Damien Olivier (2019). A database of high-resolution MS/MS spectra for lichen metabolites [Dataset]. https://data.niaid.nih.gov/resources?id=mtbls999
Explore at:
Available download formats: xml
Dataset updated
Jul 22, 2019
Dataset provided by
ISCR, CORINT
Authors
Damien Olivier
Variables measured
Adduct, Metabolomics, Chemical class, Collision energy
Description
While analytical techniques in natural products research have massively shifted to liquid chromatography-mass spectrometry, lichen chemistry remains reliant on limited analytical methods, with Thin Layer Chromatography being the gold standard. To meet the modern standards of metabolomics within lichenochemistry, we announce the publication of an open access MS/MS library with 250 metabolites, coined LDB for Lichen DataBase, providing a comprehensive coverage of lichen chemodiversity. These were donated by the Berlin Garden and Botanical Museum from the collection of Siegfried Huneck to be analyzed by LC-MS/MS. Spectra at individual collision energies were submitted to MetaboLights while merged spectra were uploaded to the GNPS platform (CCMSLIB00004751209 to CCMSLIB00004751517). Technical validation was achieved by dereplicating three lichen extracts using a Molecular Networking approach, revealing the detection of eleven unique molecules that would have been missed without LDB implementation in GNPS. From a chemist's viewpoint, this database should help streamline the isolation of formerly unreported metabolites. From a taxonomist's perspective, the LDB offers a versatile tool for the chemical profiling of newly reported species.
Golm Metabolome Database GC-MS spectra
bioregistry.io
registry.identifiers.org
Updated Mar 8, 2022
Cite
(2022). Golm Metabolome Database GC-MS spectra [Dataset]. https://bioregistry.io/registry/gmd.gcms
Explore at:
Dataset updated
Mar 8, 2022
Description
Golm Metabolome Database (GMD) provides public access to custom mass spectral libraries, metabolite profiling experiments as well as additional information and tools. Analytes are subjected to a gas chromatograph coupled to a mass spectrometer, which records the mass spectrum and the retention time linked to an analyte. This collection references GC-MS spectra.
Metabolomics data
scidb.cn
Updated Mar 31, 2023
Cite
Ning Deyuan; Chen Guobing (2023). Metabolomics data [Dataset]. http://doi.org/10.57760/sciencedb.07845
Explore at:
Unique identifier
https://doi.org/10.57760/sciencedb.07845
Dataset updated
Mar 31, 2023
Dataset provided by
Science Data Bank
Authors
Ning Deyuan; Chen Guobing
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
We collected blood from 77 patients with mushroom poisoning, 28 patients with sepsis, and 31 healthy individuals for metabonomic analysis by LC-MS. Method: Liquid chromatography conditions: a Waters ACQUITY UPLC BEH C18 column (2.1 mm × 100 mm, 1.7 µm) was used for ESI+ mode and a UPLC HSS T3 column (2.1 mm × 100 mm, 1.8 µm) for ESI− mode. In ESI+ mode analysis, mobile phase A of the binary gradient elution system is ultrapure water (0.1% formic acid, v/v) and mobile phase B is acetonitrile (0.1% formic acid, v/v). Column temperature: 40 °C; flow rate: 0.40 mL/min; injection volume: 5 µL. Separation is accomplished through the following steps: Gradient: B starts at 5%, maintains the composition at 100% B for 1-24 minutes to 100%, and then holds for 27.5-27.6 minutes, 100-5% B, and 27.6-5% B for 30 minutes. In ESI− mode analysis, mobile phase A consists of water and 6.5 mM ammonium acetate, and mobile phase B contains a 95% methanol solution of 6.5 mM ammonium acetate. Separation is accomplished through the following gradients: B starts at 5%, reaches 18% in 100-1 min, the composition remains at 100% B for 18.1-22 min, and then remains at 22% B for 22-1.100 min, 5-22% B, and 1.25-5 min. Mass spectrum conditions: QE MS, ESI ion source, full scan mode, scan range: m/z 70-1000, mass spectrum resolution set to 70,000, MS/MS scan resolution set to 17,500. Sheath gas velocity: 35 mL/min, auxiliary gas velocity: 8 mL/min, capillary temperature: 350 °C, auxiliary heating temperature: 350 °C. Metabolomic data were obtained using XCMS software (1.50.1). Preprocessing generates a data matrix that includes retention time, mass-to-charge ratio (m/z) values, and peak intensity. All ions are normalized to the total peak area of each sample. A variable is retained in the dataset if more than 85% of its values in the two subsets are non-zero; otherwise, it is eliminated.
OSI-SMMS (version 1.0, Dalian Chemical Data Solutions Information Technology Co., Ltd.) is used for peak labeling. The data were analyzed through the EMBL-EBI metabolic database. GraphPad Prism is used to analyze and plot data for different metabolites. File: All data are stored in an Excel spreadsheet, with the first (green) row displaying the names of the different metabolites and the blue column displaying the patient type. The data are the peak area values detected by the original mass spectrometry method.
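The two preprocessing steps described above (normalization to total peak area, then a non-zero filter at 85%) can be sketched as follows. This is an illustration under stated assumptions, not the authors' XCMS pipeline: the function names are hypothetical, and the filter assumes one reading of the description, namely that a variable must be non-zero in more than 85% of samples in every subset.

```python
def normalize_to_total_area(sample_areas):
    """Divide each peak area by the summed peak area of the same sample.

    sample_areas: dict mapping feature id -> raw peak area for one sample.
    """
    total = sum(sample_areas.values())
    if total <= 0:
        raise ValueError("sample has no positive signal to normalize against")
    return {feature: area / total for feature, area in sample_areas.items()}


def passes_nonzero_filter(areas_by_subset, threshold=0.85):
    """Keep a variable only if its non-zero fraction exceeds the threshold.

    areas_by_subset: dict mapping subset name -> list of peak areas of ONE
    variable across the samples of that subset.
    Assumption: the rule must hold in every subset.
    """
    for areas in areas_by_subset.values():
        if not areas:
            return False
        nonzero_fraction = sum(1 for a in areas if a > 0) / len(areas)
        if nonzero_fraction <= threshold:  # "more than 85%" is strict
            return False
    return True


if __name__ == "__main__":
    print(normalize_to_total_area({"f1": 100.0, "f2": 300.0}))
    print(passes_nonzero_filter({"patients": [1.0] * 9 + [0.0],
                                 "controls": [2.0] * 10}))
```

If the intended rule is that passing in either subset is enough, the loop would return `True` on the first subset that passes instead.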
Data from: ISDB: In Silico Spectral Databases of Natural Products
zenodo.org
bin
Updated Aug 27, 2023
Cite
Pierre-Marie Allard; Jonathan Bisson; Adriano Rutz (2023). ISDB: In Silico Spectral Databases of Natural Products [Dataset]. http://doi.org/10.5281/zenodo.7534250
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.7534250
Dataset updated
Aug 27, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Pierre-Marie Allard; Jonathan Bisson; Adriano Rutz
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
An In Silico spectral DataBase (ISDB) of natural products calculated from structures aggregated in the frame of the LOTUS Initiative (https://doi.org/10.7554/eLife.70780).
Fragmented using cfm-predict 4 (https://doi.org/10.1021/acs.analchem.1c01465) .

In silico spectral database preparation and use for dereplication initially described in Integration of Molecular Networking and In-Silico MS/MS Fragmentation for Natural Products Dereplication https://doi.org/10.1021/ACS.ANALCHEM.5B04804

See https://github.com/mandelbrot-project/spectral_lib_builder for associated building scripts.

See https://github.com/mandelbrot-project/spectral_lib_matcher for associated matching scripts.
Metabolomics data for crude protein content in diets for Huangjiang...
scidb.cn
Updated Apr 12, 2024
Cite
Md. Abul Kalam Azad (2024). Metabolomics data for crude protein content in diets for Huangjiang mini-pigs [Dataset]. http://doi.org/10.57760/sciencedb.17962
Explore at:
Unique identifier
https://doi.org/10.57760/sciencedb.17962
Dataset updated
Apr 12, 2024
Dataset provided by
Science Data Bank
Authors
Md. Abul Kalam Azad
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Huangjiangzhen
Description
The metabolite contents in the jejunum and ileum of Huanjiang mini-pigs were determined using a non-targeted metabolomics approach with UPLC-HDMS. The metabolomics procedures included sample preparation, metabolite separation and detection, data preprocessing, and statistical analysis. For metabolite identification, approximately 25 mg of each sample was weighed into a 2-mL EP tube, and 500 µL of extract solution (acetonitrile:methanol:water = 2:2:1 (v/v), with the isotopically labeled internal standard mixture) was added to the tube. After 30 s of vortexing, the mixed samples were homogenized at 35 Hz for 4 min and sonicated in an ice-water bath for 5 min. The homogenization and sonication cycles were repeated three times. Then the samples were incubated for 1 h at -40 °C and centrifuged at 12,000 × g for 15 min at 4 °C. The resulting supernatants were filtered through a 0.22-µm membrane and transferred to fresh glass vials for further analysis. The quality control (QC) sample was obtained by mixing an equal aliquot of the supernatants from all samples. An ultra-performance liquid chromatography (UPLC) system (Vanquish, Thermo Fisher Scientific, Waltham, MA, USA) with a UPLC BEH Amide column (2.1 × 100 mm, 1.7 µm) coupled with a Q Exactive HFX mass spectrometer (Orbitrap MS, Thermo Fisher Scientific, Waltham, MA, USA) was used to perform LC-MS/MS analyses. Mobile phase A contained 25 mmol/L ammonium acetate and 25 mmol/L ammonium hydroxide in water, and mobile phase B was acetonitrile. The injection volume was 3 µL, and the temperature of the auto-sampler was set at 4 °C. To acquire MS/MS spectra in information-dependent acquisition (IDA) mode, the QE HFX mass spectrometer was operated under the control of the acquisition software (Xcalibur, Thermo Fisher Scientific, Waltham, MA, USA). In this mode, the acquisition software continuously evaluated the full-scan MS spectrum.
The conditions for the ESI source were set as follows: sheath gas flow rate 30 Arb, Aux gas flow rate 25 Arb, capillary temperature 350 °C, full MS resolution 60,000, MS/MS resolution 7,500, collision energy 10/30/60 in NCE mode, and spray voltage 3.60 kV (positive ion mode) or -3.20 kV (negative ion mode). For peak detection, extraction, alignment, and integration, the raw data were converted into mzXML format by ProteoWizard and then processed with an in-house program, which was developed using R and based on XCMS. The metabolites were annotated using an in-house MS2 (secondary mass spectrometry) database (BiotreeDB v2.1). The value of the cutoff was 0.3. PCA and orthogonal partial least squares discriminant analysis (OPLS-DA) models were established with the SIMCA software v.16.0.2 (Sartorius Stedim Data Analytics AB, Umeå, Sweden) to visualize the distinction and detect differential metabolites among different CP content groups. Moreover, the Kyoto Encyclopedia of Genes and Genomes (KEGG) and MetaboAnalyst 5.0 were used for pathway analysis.
Data from: Leaf metabolic traits reveal hidden dimensions of plant form and...
data.niaid.nih.gov
datadryad.org
+1more
zip
Updated Jul 31, 2023
Cite
Tom Walker (2023). Leaf metabolic traits reveal hidden dimensions of plant form and function [Dataset]. http://doi.org/10.5061/dryad.zpc866tdn
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.zpc866tdn
Dataset updated
Jul 31, 2023
Dataset provided by
University of Neuchâtel
Authors
Tom Walker
License
CC0 1.0: https://spdx.org/licenses/CC0-1.0.html
Description
In this study, we interpreted leaf metabolome variation among 457 tropical and 339 temperate plant species to understand how the metabolome contributes to macroecological variation in plant functioning. Metabolome data were generated using liquid chromatography mass spectrometry, annotated with compound names (where possible), and cross-referenced against chemoinformatics databases to derive metabolite chemical properties. We then compared variation in leaf metabolite chemical properties among species with variation in classical plant functional traits.
Automated Label-free Quantification of Metabolites from Liquid...
data.niaid.nih.gov
xml
Updated Dec 16, 2015
Cite
Erhan Kenar (2015). Automated Label-free Quantification of Metabolites from Liquid Chromatography–Mass Spectrometry Data (Simulated) [Dataset]. https://data.niaid.nih.gov/resources?id=mtbls235
Explore at:
Available download formats: xml
Dataset updated
Dec 16, 2015
Dataset provided by
Quantitative Biology Center Tübingen
Authors
Erhan Kenar
Variables measured
Metabolomics, simulated mass error, simulated detector noise, simulated error profile distortion
Description
Liquid chromatography coupled to mass spectrometry (LC-MS) has become a standard technology in metabolomics. In particular, label-free quantification based on LC-MS is easily amenable to large-scale studies and thus well suited to clinical metabolomics. Large-scale studies, however, require automated processing of the large and complex LC-MS datasets. We present a novel algorithm for the detection of mass traces and their aggregation into features (i.e. all signals caused by the same analyte species) that is computationally efficient and sensitive and that leads to reproducible quantification results. The algorithm is based on a sensitive detection of mass traces, which are then assembled into features based on mass-to-charge spacing, co-elution information, and a support vector machine–based classifier able to identify potential metabolite isotope patterns. The algorithm is not limited to metabolites but is applicable to a wide range of small molecules (e.g. lipidomics, peptidomics), as well as to other separation technologies. We assessed the algorithm's robustness with regard to varying noise levels on synthetic data and then validated the approach on experimental data investigating human plasma samples. We obtained excellent results in a fully automated data-processing pipeline with respect to both accuracy and reproducibility. Relative to state-of-the-art algorithms, ours demonstrated increased precision and recall. The algorithm is available as part of the open-source software package OpenMS and runs on all major operating systems. Simulated data is reported in the current study MTBLS235. Plasma data is reported in MTBLS234.
Exposomics Spectral Library
zenodo.org
data.niaid.nih.gov
Updated Apr 20, 2020
Cite
Biswapriya B. Misra (2020). Exposomics Spectral Library [Dataset]. http://doi.org/10.5281/zenodo.3755855
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.3755855
Dataset updated
Apr 20, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Biswapriya B. Misra
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Title: Repurposing Public Metabolomics Datasets for Construction of an Exposomics Spectral Library

Introduction

Publicly archived metabolomics datasets from diverse human biosamples provide an opportunity to repurpose the shared datasets for further exploratory analysis into human health. Although the endogenous metabolome is most often implicated in disease research as a source of biomarkers, the growing role of the exposome in human health underscores the need to identify chemical exposures in human samples. In this regard, I explored the possibility of finding previously unreported exposomal compounds (i.e., anthropogenic, industrial, dietary, and microbial chemicals) among the true unknowns in these studied datasets. Using in silico spectral library matching followed by molecular structure prediction approaches, the aim of this study is to recognize the exposome and to minimize the gap in the potential number of true exposomic substances in biosamples.

Methods

Raw metabolomics (GC-MS) datasets were downloaded from Metabolomics Workbench, GNPS, and MetaboLights using the keywords ‘human, GC-MS, serum, plasma, muscle, liver, kidney’. The vendor-formatted mass spectrometry datasets were converted to .mzML format using MSConvertGUI (ProteoWizard) for data processing and spectral library (GOLM, MoNA, FiehnLib, MassBank) matching using MS-DIAL. For EI-MS spectral annotation, the identity was confirmed by the presence of [M−CH₃]⁺, [M+H]⁺, [M+C₂H₅]⁺ and [M+C₃H₅]⁺ and using Global Natural Products Social (GNPS) molecular networking. Exposomal metabolites were separated from the rest based on identifiers at the Blood Exposome DB. True unassigned spectra were further interrogated using MS-FINDER for structural prediction. Exported spectra in .msp and .txt formats were pooled into a single file for free public download and use.

Preliminary data

The pooled GC-MS datasets (50) obtained from the three repositories were from multiple human samples and multiple vendors, and were generated using multiple mass analyzers (single and triple quads, ToFs, and Orbitraps). The .mzML files underwent preprocessing (deconvolution, peak picking, and peak alignment) followed by compound identification using MS-DIAL and GNPS tools. Processing parameters for the datasets were optimized individually in a study-specific manner. Altogether, the data resulted in spectral assignment of approx. 400 compounds of endogenous origin, associated with a KEGG and HMDB identifier relating to generic metabolic pathways, using only open source spectral libraries. Given the extremely limited overlap between spectral libraries, I used a pooled spectral library generated from all available open source spectral data. Further, 350 unassigned spectra (displaying insufficient matching scores for an assignment, i.e., < 500; with S/N >25 in each dataset) were interrogated using MS-FINDER and the Global Natural Products Social (GNPS) molecular networking approach (both cosine score, > 0.5; balance score, > 0.9), which resulted in annotation of 250 exposomic compounds. Using ClassyFire, the exposomal compounds (InChIs) were assigned a hierarchical chemical classification, which indicated a diverse origin of these compounds, ranging from medications, industrial chemicals, and pollutants to phytochemicals of dietary origin. The assigned spectra were individually manually curated and then compiled as a single file, the ‘Exposomics Spectral Library’, available to the public in .txt and .msp file formats for free use at 10.5281/zenodo.3755855.
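The cosine score threshold mentioned above (> 0.5) refers to a spectral similarity measure. Below is a minimal sketch of a plain cosine score between two centroided spectra, assuming peaks are binned to nominal (integer) m/z as is common for unit-resolution EI spectra. The function name is hypothetical, and the scoring actually used by GNPS or MS-DIAL differs in detail (peak matching tolerances, intensity weighting).

```python
import math


def cosine_score(spectrum_a, spectrum_b):
    """Plain cosine similarity between two spectra.

    Each spectrum is a list of (mz, intensity) pairs. Peaks are binned to
    the nearest integer m/z (an assumption suited to unit-mass EI data).
    Returns a value between 0.0 (no shared peaks) and 1.0 (same pattern).
    """
    def bin_peaks(spectrum):
        binned = {}
        for mz, intensity in spectrum:
            key = round(mz)
            binned[key] = binned.get(key, 0.0) + intensity
        return binned

    a = bin_peaks(spectrum_a)
    b = bin_peaks(spectrum_b)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = [(73.0, 100.0), (147.1, 40.0), (217.1, 25.0)]
    reference = [(73.0, 90.0), (147.0, 45.0), (204.1, 30.0)]
    score = cosine_score(query, reference)
    print(f"cosine score: {score:.3f}, passes 0.5 cutoff: {score > 0.5}")
```

Because the score depends only on the relative peak pattern, scaling all intensities of one spectrum by a constant leaves it unchanged.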
GMD
scicrunch.org
Updated Jun 23, 2025
Cite
(2025). GMD [Dataset]. http://identifiers.org/RRID:nif-0000-21180
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_006625 https://identifiers.org/RRID:nif-0000-21180
Dataset updated
Jun 23, 2025
Description
It facilitates the search for and dissemination of mass spectra from biologically active metabolites quantified using gas chromatography (GC) coupled to mass spectrometry (MS). Use the Search Page to search for a compound of interest, using the name, mass, formula, InChI, etc. as query input. Additionally, a Library Search service enables the search of user-submitted mass spectra within the GMD. In parallel to the library search, a prediction of chemical sub-groups is performed. This approach has reached beta level and a publication is currently under review. Using several sub-group-specific Decision Trees (DTs), mass spectra are classified with respect to the presence of the chemical moieties within the linked (unknown) compound. Prediction of functional groups (MS analysis) facilitates the search of metabolites within the GMD by means of user-submitted GC-MS spectra consisting of retention index (n-alkanes, if available) and mass intensity ratios. In addition, functional group prediction helps to characterize those metabolites without available reference mass spectra included in the GMD so far. Instead, the unknown metabolite is characterized by the predicted presence or absence of functional groups. For power users, this functionality is exposed as SOAP-based web services. Functional group prediction of compounds by means of GC-EI-MS spectra using Microsoft Analysis Services decision trees: all currently available trained decision trees and sub-structure predictions are provided by the GMD interface. The table describes the functional group, optional use of an RI system, record date of the trained decision tree, and number of MSTs with the proportion of MSTs linked to metabolites with the functional group present for each tree. The average and standard deviation of the 50-fold CV error, namely the ratio of falsely over correctly sorted MSTs in the trained DT, are listed.
The GMD website offers a range of mass spectral reference libraries to academic users which can be downloaded free of charge in various electronic formats. The libraries are constituted by base peak normalized consensus spectra of single analytes and contain masses in the range 70 to 600 amu, while the ubiquitous mass fragments typically generated from compounds carrying a trimethylsilyl-moiety, namely the fragments at m/z 73, 74, 75, 147, 148, and 149, were excluded.
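The library conventions described above (mass range 70 to 600, exclusion of the trimethylsilyl-related fragments, base peak normalization) can be mimicked on a user spectrum before comparison. This is a sketch under assumptions, not GMD's own code: the function name is hypothetical, the base peak is scaled to 100 as an arbitrary choice, and normalization is assumed to happen after the exclusions.

```python
# m/z values listed in the description as ubiquitous TMS-derived fragments.
TMS_FRAGMENTS = frozenset({73, 74, 75, 147, 148, 149})


def gmd_style_spectrum(peaks, low=70, high=600, base_peak_value=100.0):
    """Filter and base-peak-normalize a nominal-mass spectrum.

    peaks: dict mapping integer m/z -> intensity.
    Keeps m/z within [low, high], drops the TMS fragments, then rescales
    so the most intense remaining peak equals base_peak_value.
    """
    kept = {
        mz: intensity
        for mz, intensity in peaks.items()
        if low <= mz <= high and mz not in TMS_FRAGMENTS
    }
    if not kept:
        return {}
    base = max(kept.values())
    if base <= 0:
        return {}
    return {mz: base_peak_value * i / base for mz, i in kept.items()}


if __name__ == "__main__":
    raw = {69: 10.0, 73: 999.0, 147: 500.0, 217: 200.0, 361: 50.0, 650: 5.0}
    print(gmd_style_spectrum(raw))
```

In the usage example the dominant m/z 73 and 147 ions are discarded, so m/z 217 becomes the base peak of the comparison spectrum.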
YMDB - Yeast Metabolome Database
dknet.org
scicrunch.org
+1more
Updated Aug 12, 2024
Cite
(2024). YMDB - Yeast Metabolome Database [Dataset]. http://identifiers.org/RRID:SCR_005890
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_005890
Dataset updated
Aug 12, 2024
Description
A manually curated database of small molecule metabolites found in or produced by Saccharomyces cerevisiae (also known as Baker's yeast and Brewer's yeast). This database covers metabolites described in textbooks, scientific journals, metabolic reconstructions and other electronic databases. YMDB contains metabolites arising from normal S. cerevisiae metabolism under defined laboratory conditions as well as metabolites generated by S. cerevisiae when used in baking and in the production of wines, beers and spirits. YMDB currently contains 2027 small molecules with 857 associated enzymes and 138 associated transporters. Each small molecule has 48 data fields describing the metabolite, its chemical properties and links to spectral and chemical databases. Each enzyme/transporter is linked to its associated metabolites and has 30 data fields describing both the gene and corresponding protein. Users may search through the YMDB using a variety of database-specific tools. The simple text query supports general text queries of the textual component of the database. By selecting either metabolites or proteins in the search for field it is possible to restrict the search and the returned results to only those data associated with metabolites or with proteins. Clicking on the Browse button generates a tabular synopsis of YMDB's content. This browser view allows users to casually scroll through the database or re-sort its contents. Clicking on a given MetaboCard button brings up the full data content for the corresponding metabolite. A complete explanation of all the YMDB fields and sources is available. Under the Search link users will find a number of search options listed in a pull-down menu. The Chem Query option allows users to draw (using MarvinSketch applet or a ChemSketch applet) or to type (SMILES string) a chemical compound and to search the YMDB for chemicals similar or identical to the query compound. 
The Advanced Search option supports a more sophisticated text search of the text portion of YMDB. The Sequence Search button allows users to conduct BLASTP (protein) sequence searches of all sequences contained in YMDB. Both single and multiple sequence (i.e. whole proteome) BLAST queries are supported. YMDB also supports a Data Extractor option that allows specific data fields or combinations of data fields to be searched and/or extracted. Spectral searches of YMDB's reference compound NMR and MS spectral data are also supported through its MS, MS/MS, GC/MS and NMR Spectra Search links. Users may download YMDB's complete textual data, chemical structures and sequence data by clicking on the Download button.
Large-Scale Prediction of Collision Cross-Section Values for Metabolites in...
figshare.com
zip
Updated Nov 9, 2016
Cite
Zhiwei Zhou; Xiaotao Shen; Jia Tu; Zheng-Jiang Zhu (2016). Large-Scale Prediction of Collision Cross-Section Values for Metabolites in Ion Mobility-Mass Spectrometry [Dataset]. http://doi.org/10.1021/acs.analchem.6b03091.s002
Available download formats
zip
Unique identifier
https://doi.org/10.1021/acs.analchem.6b03091.s002
Dataset updated
Nov 9, 2016
Dataset provided by
ACS Publications
Authors
Zhiwei Zhou; Xiaotao Shen; Jia Tu; Zheng-Jiang Zhu
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
The rapid development of metabolomics has significantly advanced health and disease related research. However, metabolite identification remains a major analytical challenge for untargeted metabolomics. While the use of collision cross-section (CCS) values obtained in ion mobility-mass spectrometry (IM-MS) effectively increases identification confidence of metabolites, it is restricted by the limited number of available CCS values for metabolites. Here, we demonstrated the use of a machine-learning algorithm called support vector regression (SVR) to develop a prediction method that utilized 14 common molecular descriptors to predict CCS values for metabolites. In this work, we first experimentally measured CCS values (ΩN2) of ∼400 metabolites in nitrogen buffer gas and used these values as training data to optimize the prediction method. The high prediction precision of this method was externally validated using an independent set of metabolites with a median relative error (MRE) of ∼3%, better than conventional theoretical calculation. Using the SVR based prediction method, a large-scale predicted CCS database was generated for 35 203 metabolites in the Human Metabolome Database (HMDB). For each metabolite, five different ion adducts in positive and negative modes were predicted, accounting for 176 015 CCS values in total. Finally, improved metabolite identification accuracy was demonstrated using real biological samples. Conclusively, our results proved that the SVR based prediction method can accurately predict nitrogen CCS values (ΩN2) of metabolites from molecular descriptors and effectively improve identification accuracy and efficiency in untargeted metabolomics. The predicted CCS database, namely, MetCCS, is freely available on the Internet.
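The validation metric and the database size quoted in this description can be checked with a short sketch. It assumes that the median relative error is the median of the absolute relative errors between predicted and measured CCS values, expressed in percent; the function name and the sample values are hypothetical and are not taken from the dataset.

```python
from statistics import median


def median_relative_error(predicted, measured):
    """Median of |predicted - measured| / measured, in percent (assumed definition)."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(p - m) / m * 100.0 for p, m in zip(predicted, measured)]
    return median(errors)


# Hypothetical CCS values, invented for illustration only.
measured_ccs = [150.0, 200.0, 250.0]
predicted_ccs = [153.0, 196.0, 255.0]
mre_percent = median_relative_error(predicted_ccs, measured_ccs)

# Counts quoted in the description: 35 203 metabolites, five ion adducts each.
total_ccs_values = 35203 * 5
```

The product of the two quoted counts reproduces the 176 015 predicted CCS values stated in the description.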
MassBank of North America
neuinfo.org
Updated Oct 16, 2019
Cite
(2019). MassBank of North America [Dataset]. http://identifiers.org/RRID:SCR_015536
Unique identifier
https://identifiers.org/RRID:SCR_015536
Dataset updated
Oct 16, 2019
Description
Metadata-centric, auto-curating repository designed for storage and querying of mass spectral records. It contains metabolite mass spectra, metadata and associated compounds.
Untargeted metabolomics raw data
entrepot.recherche.data.gouv.fr
tsv, zip
Updated Feb 4, 2025
Cite
Sarah Jane Cookson; Grégoire Loupit; Pierre Petriaq; Josep Valls-Fonayet (2025). Untargeted metabolomics raw data [Dataset]. http://doi.org/10.57745/GJRUQG
Available download formats
zip (13646326467), tsv (6433)
Unique identifier
https://doi.org/10.57745/GJRUQG
Dataset updated
Feb 4, 2025
Dataset provided by
Recherche Data Gouv
Authors
Sarah Jane Cookson; Grégoire Loupit; Pierre Petriaq; Josep Valls-Fonayet
License
https://spdx.org/licenses/etalab-2.0.html
Description
Semi-polar compounds, including primary and secondary metabolites, were extracted from 35 mg of fresh powder using automated high-throughput ethanol extraction procedures at the MetaboHUB-Bordeaux Metabolome (https://metabolome.u-bordeaux.fr/), following previously established protocols (Luna et al., 2020). All samples were randomised and injected alternately with extraction blanks (prepared without plant material and used to rule out potential contaminants detected by untargeted metabolomics) and 13 quality control (QC) samples prepared by mixing 10 µL from each sample. QC samples were injected every 8 runs and used to correct signal drift during the analytical batch and to calculate the coefficient of variation of each metabolomic feature, so that only the most robust features are retained for chemometrics (Broadhurst et al., 2018). Untargeted analysis was performed on a Vanquish UHPLC (Thermo Fisher Scientific) coupled to a Q-Exactive Plus mass spectrometer (Thermo Fisher Scientific). One µL of sample was injected on a Phenomenex Luna® Omega Polar C18 column (50 x 2.1 mm, 1.6 µm) at 40°C, with a gradient of solvent A (milliQ water with 0.1% formic acid) and solvent B (acetonitrile with 0.1% formic acid) at a flow rate of 0.5 mL min⁻¹. The gradient elution was set as follows: 0-11.5 min: 1-40% solvent B; 11.5-12.5 min: 40-95% solvent B; 12.5-14 min: 95% solvent B; 14.5-16 min: 1% solvent B. The mass spectrometry data were acquired in negative polarity at a resolution of 140,000 FWHM, with an automatic gain control (AGC) target of 3e6 and a maximum injection time (IT) of 100 ms. The source conditions were as follows: spray voltage: 3000 V; sheath gas: 45 a.u.; auxiliary gas: 15 a.u.; capillary temperature: 320°C; probe heater temperature: 250°C; S-lens RF level: 100. The experiments were run in full-scan mode (mass range: 70-1050 m/z) with data-dependent MS2 on the top three precursors, using normalized collision energies of 15, 30 and 45 and a dynamic exclusion of 5 s.
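The QC-based filtering step mentioned in this description can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the 30% cut-off, the function names and the intensity values are all assumptions introduced here, and the coefficient of variation is taken as the sample standard deviation divided by the mean of a feature across the QC injections.

```python
from statistics import mean, stdev


def qc_coefficient_of_variation(intensities):
    """Coefficient of variation (percent) of one feature across QC injections."""
    m = mean(intensities)
    if m == 0:
        return float("inf")  # a feature with zero mean signal is never robust
    return stdev(intensities) / m * 100.0


def retain_robust_features(qc_table, max_cv_percent=30.0):
    """Keep the features whose QC coefficient of variation is within the cut-off.

    The 30% default is a hypothetical threshold, not a value from the dataset.
    """
    return [
        name
        for name, values in qc_table.items()
        if qc_coefficient_of_variation(values) <= max_cv_percent
    ]


# Hypothetical QC intensities for two features, invented for illustration only.
qc_table = {
    "feature_A": [100.0, 102.0, 98.0, 101.0],   # stable across QC injections
    "feature_B": [100.0, 40.0, 180.0, 90.0],    # highly variable
}
robust = retain_robust_features(qc_table)
```

With these invented values only the stable feature passes the filter; the appropriate cut-off for a real batch depends on the study design.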

(2025). METLIN [Dataset]. http://identifiers.org/RRID:SCR_010500/resolver?q=*&i=rrid

METLIN

RRID:SCR_010500, nlx_158116, METLIN (RRID:SCR_010500), METLIN, Metabolite and Tandem MS Database (METLIN), METLIN Metabolite Database, Metabolite and Tandem MS Database

Explore at:

11 scholarly articles cite this dataset (View in Google Scholar)

Unique identifier

https://identifiers.org/RRID:SCR_010500 https://identifiers.org/RRID:SCR_010500/resolver?q=*&i=rrid

Dataset updated

Jun 17, 2025

Description

A public repository of metabolite information as well as tandem mass spectrometry data is provided to facilitate metabolomics experiments. It contains structures and represents a data management system designed to assist in a broad array of metabolite research and metabolite identification. An annotated list of known metabolites and their mass, chemical formula, and structure are available. Each metabolite is linked to outside resources for further reference and inquiry. MS/MS data is also available on many of the metabolites.
