100+ datasets found
  1. METLIN

    • rrid.site
    • scicrunch.org
    • +2more
    Updated Jun 17, 2025
    Cite
    (2025). METLIN [Dataset]. http://identifiers.org/RRID:SCR_010500/resolver?q=*&i=rrid
    Dataset updated
    Jun 17, 2025
    Description

    A public repository of metabolite information as well as tandem mass spectrometry data is provided to facilitate metabolomics experiments. It contains structures and represents a data management system designed to assist in a broad array of metabolite research and metabolite identification. An annotated list of known metabolites and their mass, chemical formula, and structure are available. Each metabolite is linked to outside resources for further reference and inquiry. MS/MS data is also available on many of the metabolites.

  2. Enabling Efficient and Confident Annotation of LC−MS Metabolomics Data...

    • acs.figshare.com
    zip
    Updated May 31, 2023
    Cite
    Corey D. Broeckling; Andrea Ganna; Mark Layer; Kevin Brown; Ben Sutton; Erik Ingelsson; Graham Peers; Jessica E. Prenni (2023). Enabling Efficient and Confident Annotation of LC−MS Metabolomics Data through MS1 Spectrum and Time Prediction [Dataset]. http://doi.org/10.1021/acs.analchem.6b02479.s001
    Dataset updated
    May 31, 2023
    Dataset provided by
    ACS Publications
    Authors
    Corey D. Broeckling; Andrea Ganna; Mark Layer; Kevin Brown; Ben Sutton; Erik Ingelsson; Graham Peers; Jessica E. Prenni
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Liquid chromatography coupled to electrospray ionization-mass spectrometry (LC–ESI-MS) is a versatile and robust platform for metabolomic analysis. However, while ESI is a soft ionization technique, in-source phenomena including multimerization, nonproton cation adduction, and in-source fragmentation complicate interpretation of MS data. Here, we report chromatographic and mass spectrometric behavior of 904 authentic standards collected under conditions identical to a typical nontargeted profiling experiment. The data illustrate that the often high level of complexity in MS spectra is likely to result in misinterpretation during the annotation phase of the experiment and a large overestimation of the number of compounds detected. However, our analysis of this MS spectral library data indicates that in-source phenomena are not random but depend at least in part on chemical structure. These nonrandom patterns enabled predictions to be made as to which in-source signals are likely to be observed for a given compound. Using the authentic standard spectra as a training set, we modeled the in-source phenomena for all compounds in the Human Metabolome Database to generate a theoretical in-source spectrum and retention time library. A novel spectral similarity matching platform was developed to facilitate efficient spectral searching for nontargeted profiling applications. Taken together, this collection of experimental spectral data, predictive modeling, and informatic tools enables more efficient, reliable, and transparent metabolite annotation.

  3. Data from: MMMDB - Mouse Multiple tissue Metabolome DataBase

    • dknet.org
    • neuinfo.org
    Updated Jan 29, 2022
    Cite
    (2022). MMMDB - Mouse Multiple tissue Metabolome DataBase [Dataset]. http://identifiers.org/RRID:SCR_006064
    Dataset updated
    Jan 29, 2022
    Description

    MMMDB, Mouse Multiple tissue Metabolome DataBase, is a freely available metabolomic database containing a collection of metabolites measured from multiple tissues from single mice. The datasets were collected using a single instrument rather than integrated from the literature, which is useful for capturing a holistic overview of large metabolic pathways. Currently, data from cerebra, cerebella, thymus, spleen, lung, liver, kidney, heart, pancreas, testis, and plasma are provided. Non-targeted analyses were performed by capillary electrophoresis time-of-flight mass spectrometry (CE-TOFMS); therefore, both identified metabolites and unknown peaks (without a matched standard) were uploaded to this database. Not only quantified concentrations but also processed raw data such as electropherograms, mass spectra, and annotations (such as isotopes and fragments) are provided.

  4. Metabolite and Tandem Mass Spectrometry Database

    • bioregistry.io
    Updated Nov 16, 2021
    Cite
    (2021). Metabolite and Tandem Mass Spectrometry Database [Dataset]. http://identifiers.org/re3data:r3d100012311
    Dataset updated
    Nov 16, 2021
    Description

    The METLIN (Metabolite and Tandem Mass Spectrometry) Database is a repository of metabolite information as well as tandem mass spectrometry data, providing public access to its comprehensive MS and MS/MS metabolite data. An annotated list of known metabolites and their mass, chemical formula, and structure are available, with each metabolite linked to external resources for further reference and inquiry.

  5. Data from: FiehnLib: Mass Spectral and Retention Index Libraries for...

    • acs.figshare.com
    • figshare.com
    xls
    Updated May 30, 2023
    Cite
    Tobias Kind; Gert Wohlgemuth; Do Yup Lee; Yun Lu; Mine Palazoglu; Sevini Shahbaz; Oliver Fiehn (2023). FiehnLib: Mass Spectral and Retention Index Libraries for Metabolomics Based on Quadrupole and Time-of-Flight Gas Chromatography/Mass Spectrometry [Dataset]. http://doi.org/10.1021/ac9019522.s001
    Dataset updated
    May 30, 2023
    Dataset provided by
    ACS Publications
    Authors
    Tobias Kind; Gert Wohlgemuth; Do Yup Lee; Yun Lu; Mine Palazoglu; Sevini Shahbaz; Oliver Fiehn
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    At least two independent parameters are necessary for compound identification in metabolomics. We have compiled 2 212 electron impact mass spectra and retention indices for quadrupole and time-of-flight gas chromatography/mass spectrometry (GC/MS) for over 1 000 primary metabolites below 550 Da, covering lipids, amino acids, fatty acids, amines, alcohols, sugars, amino-sugars, sugar alcohols, sugar acids, organic phosphates, hydroxyl acids, aromatics, purines, and sterols as methoximated and trimethylsilylated mass spectra under electron impact ionization. Compounds were selected from different metabolic pathway databases. The structural diversity of the libraries was found to be highly overlapping with metabolites represented in the BioMeta/KEGG pathway database using chemical fingerprints and calculations using Instant-JChem. In total, the FiehnLib libraries comprised 68% more compounds and twice as many spectra with higher spectral diversity than the public Golm Metabolite Database. A range of unique compounds are present in the FiehnLib libraries that are not comprised in the 4 345 trimethylsilylated spectra of the commercial NIST05 mass spectral database. The libraries can be used in conjunction with GC/MS software but also support compound identification in the public BinBase metabolomic database that currently comprises 5 598 unique mass spectra generated from 19 032 samples covering 279 studies of 47 species (plants, animals, and microorganisms).

  6. Mass Spectral Library

    • dknet.org
    • scicrunch.org
    • +1more
    Updated Jan 29, 2022
    Cite
    (2022). Mass Spectral Library [Dataset]. http://identifiers.org/RRID:SCR_014668
    Dataset updated
    Jan 29, 2022
    Description

    A library containing spectra for upwards of 200,000 chemical compounds. Spectra include metabolites, peptides, contaminants, and lipids. All spectra and chemical structures are examined by professionals.

  7. Golm Metabolome Database

    • data.wu.ac.at
    wsdl
    Updated Oct 10, 2013
    Cite
    Global (2013). Golm Metabolome Database [Dataset]. https://data.wu.ac.at/odso/datahub_io/MTFkZDY2YjYtZmZjMS00YmYxLTk2MmUtZmQ0ODZjNGJjZWI3
    Dataset updated
    Oct 10, 2013
    Dataset provided by
    Global
    Description

    The Golm Metabolome Database (GMD) facilitates the search for and dissemination of mass spectra from biologically active metabolites quantified using gas chromatography mass spectrometry (GC-MS). Academic users may download the material offered on the site for their non-commercial use, but all copyright and other proprietary notices contained in the materials are to be retained. Non-academic/commercial, for-profit users may use the GMD website and online database APIs/services, but any other use, in particular the download of any component of the GMD, requires a license agreement.

  8. Data from: A database of high-resolution MS/MS spectra for lichen...

    • data.niaid.nih.gov
    xml
    Updated Jul 22, 2019
    Cite
    Damien Olivier (2019). A database of high-resolution MS/MS spectra for lichen metabolites [Dataset]. https://data.niaid.nih.gov/resources?id=mtbls999
    Dataset updated
    Jul 22, 2019
    Dataset provided by
    ISCR, CORINT
    Authors
    Damien Olivier
    Variables measured
    Adduct, Metabolomics, Chemical class, Collision energy
    Description

    While analytical techniques in natural products research have massively shifted to liquid chromatography-mass spectrometry, lichen chemistry remains reliant on limited analytical methods, with thin-layer chromatography being the gold standard. To meet the modern standards of metabolomics within lichenochemistry, we announce the publication of an open access MS/MS library with 250 metabolites, coined LDB for Lichen DataBase, providing comprehensive coverage of lichen chemodiversity. These were donated by the Berlin Garden and Botanical Museum from the collection of Siegfried Huneck to be analyzed by LC-MS/MS. Spectra at individual collision energies were submitted to MetaboLights, while merged spectra were uploaded to the GNPS platform (CCMSLIB00004751209 to CCMSLIB00004751517). Technical validation was achieved by dereplicating three lichen extracts using a Molecular Networking approach, revealing the detection of eleven unique molecules that would have been missed without the implementation of LDB in GNPS. From a chemist's viewpoint, this database should help streamline the isolation of formerly unreported metabolites. From a taxonomist's perspective, the LDB offers a versatile tool for the chemical profiling of newly reported species.

  9. Golm Metabolome Database GC-MS spectra

    • bioregistry.io
    • registry.identifiers.org
    Updated Mar 8, 2022
    Cite
    (2022). Golm Metabolome Database GC-MS spectra [Dataset]. https://bioregistry.io/registry/gmd.gcms
    Dataset updated
    Mar 8, 2022
    Description

    Golm Metabolome Database (GMD) provides public access to custom mass spectral libraries, metabolite profiling experiments as well as additional information and tools. Analytes are subjected to a gas chromatograph coupled to a mass spectrometer, which records the mass spectrum and the retention time linked to an analyte. This collection references GC-MS spectra.

  10. Metabolomics data

    • scidb.cn
    Updated Mar 31, 2023
    Cite
    Ning Deyuan; Chen Guobing (2023). Metabolomics data [Dataset]. http://doi.org/10.57760/sciencedb.07845
    Dataset updated
    Mar 31, 2023
    Dataset provided by
    Science Data Bank
    Authors
    Ning Deyuan; Chen Guobing
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    We collected the blood of 77 patients with mushroom poisoning, 28 patients with sepsis, and 31 healthy individuals for metabonomic analysis by LC-MS.

    Method: Liquid chromatography conditions: a Waters ACQUITY UPLC BEH C18 column (1.7 µm, 2.1 mm × 100 mm) was used for ESI+ mode and a UPLC HSS T3 column (2.1 mm × 100 mm, 1.8 µm) for ESI− mode. In ESI+ mode analysis, mobile phase A of the binary gradient elution system is ultrapure water (0.1% formic acid, v/v) and mobile phase B is acetonitrile (0.1% formic acid, v/v); column temperature: 40 °C; flow rate: 0.40 mL/min; injection volume: 5 µL. Separation is accomplished through the following steps: Gradient: B starts at 5%, maintains the composition at 100% B for 1-24 minutes to 100%, and then holds for 27.5-27.6 minutes, 100-5% B, and 27.6-5% B for 30 minutes. In ESI− mode analysis, mobile phase A consists of water with 6.5 mM ammonium acetate, and mobile phase B is a 95% methanol solution with 6.5 mM ammonium acetate. Separation is accomplished through the following gradients: B starts at 5%, reaches 18% in 100-1 min, the composition remains at 100% B for 18.1-22 min, and then remains at 22% B for 22-1.100 min, 5-22% B, and 1.25-5 min.

    Mass spectrum conditions: QE MS, ESI ion source, full scan mode, scan range: 70-1000 m/z, mass spectrum resolution set to 70000, MS/MS scan resolution set to 17500. Sheath gas flow rate: 35 mL/min; auxiliary gas flow rate: 8 mL/min; capillary temperature: 350 °C; auxiliary heating temperature: 350 °C.

    Metabolomic data were obtained using XCMS software (1.50.1). Preprocessing generates a data matrix that includes retention time, mass-to-charge ratio (m/z) values, and peak intensity. All ions are normalized to the total peak area of each sample. If more than 85% of the variables in two subsets of a variable are non-zero, the variable remains in the dataset; otherwise, it is eliminated. OSI-SMMS (version 1.0, Dalian Chemical Data Solutions Information Technology Co., Ltd.) is used for peak labeling. The data were analyzed through the EMBL-EBI metabolic database. GraphPad Prism is used to analyze and plot data for different metabolites.

    File: All data are stored in an Excel spreadsheet, with the first green row displaying the names of different metabolites and the blue column displaying the patient type. The data are the peak area values detected by the original mass spectrometry method.
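
    The total-peak-area normalization and non-zero filter mentioned in this description can be sketched roughly as follows. This is only an illustration, not the authors' XCMS pipeline: the function names, the group labels, and the reading of the filter (keep a feature if more than 85% of its values are non-zero in at least one group) are assumptions.

```python
def normalize_to_total_area(peak_areas):
    """Scale one sample's peak areas so that they sum to 1."""
    total = sum(peak_areas)
    if total == 0:
        return [0.0 for _ in peak_areas]
    return [area / total for area in peak_areas]


def keep_feature(values_by_group, min_nonzero_fraction=0.85):
    """Assumed reading of the filter: keep a feature if more than
    min_nonzero_fraction of its values are non-zero in at least one group."""
    for values in values_by_group.values():
        if not values:
            continue
        nonzero = sum(1 for v in values if v != 0)
        if nonzero / len(values) > min_nonzero_fraction:
            return True
    return False


# Hypothetical sample with three peaks.
sample = [200.0, 300.0, 500.0]
normalized = normalize_to_total_area(sample)

# Hypothetical feature measured in two groups; neither group passes 85%.
feature = {"poisoning": [1.2, 0.0, 3.4, 2.2], "healthy": [0.0, 0.0, 0.1, 0.0]}
kept = keep_feature(feature)
```

    After normalization each sample's values sum to 1, so intensities are comparable across samples with different total signal.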

  11. Data from: ISDB: In Silico Spectral Databases of Natural Products

    • zenodo.org
    bin
    Updated Aug 27, 2023
    Cite
    Pierre-Marie Allard; Jonathan Bisson; Adriano Rutz (2023). ISDB: In Silico Spectral Databases of Natural Products [Dataset]. http://doi.org/10.5281/zenodo.7534250
    Dataset updated
    Aug 27, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pierre-Marie Allard; Jonathan Bisson; Adriano Rutz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    An In Silico spectral DataBase (ISDB) of natural products calculated from structures aggregated in the frame of the LOTUS Initiative (https://doi.org/10.7554/eLife.70780).
    Fragmented using cfm-predict 4 (https://doi.org/10.1021/acs.analchem.1c01465) .

    In silico spectral database preparation and use for dereplication initially described in Integration of Molecular Networking and In-Silico MS/MS Fragmentation for Natural Products Dereplication https://doi.org/10.1021/ACS.ANALCHEM.5B04804

    See https://github.com/mandelbrot-project/spectral_lib_builder for associated building scripts.

    See https://github.com/mandelbrot-project/spectral_lib_matcher for associated matching scripts.

  12. Metabolomics data for crude protein content in diets for Huangjiang...

    • scidb.cn
    Updated Apr 12, 2024
    Cite
    Md. Abul Kalam Azad (2024). Metabolomics data for crude protein content in diets for Huangjiang mini-pigs [Dataset]. http://doi.org/10.57760/sciencedb.17962
    Dataset updated
    Apr 12, 2024
    Dataset provided by
    Science Data Bank
    Authors
    Md. Abul Kalam Azad
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Huangjiangzhen
    Description

    The metabolite contents in the jejunum and ileum of Huanjiang mini-pigs were determined using a non-targeted metabolomics approach with UPLC-HDMS. The metabolomics procedures included sample preparation, metabolite separation and detection, data preprocessing, and statistical analysis. For metabolite identification, approximately 25 mg of each sample was weighed into a 2-mL EP tube, and 500 µL of extract solution (acetonitrile:methanol:water = 2:2:1 (v/v), with the isotopically labeled internal standard mixture) was added. After 30 s of vortexing, the mixed samples were homogenized at 35 Hz for 4 min and sonicated in an ice-water bath for 5 min. The homogenization and sonication cycles were repeated three times. Then the samples were incubated for 1 h at -40 °C and centrifuged at 12,000 × g for 15 min at 4 °C. The resulting supernatants were filtered through a 0.22-µm membrane and transferred to fresh glass vials for further analysis. The quality control (QC) sample was obtained by mixing an equal aliquot of the supernatants from all samples.

    An ultra-performance liquid chromatography (UPLC) system (Vanquish, Thermo Fisher Scientific, Waltham, MA, USA) with a UPLC BEH Amide column (2.1 × 100 mm, 1.7 µm) coupled with a Q Exactive HFX mass spectrometer (Orbitrap MS, Thermo Fisher Scientific, Waltham, MA, USA) was used to perform LC-MS/MS analyses. Mobile phase A contained 25 mmol/L ammonium acetate and 25 mmol/L ammonium hydroxide in water, and mobile phase B was acetonitrile. The injection volume was 3 µL, and the auto-sampler temperature was set at 4 °C. MS/MS spectra were acquired on the QE HFX mass spectrometer in information-dependent acquisition (IDA) mode under the control of the acquisition software (Xcalibur, Thermo Fisher Scientific, Waltham, MA, USA). In this mode, the acquisition software continuously evaluated the full-scan MS spectrum. The ESI source conditions were set as follows: sheath gas flow rate 30 Arb, aux gas flow rate 25 Arb, capillary temperature 350 °C, full MS resolution 60,000, MS/MS resolution 7500, collision energy 10/30/60 in NCE mode, and spray voltage 3.60 kV (positive ion mode) or -3.20 kV (negative ion mode).

    For peak detection, extraction, alignment, and integration, the raw data were converted into mzXML format by ProteoWizard and then processed with an in-house program, which was developed using R and based on XCMS. The metabolites were annotated using an in-house MS2 (secondary mass spectrometry) database (BiotreeDB v2.1). The value of the cutoff was 0.3. PCA and orthogonal partial least squares discriminant analysis (OPLS-DA) models were established with the SIMCA software v.16.0.2 (Sartorius Stedim Data Analytics AB, Umea, Sweden) to visualize the distinction and detect differential metabolites among different CP content groups. Moreover, the Kyoto Encyclopedia of Genes and Genomes (KEGG) and MetaboAnalyst 5.0 were used for pathway analysis.

  13. Data from: Leaf metabolic traits reveal hidden dimensions of plant form and...

    • data.niaid.nih.gov
    • datadryad.org
    • +1more
    zip
    Updated Jul 31, 2023
    Cite
    Tom Walker (2023). Leaf metabolic traits reveal hidden dimensions of plant form and function [Dataset]. http://doi.org/10.5061/dryad.zpc866tdn
    Dataset updated
    Jul 31, 2023
    Dataset provided by
    University of Neuchâtel
    Authors
    Tom Walker
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    In this study, we interpreted leaf metabolome variation among 457 tropical and 339 temperate plant species to understand how the metabolome contributes to macroecological variation in plant functioning. Metabolome data were generated using liquid chromatography mass spectrometry, annotated with compound names (where possible), and cross-referenced against chemoinformatics databases to derive metabolite chemical properties. We then compared variation in leaf metabolite chemical properties among species with variation in classical plant functional traits.

  14. Automated Label-free Quantification of Metabolites from Liquid...

    • data.niaid.nih.gov
    xml
    Updated Dec 16, 2015
    Cite
    Erhan Kenar (2015). Automated Label-free Quantification of Metabolites from Liquid Chromatography–Mass Spectrometry Data (Simulated) [Dataset]. https://data.niaid.nih.gov/resources?id=mtbls235
    Dataset updated
    Dec 16, 2015
    Dataset provided by
    Quantitative Biology Center Tübingen
    Authors
    Erhan Kenar
    Variables measured
    Metabolomics, simulated mass error, simulated detector noise, simulated error profile distortion
    Description

    Liquid chromatography coupled to mass spectrometry (LC-MS) has become a standard technology in metabolomics. In particular, label-free quantification based on LC-MS is easily amenable to large-scale studies and thus well suited to clinical metabolomics. Large-scale studies, however, require automated processing of the large and complex LC-MS datasets. We present a novel algorithm for the detection of mass traces and their aggregation into features (i.e. all signals caused by the same analyte species) that is computationally efficient and sensitive and that leads to reproducible quantification results. The algorithm is based on a sensitive detection of mass traces, which are then assembled into features based on mass-to-charge spacing, co-elution information, and a support vector machine–based classifier able to identify potential metabolite isotope patterns. The algorithm is not limited to metabolites but is applicable to a wide range of small molecules (e.g. lipidomics, peptidomics), as well as to other separation technologies. We assessed the algorithm's robustness with regard to varying noise levels on synthetic data and then validated the approach on experimental data investigating human plasma samples. We obtained excellent results in a fully automated data-processing pipeline with respect to both accuracy and reproducibility. Relative to state-of-the-art algorithms, ours demonstrated increased precision and recall. The algorithm is available as part of the open-source software package OpenMS and runs on all major operating systems.

    Simulated data is reported in the current study MTBLS235. Plasma data is reported in MTBLS234.

  15. Exposomics Spectral Library

    • zenodo.org
    • data.niaid.nih.gov
    Updated Apr 20, 2020
    Cite
    Biswapriya B. Misra (2020). Exposomics Spectral Library [Dataset]. http://doi.org/10.5281/zenodo.3755855
    Dataset updated
    Apr 20, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Biswapriya B. Misra
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Title: Repurposing Public Metabolomics Datasets for Construction of an Exposomics Spectral Library

    Introduction

    Publicly archived metabolomics datasets from diverse human biosamples provide an opportunity to repurpose the shared datasets for further exploratory analysis into human health. Although the endogenous metabolome is most often implicated in disease research as a source of biomarkers, the growing role of the exposome in human health underscores the need to identify chemical exposures in human samples. In this regard, I explored the possibility of finding previously unreported exposomal compounds (i.e., anthropogenic, industrial, dietary, and microbial chemicals) among the true unknowns in these studied datasets. Using in silico spectral library matching followed by molecular structure prediction approaches, the aim of this study is to recognize the exposome and minimize the gap in the potential number of true exposomic substances in biosamples.

    Methods

    Raw metabolomics (GC-MS) datasets were downloaded from Metabolomics Workbench, GNPS, and MetaboLights using the keywords ‘human, GC-MS, serum, plasma, muscle, liver, kidney’. The vendor-formatted mass spectrometry datasets were converted to .mzML format using MSConvertGUI (ProteoWizard) for data processing and spectral library (GOLM, MoNA, FiehnLib, MassBank) matching using MS-DIAL. For EI-MS spectral annotation, the identity was confirmed by the presence of [M−CH3]+, [M+H]+, [M+C2H5]+ and [M+C3H5]+ and by using Global Natural Products Social (GNPS) molecular networking. Exposomal metabolites were separated from the rest based on identifiers at the Blood Exposome DB. True unassigned spectra were further interrogated using MS-FINDER for structural prediction. Exported spectra in .msp and .txt formats were pooled into a single file for free public download and use.

    Preliminary data

    The pooled GC-MS datasets (50) obtained from the three repositories were from multiple human samples and multiple vendors, and were generated using multiple mass analyzers (single and triple quads, ToFs, and Orbitraps). The .mzML files underwent preprocessing (deconvolution, peak picking, and peak alignment) followed by compound identification using MS-DIAL and GNPS tools. Processing parameters for the datasets were optimized individually in a study-specific manner. Altogether, the data resulted in spectral assignment of approx. 400 compounds of endogenous origin, associated with a KEGG and HMDB identifier relating to generic metabolic pathways, using only open source spectral libraries. Given the extremely limited overlap between spectral libraries, I used a pooled spectral library generated from all available open source spectral data. Further, 350 unassigned spectra (displaying insufficient matching scores for an assignment, i.e., < 500; with S/N > 25 in each dataset) were interrogated using MS-FINDER and the Global Natural Products Social (GNPS) molecular networking approach (both cosine score, > 0.5; balance score, > 0.9), which resulted in the annotation of 250 exposomic compounds. Using ClassyFire, the exposomal compounds (InChIs) were assigned a hierarchical chemical classification, which indicated diverse origins of these compounds, ranging from medications, industrial chemicals, and pollutants to phytochemicals of dietary origin. The assigned spectra were individually manually curated and then compiled into a single file, the ‘Exposomics Spectral Library’, available to the public in .txt and .msp formats for free use at 10.5281/zenodo.3755855.
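
    The cosine-score threshold mentioned above can be illustrated with a plain cosine similarity between two unit-mass spectra. This is a minimal sketch, not the scoring used by GNPS or MS-FINDER (GNPS molecular networking uses its own cosine variant); the function name and the example spectra are hypothetical.

```python
import math


def cosine_score(spectrum_a, spectrum_b):
    """Plain cosine similarity between two spectra given as
    {integer m/z: intensity} dictionaries."""
    shared = set(spectrum_a) & set(spectrum_b)
    dot = sum(spectrum_a[mz] * spectrum_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spectrum_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spectrum_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical query and library spectra (m/z -> intensity).
query = {91: 100.0, 105: 40.0, 119: 10.0}
library = {91: 90.0, 105: 50.0, 133: 5.0}

score = cosine_score(query, library)
is_candidate = score > 0.5  # threshold quoted in the description
```

    Because the score depends only on relative intensities, scaling either spectrum by a constant leaves it unchanged.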

  16. GMD

    • scicrunch.org
    Updated Jun 23, 2025
    Cite
    (2025). GMD [Dataset]. http://identifiers.org/RRID:nif-0000-21180
    Dataset updated
    Jun 23, 2025
    Description

    It facilitates the search for and dissemination of mass spectra from biologically active metabolites quantified using gas chromatography (GC) coupled to mass spectrometry (MS). Use the Search Page to search for a compound of interest, using the name, mass, formula, InChI, etc. as query input. Additionally, a Library Search service enables the search of user-submitted mass spectra within the GMD. In parallel to the library search, a prediction of chemical sub-groups is performed. This approach has reached beta level and a publication is currently under review. Using several sub-group-specific Decision Trees (DTs), mass spectra are classified with respect to the presence of the chemical moieties within the linked (unknown) compound. Prediction of functional groups (MS analysis) facilitates the search for metabolites within the GMD by means of user-submitted GC-MS spectra consisting of retention index (n-alkanes, if available) and mass intensity ratios. In addition, functional group prediction helps to characterize metabolites for which no reference mass spectra are yet included in the GMD; the unknown metabolite is instead characterized by the predicted presence or absence of functional groups. For power users, this functionality is exposed as SOAP-based web services.

    Functional group prediction of compounds by means of GC-EI-MS spectra using Microsoft analysis service decision trees: all currently available trained decision trees and sub-structure predictions are provided by the GMD interface. A table describes the functional group, optional use of an RI system, record date of the trained decision tree, and number of MSTs with the proportion of MSTs linked to metabolites with the functional group present for each tree. The average and standard deviation of the 50-fold CV error, namely the ratio of falsely over correctly sorted MSTs in the trained DT, are listed.

    The GMD website offers a range of mass spectral reference libraries to academic users, which can be downloaded free of charge in various electronic formats. The libraries consist of base-peak-normalized consensus spectra of single analytes and contain masses in the range 70 to 600 amu, while the ubiquitous mass fragments typically generated from compounds carrying a trimethylsilyl moiety, namely the fragments at m/z 73, 74, 75, 147, 148, and 149, were excluded.
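
    The library preparation described above (mass range 70 to 600, trimethylsilyl fragments excluded, base-peak normalization) might look roughly like the sketch below. It is not GMD code: the function name is hypothetical, and normalizing after the exclusion step rather than before is an assumption, since the description does not state the order.

```python
# Ubiquitous trimethylsilyl fragments listed in the description.
TMS_FRAGMENTS = {73, 74, 75, 147, 148, 149}


def clean_spectrum(spectrum, low=70, high=600):
    """Filter a {integer m/z: intensity} spectrum to the given mass range,
    drop the TMS fragments, then scale so the base peak equals 1.0.
    Assumption: normalization is applied after filtering."""
    kept = {
        mz: intensity
        for mz, intensity in spectrum.items()
        if low <= mz <= high and mz not in TMS_FRAGMENTS
    }
    if not kept:
        return {}
    base_peak = max(kept.values())
    if base_peak <= 0:
        return {}
    return {mz: intensity / base_peak for mz, intensity in kept.items()}


# Hypothetical raw spectrum: m/z 73 is a TMS fragment, m/z 650 is out of range.
raw = {73: 1000.0, 100: 50.0, 200: 25.0, 650: 10.0}
cleaned = clean_spectrum(raw)
```

    Dropping the dominant m/z 73 and 147 signals before normalization keeps the remaining, more compound-specific fragments from being compressed toward zero.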

  17. YMDB - Yeast Metabolome Database

    • dknet.org
    • scicrunch.org
    • +1 more
    Updated Aug 12, 2024
    (2024). YMDB - Yeast Metabolome Database [Dataset]. http://identifiers.org/RRID:SCR_005890
    Dataset updated
    Aug 12, 2024
    Description

    A manually curated database of small molecule metabolites found in or produced by Saccharomyces cerevisiae (also known as baker's yeast and brewer's yeast). This database covers metabolites described in textbooks, scientific journals, metabolic reconstructions and other electronic databases. YMDB contains metabolites arising from normal S. cerevisiae metabolism under defined laboratory conditions as well as metabolites generated by S. cerevisiae when used in baking and in the production of wines, beers and spirits. YMDB currently contains 2027 small molecules with 857 associated enzymes and 138 associated transporters. Each small molecule has 48 data fields describing the metabolite, its chemical properties and links to spectral and chemical databases. Each enzyme/transporter is linked to its associated metabolites and has 30 data fields describing both the gene and corresponding protein. Users may search through the YMDB using a variety of database-specific tools. The simple text query supports general text queries of the textual component of the database. By selecting either metabolites or proteins in the "Search for" field, it is possible to restrict the search and the returned results to only those data associated with metabolites or with proteins. Clicking on the Browse button generates a tabular synopsis of YMDB's content. This browser view allows users to casually scroll through the database or re-sort its contents. Clicking on a given MetaboCard button brings up the full data content for the corresponding metabolite. A complete explanation of all the YMDB fields and sources is available. Under the Search link users will find a number of search options listed in a pull-down menu. The Chem Query option allows users to draw (using a MarvinSketch applet or a ChemSketch applet) or to type (as a SMILES string) a chemical compound and to search the YMDB for chemicals similar or identical to the query compound.
The Advanced Search option supports a more sophisticated text search of the text portion of YMDB. The Sequence Search button allows users to conduct BLASTP (protein) sequence searches of all sequences contained in YMDB. Both single and multiple sequence (i.e. whole proteome) BLAST queries are supported. YMDB also supports a Data Extractor option that allows specific data fields or combinations of data fields to be searched and/or extracted. Spectral searches of YMDB's reference compound NMR and MS spectral data are also supported through its MS, MS/MS, GC/MS and NMR Spectra Search links. Users may download YMDB's complete textual data, chemical structures and sequence data by clicking on the Download button.

  18. f

    Large-Scale Prediction of Collision Cross-Section Values for Metabolites in...

    • figshare.com
    zip
    Updated Nov 9, 2016
    Zhiwei Zhou; Xiaotao Shen; Jia Tu; Zheng-Jiang Zhu (2016). Large-Scale Prediction of Collision Cross-Section Values for Metabolites in Ion Mobility-Mass Spectrometry [Dataset]. http://doi.org/10.1021/acs.analchem.6b03091.s002
    Available download formats: zip
    Dataset updated
    Nov 9, 2016
    Dataset provided by
    ACS Publications
    Authors
    Zhiwei Zhou; Xiaotao Shen; Jia Tu; Zheng-Jiang Zhu
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The rapid development of metabolomics has significantly advanced health and disease related research. However, metabolite identification remains a major analytical challenge for untargeted metabolomics. While the use of collision cross-section (CCS) values obtained in ion mobility-mass spectrometry (IM-MS) effectively increases identification confidence of metabolites, it is restricted by the limited number of available CCS values for metabolites. Here, we demonstrated the use of a machine-learning algorithm called support vector regression (SVR) to develop a prediction method that utilized 14 common molecular descriptors to predict CCS values for metabolites. In this work, we first experimentally measured CCS values (ΩN2) of ∼400 metabolites in nitrogen buffer gas and used these values as training data to optimize the prediction method. The high prediction precision of this method was externally validated using an independent set of metabolites with a median relative error (MRE) of ∼3%, better than conventional theoretical calculation. Using the SVR based prediction method, a large-scale predicted CCS database was generated for 35 203 metabolites in the Human Metabolome Database (HMDB). For each metabolite, five different ion adducts in positive and negative modes were predicted, accounting for 176 015 CCS values in total. Finally, improved metabolite identification accuracy was demonstrated using real biological samples. Conclusively, our results proved that the SVR based prediction method can accurately predict nitrogen CCS values (ΩN2) of metabolites from molecular descriptors and effectively improve identification accuracy and efficiency in untargeted metabolomics. The predicted CCS database, namely, MetCCS, is freely available on the Internet.
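Two figures in the abstract above can be checked or illustrated with a few lines of Python. The sketch below is not the authors' code: it confirms that 35 203 metabolites with five adducts each give 176 015 predicted values, and it shows one common way to compute a median relative error. The function name and the example CCS values are hypothetical, and the abstract does not give the authors' exact definition of MRE, so the formula here is an assumption.

```python
from statistics import median

# Figures quoted in the abstract.
N_METABOLITES = 35203
ADDUCTS_PER_METABOLITE = 5
total_ccs_values = N_METABOLITES * ADDUCTS_PER_METABOLITE
print(total_ccs_values)  # 176015, matching the total stated in the abstract


def median_relative_error(measured, predicted):
    """Median of |predicted - measured| / measured, in percent.

    Assumption: this is a common definition of MRE; the abstract
    does not spell out the authors' exact formula.
    """
    if not measured or len(measured) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return median(abs(p - m) / m * 100.0 for m, p in zip(measured, predicted))


# Hypothetical measured and predicted CCS values, for illustration only.
measured_ccs = [150.0, 200.0, 250.0]
predicted_ccs = [153.0, 196.0, 255.0]
mre_percent = median_relative_error(measured_ccs, predicted_ccs)
print(mre_percent)
```

The median is less sensitive than the mean to a few badly predicted compounds, so it summarizes typical prediction quality rather than worst-case error.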

  19. n

    MassBank of North America

    • neuinfo.org
    Updated Oct 16, 2019
    (2019). MassBank of North America [Dataset]. http://identifiers.org/RRID:SCR_015536
    Dataset updated
    Oct 16, 2019
    Description

    Metadata-centric, auto-curating repository designed for storage and querying of mass spectral records. It contains metabolite mass spectra, metadata and associated compounds.

  20. R

    Untargeted metabolomics raw data

    • entrepot.recherche.data.gouv.fr
    tsv, zip
    Updated Feb 4, 2025
    Sarah Jane Cookson; Grégoire Loupit; Pierre Petriaq; Josep Valls-Fonayet (2025). Untargeted metabolomics raw data [Dataset]. http://doi.org/10.57745/GJRUQG
    Available download formats: zip (13646326467), tsv (6433)
    Dataset updated
    Feb 4, 2025
    Dataset provided by
    Recherche Data Gouv
    Authors
    Sarah Jane Cookson; Grégoire Loupit; Pierre Petriaq; Josep Valls-Fonayet
    License

    https://spdx.org/licenses/etalab-2.0.html

    Description

    Semi-polar compounds, including primary and secondary metabolites, were extracted using automated high-throughput ethanol extraction procedures at the MetaboHUB-Bordeaux Metabolome (https://metabolome.u-bordeaux.fr/) from 35 mg of fresh powder, following previously established protocols (Luna et al., 2020). All samples were randomised and injected alternately with extraction blanks (prepared without plant material and used to rule out potential contaminants detected by untargeted metabolomics) and 13 quality control samples that were prepared by mixing 10 µL from each sample. Quality control samples were injected every 8 runs and used to correct signal drift during the analytical batch and to calculate coefficients of variation for each metabolomic feature, so that only the most robust features are retained for chemometrics (Broadhurst et al., 2018). Untargeted analysis was performed on a UHPLC Vanquish (Thermo Fisher Scientific) coupled to a Q-Exactive Plus mass spectrometer (Thermo Fisher Scientific). One µL of sample was injected on a Phenomenex Luna® Omega Polar C18 column (50 x 2.1 mm, 1.6 µm) at 40°C, and a gradient of solvent A (milliQ water – 0.1% formic acid) and solvent B (acetonitrile – 0.1% formic acid) with a flow of 0.5 mL min⁻¹ was used. The gradient elution was set as follows: 0-11.5 min: 1-40% solvent B; 11.5-12.5 min: 40-95% solvent B; 12.5-14 min: 95% solvent B; 14.5-16 min: 1% solvent B. The mass spectrometry data were acquired in negative polarity at 140,000 FWHM resolution with an automatic gain control target of 3e6 and a maximum IT of 100 ms. The source conditions were as follows: Spray voltage: 3000 V; Sheath gas: 45 a.u.; Auxiliary gas: 15 a.u.; Capillary temperature: 320°C; Probe heater temperature: 250°C; S-lens RF level: 100. The experiments were run in full scan (mass range: 70-1050 m/z) – data-dependent MS2 mode with the top three precursors and normalized collision energies of 15, 30, 45, using a dynamic exclusion of 5 s.
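The quality control step described above, in which a coefficient of variation (CV) is computed per feature across the QC injections and only robust features are kept, might look roughly like this. It is a minimal sketch, not the authors' pipeline: the function names, the example intensities, and the 30% CV cutoff are assumptions, since the description does not state the threshold used.

```python
from statistics import mean, stdev


def qc_cv(intensities):
    """Coefficient of variation (sample standard deviation / mean) of QC intensities."""
    average = mean(intensities)
    if average == 0:
        return float("inf")
    return stdev(intensities) / average


def robust_features(qc_table, max_cv=0.30):
    """Return names of features whose QC CV is at or below `max_cv`.

    Assumption: the 0.30 cutoff is illustrative; the description
    does not give the threshold the authors applied.
    """
    return [
        name
        for name, values in qc_table.items()
        if len(values) >= 2 and qc_cv(values) <= max_cv
    ]


# Hypothetical QC intensities for two features across three QC injections.
qc_table = {
    "feature_A": [100.0, 102.0, 98.0],   # stable across QC injections
    "feature_B": [100.0, 300.0, 20.0],   # unstable across QC injections
}
kept = robust_features(qc_table)
print(kept)
```

Because every QC sample is the same pooled mixture, variation between QC injections reflects analytical noise rather than biology, which is why a high CV is grounds for discarding a feature.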

(2025). METLIN [Dataset]. http://identifiers.org/RRID:SCR_010500/resolver?q=*&i=rrid

METLIN

RRID:SCR_010500, nlx_158116, METLIN (RRID:SCR_010500), METLIN, Metabolite and Tandem MS Database (METLIN), METLIN Metabolite Database, Metabolite and Tandem MS Database

11 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Jun 17, 2025
