A mass spectral database for organic compounds. The spectra included in the database are: electron impact mass spectrum (EI-MS), Fourier transform infrared spectrum (FT-IR), 1H nuclear magnetic resonance (NMR) spectrum, 13C NMR spectrum, laser Raman spectrum, and electron spin resonance (ESR) spectrum.
This data set contains the spectral data associated with the collection of EC-SERS spectra using mainly a nontargeted drug identification approach, with several samples using a targeted fentanyl identification approach. The data set contains the replicate measurements and averaged Raman spectra used in the characterization of the analytes (drugs of abuse and adulterant compounds) to allow for forensic library formation. The data set also contains spectra of analytes collected at varying concentrations and additional fentanyl analog data collected using a targeted method.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raman spectroscopy is a rapid, non-invasive, and non-destructive method that offers high chemical specificity for different biological materials and low sensitivity to water. This makes it well suited to natural medicines, as it provides a relatively objective and comprehensive characterization of their complex material basis. Raman spectroscopy therefore plays a crucial role in the identification of medicinal properties, authentication, and quality control of traditional Chinese medicines (TCMs). At present, TCMRSD is the only downloadable, comprehensive Raman spectral database for TCMs; it contains spectra of 327 Chinese medicines collected through rigorous methodological validation. The TCMs were selected for diversity of medicines, medicinal importance, and variety of medicinal properties, in order to guarantee a comprehensive and representative range of the substances used in Chinese medicine.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The provided dataset includes information on edible oil samples collected from grocery stores in and around Newark, Delaware, between the summer of 2014 and the spring of 2016. A total of 100 oil bottles were obtained, and three different data sets were created.
Data Set 1 contains measurements from all 100 samples and was obtained using NIR, MIR, and Raman spectroscopic techniques. The peroxide values (PVs) of the samples were determined through titration at Lawrence Livermore National Laboratory. Data Set 1 is divided into two subgroups: Data Set 1A, measured in 2016, and Data Set 1B, measured in 2019.
Data Set 2 is a subset of Data Set 1, consisting of 53 oil samples. These samples were measured using Raman spectroscopy and titrated to determine the PV at the University of Delaware.
Data Set 3 is another subset of Data Set 1, comprising 356 IR spectra of 20 varieties of edible oils as well as 120 spectra of extra virgin olive oil (EVOO) adulterated with corn oil, canola oil, or almond oil. These samples were measured using ATR-FTIR spectroscopy at 4 cm^-1 resolution at Oklahoma State University.
The measurement techniques and parameters varied for each data set. NIR spectra were acquired using FTIR spectrometers with different optical path lengths, MIR spectra were collected using a liquid nitrogen-cooled mercury cadmium telluride (MCT) detector, and Raman spectra were obtained with different Raman probes and lasers. The spectroscopic measurements were complemented with titration measurements to determine the PVs.
The dataset is provided as individual CSV files, one for each type of spectroscopy. In each file, the first two columns give the class and the corresponding peroxide value of each spectrum, and the top row gives the wavelength range of the spectra.
Note: In a few instances, replicates were not taken, or certain samples were replaced with NaN values to maintain the proper matrix dimensions.
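The file layout described above can be read with a few lines of Python. The following is a minimal sketch under stated assumptions: the inline sample text, its values, the header labels, and the function name `load_spectra` are all hypothetical, and the real files may use different labels or delimiters.

```python
import csv
import io
import math

# Hypothetical miniature file in the described layout: the top row holds the
# spectral axis, the first two columns hold the class and the peroxide value.
SAMPLE = """class,PV,1000.0,1002.0,1004.0
olive,4.2,0.11,0.12,0.13
corn,NaN,0.21,NaN,0.23
"""


def load_spectra(handle):
    """Return (axis, records) parsed from a CSV in the assumed layout."""
    reader = csv.reader(handle)
    header = next(reader)
    axis = [float(x) for x in header[2:]]  # spectral axis from the top row
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        records.append({
            "class": row[0],
            "pv": float(row[1]),  # the string "NaN" parses to float nan
            "intensities": [float(x) for x in row[2:]],
        })
    return axis, records


axis, records = load_spectra(io.StringIO(SAMPLE))

# Keep only the spectra whose peroxide value is present.
with_pv = [r for r in records if not math.isnan(r["pv"])]
print(len(axis), len(records), len(with_pv))
```

Filtering on `math.isnan` rather than dropping rows while parsing keeps the matrix dimensions intact, which appears to be why the NaN placeholders exist in the first place.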
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Please note that there is no peer-reviewed publication associated with this data record.
This fileset consists of 13 data files, 1 code file and 2 ReadMe files.
The dataset data.mat is in .mat file format and therefore not openly accessible. The following datasets, all in .txt file format, are an openly accessible version of the .mat file: Fig2_1.txt, Fig2_2.txt, Fig2_3.txt, Fig2_4.txt, Fig2_5.txt, Fig2_6.txt, raw_COVID.txt, raw_Helthy.txt, raw_Suspected.txt, raw_Tube.txt, table2_data.txt and wave_number.txt.
The code file is code.m, in .m file format. The two ReadMe files are readme.txt, in .txt file format, and readme.m, in .m file format.
Data in Fig2_1.txt, Fig2_2.txt, Fig2_3.txt, Fig2_4.txt, Fig2_5.txt and Fig2_6.txt were used to plot Figure 2 in the related manuscript.
raw_COVID.txt contains the raw Raman spectroscopy data from the serum samples obtained from the 53 confirmed COVID-19 patients.
raw_Helthy.txt contains the raw Raman spectroscopy data from the serum samples obtained from healthy individuals.
raw_Suspected.txt contains the raw Raman spectroscopy data from the serum samples obtained from suspected cases (individuals suspected of COVID-19 infection).
raw_Tube.txt contains the raw spectral data from cryopreservation tubes with saline solution inside.
wave_number.txt contains the Raman shift values of the spectra.
table2_data.txt was used to generate Table 2 in the related manuscript.
The code code.m was used for data processing.
Software needed to access data: data.mat can only be accessed using the Matlab software. Running the code code.m also requires Matlab.
Study aims and methodology: The recommended diagnosis method for the coronavirus disease (COVID-19) is a qPCR-based technique; however, it is a time-consuming, expensive, and sample-dependent procedure with a relatively high false negative ratio. The aim of this study was to develop a widely available, cheap and quick method to diagnose COVID-19 based on Raman spectroscopy.
A total of 157 serum samples were collected from 53 confirmed patients, 54 suspected cases (fever but not COVID-19) and 50 healthy controls. Raman spectroscopy was used to analyse these samples, and the support vector machine (SVM) machine learning method was applied to the spectral dataset to build a diagnostic algorithm.
The experimental setup consisted of a Volume Phase Holographic (VPH) spectrograph, a deep-cooled CCD camera, and a Raman probe and laser. A total of 2355 spectra from 157 individuals were imported into MATLAB (R2013a) software (MathWorks, Inc.).
For more details on the methodology, please read the related article.
https://spdx.org/licenses/CC0-1.0.html
Biosignature detection is one of the most important goals of Mars missions. Since the Curiosity mission, the laser-induced breakdown spectrometer (LIBS) has become an essential payload owing to its convenience and versatility in profiling elemental chemistry. To test whether LIBS alone could filter potential biosignatures, a clastic quartz stone collected from a Mars analog setting, the western Qaidam Basin, was selected for LIBS analysis. Raman spectroscopy was used as an indicator of organic signals to support the presence of potential hypolithic communities and the dearth of epilithic biomass on the rock. A total of 344 LIBS spectra were acquired and statistically analyzed using principal component analysis (PCA). Our results indicate that, with a sufficient sample size, PCA can partially differentiate biotic and abiotic signals based on LIBS measurements. This finding is significant because it indicates that multivariate analysis of LIBS data can be useful for filtering biosignatures in Mars exploration.
Methods
Located on the Tibetan Plateau, “the roof of the world”, the western Qaidam Basin is a cold, dry, and strongly irradiated environment with landforms (e.g., dunes, yardangs, playas, wind streaks, polygonal terrains, and gullies) commonly found on Mars. A clastic quartz stone was sampled from a Cenozoic gravel deposit (38°35′44″ N, 90°59′6″ E, 3245.17 m altitude) in the hyperarid Dalangtan Playa, western Qaidam Basin, on 29 July 2021. The Cenozoic gravel deposit was likely derived from the weathering of Mesozoic (Pre-Jurassic and Jurassic) rocks, and quartz stones were common in the deposit. A light greenish color was visible at the bottom of the quartz stone.
Multiple spots along four vertical lines (11 spots for line 1 Qz-l1, 6 spots for line 2 Qz-l2, 9 spots for line 3 Qz-l3, and 8 spots for line 4 Qz-l4) of the Qaidam quartz stone were selected to stereoscopically investigate the spatial distribution of Raman spectra-based mineralogical or organic/biotic signals. An alpha 300R confocal Raman imaging system (WITec, Germany) equipped with a 50x objective lens (numerical aperture = 0.55) and a 532 nm excitation laser source was used for Raman spectral measurements. The laser wavelength was corrected using the Raman peak of a Si wafer. All spectra were acquired in a spectral range of 0-4000 cm−1 with a spectral resolution of 4.8 cm−1. To retain the resolution of both minerals and organic matter as much as possible, laser power was kept at 3.1 mW for an integration time of 3 s with 30 accumulations. To understand the elemental compositions and spectral features of chosen samples, the SciAps Z-300 Handheld LIBS Analyzer (SciAps Inc, Woburn, MA, USA) was employed for LIBS analysis (excitation source: 5-6 mJ·pulse−1, 50 Hz repetition rate, 1064 nm laser source, argon purge). The Z-300 LIBS Analyzer measured the signal intensity every 0.0333 nm from 200 to 900 nm. The LIBS Analyzer was equilibrated with an internal standard prior to determining the peak patterns of the respective target samples. LIBS was employed to construct a pseudo-three-dimensional geochemical profile of spots on the same four vertical lines used for the Raman spectroscopic measurements. The LIBS spectrum of each spot was generated by the LIBS Analyzer in quadruplicate.
In addition, 140 spots from 9 oval outlines of the top, side, and bottom faces of the Qaidam quartz were measured using LIBS: 8 spots of inner circle, 16 spots of middle circle, and 15 spots of outer circle on the top face; 16 spots of upper circle, 23 spots of middle circle, and 23 spots of lower circle on the side face; and 8 spots of inner circle, 12 spots of middle circle, and 19 spots of outer circle on the bottom face. Moreover, 68 random spots (20 from the top, 28 from the side, and 20 from the bottom) of the Qaidam quartz were chosen for singlicate LIBS measurements and the downstream statistical analysis.
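The spot counts given above can be tallied against the 344 LIBS spectra reported in the summary. The sketch below is only a bookkeeping check, and it rests on one assumption that the description does not state: each of the 140 oval-outline spots contributes a single spectrum. Under that assumption the totals agree.

```python
# Spot counts as given in the description.
line_spots = {"Qz-l1": 11, "Qz-l2": 6, "Qz-l3": 9, "Qz-l4": 8}
oval_spots = {
    "top": [8, 16, 15],      # inner, middle, outer circle
    "side": [16, 23, 23],    # upper, middle, lower circle
    "bottom": [8, 12, 19],   # inner, middle, outer circle
}
random_spots = {"top": 20, "side": 28, "bottom": 20}

REPLICATES_PER_LINE_SPOT = 4  # "in quadruplicate"
REPLICATES_PER_OVAL_SPOT = 1  # assumption: not stated in the description

line_spectra = sum(line_spots.values()) * REPLICATES_PER_LINE_SPOT
oval_total = sum(sum(counts) for counts in oval_spots.values())
oval_spectra = oval_total * REPLICATES_PER_OVAL_SPOT
random_spectra = sum(random_spots.values())  # singlicate measurements

total = line_spectra + oval_spectra + random_spectra
print(line_spectra, oval_spectra, random_spectra, total)
```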
MIT License https://opensource.org/licenses/MIT
The RamanLab database system consists of two primary pickle files that store comprehensive mineral spectroscopic data. The main database file RamanLab_Database_20250602.pkl contains the core spectral library with reference Raman spectra for hundreds of minerals, including their wavenumber positions, relative intensities, and associated metadata such as chemical formulas, crystal systems, and space groups. This database serves as the primary reference for the correlation-based search and match functionality, enabling identification of unknown minerals through spectral comparison algorithms. The database is structured to support both individual mineral identification and complex mixed-mineral analysis workflows.
The complementary mineral_modes.pkl file focuses specifically on vibrational mode assignments and implements the complete Hey-Celestian classification system with all 15 mineral groups, including Sheet Silicates, Simple Oxides, Octahedral Framework minerals, various Silicate chains (Single and Double), Ring Silicates, Complex Oxides, Hydroxides, and Mixed Modes. This database provides detailed vibrational mode information for each mineral, including fundamental frequencies, overtones, combination bands, and their structural origins. The classification system includes chemical constraints and scoring mechanisms that provide 2.0x boosts when sample chemistry matches expected mineral compositions, enabling more accurate phase identification in complex samples. Together, these databases form an integrated system that supports both spectral matching and crystallographic interpretation of Raman spectroscopic data.
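To illustrate what a correlation-based search with a chemistry boost might look like, here is a minimal sketch. It does not reproduce RamanLab's code or the internal layout of its pickle files: the tiny in-memory `LIBRARY`, the function names, the use of Pearson correlation, and the rule for when the 2.0x boost applies are all assumptions made for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length intensity lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat spectrum carries no shape information
    return cov / math.sqrt(var_a * var_b)


# Hypothetical reference spectra, already resampled onto one shared grid.
LIBRARY = {
    "quartz": {"elements": {"Si", "O"},
               "intensities": [0.0, 0.1, 1.0, 0.2, 0.0, 0.1]},
    "calcite": {"elements": {"Ca", "C", "O"},
                "intensities": [0.1, 0.0, 0.1, 0.2, 1.0, 0.1]},
}

CHEMISTRY_BOOST = 2.0  # the boost factor quoted in the description


def rank_matches(query, sample_elements=None):
    """Rank library entries by correlation, boosted on a chemistry match."""
    scored = []
    for name, ref in LIBRARY.items():
        score = pearson(query, ref["intensities"])
        # Assumed rule: boost only positive scores whose expected elements
        # are all present in the measured sample chemistry.
        chemistry_ok = (sample_elements is not None
                        and ref["elements"] <= sample_elements)
        if chemistry_ok and score > 0:
            score *= CHEMISTRY_BOOST
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


unknown = [0.0, 0.15, 0.9, 0.25, 0.05, 0.1]
ranking = rank_matches(unknown, sample_elements={"Si", "O"})
print(ranking[0][0])
```

A boosted score can exceed 1.0, so in this sketch it is only meaningful as a ranking key, not as a correlation coefficient.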
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Data contain Raman spectra, calibration and data analysis results for all experiments conducted in the manuscript.
Included in this data release are eight files: one metadata file, six comma-separated value (.csv) data files and one data dictionary file defining the entities and attributes that constitute this dataset. The six data files support the publication, Deep syntectonic burial of the Anthracite belt, Eastern Pennsylvania, by providing in csv format the data from Figure 11 – Representative Raman spectra for selected individual CH4 ± CO2 fluid inclusions, Figure 13 – Representative Raman spectra pairs for selected individual High-ThA fluid inclusions, Figure 15 – Representative Raman spectra pairs for selected individual Low-ThA fluid inclusions, Table 1 – Comparison of composition by Raman and microthermometry for single-phase inclusions, Table 2 – Comparison of inclusion density by Raman and microthermometry for single-phase inclusions, and Table 3 – Two-phase inclusion vapor bubble composition determined by Raman spectroscopy. These files contain Raman data from fluid inclusions contained in rocks from the Anthracite belt region, Pennsylvania.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
A Raman spectral dataset comprising 3,510 spectra from 32 chemical substances. This dataset includes organic solvents and reagents commonly used in API development, along with information regarding the products in the XLSX, and code to visualise and perform technical validation on the data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
X-ray diffraction (XRD) pattern of the iron substrate sample after electrodeposition.
X-ray photoelectron spectroscopy (XPS) of the synthetic mackinawite deposit for (a) Fe 2p spectrum and (b) S 2p spectrum, confirming the deposited sulfide is mackinawite and not stoichiometric pyrite.
Raman spectrum of iron pyrite nanoparticles synthesized by the hot injection method at 200 °C.
https://www.gnu.org/licenses/gpl-3.0-standalone.html
Spectra of three over-the-counter pharmaceuticals—acetylsalicylic acid, paracetamol, and ibuprofen—were collected at the Faculty of Exact and Natural Sciences of the National University of Asuncion, with the aim of creating a dataset that serves as a reference for Raman responses from different drug manufacturers. This dataset will also provide the scientific community with data that can be used for multivariate analysis and model training.
In the data collection phase, spectra were obtained using a Raman spectroscopy system (iRaman 785s model from BWTEK) equipped with a 785 nm excitation laser. Samples were collected from diverse sales points such as pharmacies, shopping centers, and street vendors. Each spectrum was captured at 50% laser power with a measurement time of 1 second and an accumulation of 10 spectra over a range of 150 to 3200 cm-1. This method preserved the integrity of the raw data, which includes a common column for Raman shifts and additional columns for intensities and labels, detailing the activation modes in the Raman spectrum.
The data is structured into specific xlsx files for each drug, such as "Paracetamol.xlsx", "acetylsalicylic-acid .xlsx", and "Ibuprofen .xlsx", each containing 50 spectra categorized by the type of pharmaceutical but not by brand. Brand-specific categorization is detailed in separate files like "Paracetamol-trademark .xlsx", where samples are classified using codes such as "Par-A" for different brands. This organization aids the scientific community in using clustering methods to analyze the spectral data and differentiate pharmaceutical brands based on their excipients or binders, with consistent codes across different drugs suggesting common manufacturers for various medications.
The minerals used in this study were supplied by the Australian Museum (ASM). The minerals have been characterized by both X-ray diffraction (XRD) and by chemical analysis using ICP-AES (inductively coupled plasma atomic emission spectroscopy) techniques.
The following samples were used: (a) sample ASM-D49056 boléite from the Amelia Mine, Santa Rosalia, Baja California, Mexico; (b) sample ASM-D 27575 cumengéite, Boleo, Baja California, Mexico; (c) sample ASM D36845 diaboléite from Mammoth mine, Tiger, Arizona, USA; and (d) sample ASM D191881 phosgenite from Consols mine, Broken Hill, New South Wales, Australia.
Crystals of the minerals were placed and orientated on a polished metal surface on the stage of an Olympus BHSM microscope, which is equipped with 10 × and 50 × objectives. The microscope is part of a Renishaw 1000 Raman microscope system, which also includes a monochromator, a filter system and a Charge Coupled Device (CCD). Raman spectra were excited by a Spectra-Physics model 127 He-Ne laser (633 nm) at a resolution of 2 cm−1 in the range between 100 and 4000 cm−1. Repeated acquisition using the highest magnification was accumulated to improve the signal to noise ratio in the spectra. Spectra were calibrated using the 520.5 cm−1 line of a silicon wafer.
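The silicon-line calibration mentioned above can be sketched as a simple axis correction. The description does not say how the correction was applied, so the single constant offset, the crude maximum-based peak estimate, the function names, and the sample numbers below are all assumptions for illustration.

```python
SI_REFERENCE = 520.5  # cm^-1, the silicon line named in the text


def peak_position(shifts, intensities):
    """Return the shift of the most intense point (a crude peak estimate)."""
    idx = max(range(len(intensities)), key=intensities.__getitem__)
    return shifts[idx]


def calibrate(shifts, si_shifts, si_intensities):
    """Shift an axis by the offset between the measured and reference Si line."""
    offset = SI_REFERENCE - peak_position(si_shifts, si_intensities)
    return [s + offset for s in shifts]


# Hypothetical silicon measurement whose peak reads 1.5 cm^-1 too high.
si_shifts = [518.0, 519.0, 520.0, 521.0, 522.0, 523.0, 524.0]
si_intensities = [2.0, 5.0, 20.0, 60.0, 100.0, 55.0, 18.0]

sample_shifts = [100.0, 1000.0, 3500.0]
corrected = calibrate(sample_shifts, si_shifts, si_intensities)
print(corrected)
```

A real workflow would usually fit the silicon band rather than take the maximum point, since the maximum is limited by the sampling interval of the axis.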
Infrared (IR) spectra were obtained using a Nicolet Nexus 870 FTIR spectrometer with a smart endurance single bounce diamond ATR cell. Spectra over the 4000 to 525 cm−1 range were obtained by the co-addition of 64 scans with a resolution of 4 cm−1 and a mirror velocity of 0.6329 cm/s.
Spectroscopic manipulation such as baseline adjustment, smoothing and normalization was performed using the Spectracalc software package GRAMS (Galactic Industries Corporation, New Hampshire, USA). Band component analysis was undertaken using the Jandel ‘Peakfit’ software package, which enables the type of fitting function to be selected and allows specific parameters to be fixed or varied accordingly. Band fitting was done using a Gauss-Lorentz cross-product function with the minimum number of component bands used for the fitting process. The Gauss-Lorentz ratio was maintained at values >0.7 and fitting was undertaken until reproducible results were obtained with squared correlations of r² >0.995.
Figure 1 shows the Raman spectra of the hydroxyl-stretching region of (a) phosgenite, (b) boléite, (c) diaboléite and (d) cumengéite. Figure 2 shows band component analysis of the hydroxyl-stretching region of the Raman spectrum of (a) diaboléite and (b) cumengéite. Figure 3 shows the Raman spectra of the 600–1000 cm−1 region of (a) boléite, (b) diaboléite and (c) cumengéite. Figure 4 shows the Raman spectra of the carbonate region of phosgenite. Figure 5 shows the Raman spectra of the 100–500 cm−1 region of (a) phosgenite, (b) boléite, (c) diaboléite and (d) cumengéite.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This data is presented in the publication "Hyperspectral image analysis for CARS, SRS, and Raman data", Journal of Raman Spectroscopy (2015) (http://dx.doi.org/10.1002/jrs.4729). It contains coherent anti-Stokes Raman scattering (CARS) hyperspectral images and their analysis in terms of the concentrations and spectra of chemical components, using the hyperspectral image analysis (HIA) software that we developed. Additional data are shown to exemplify the ability of HIA to filter motion artefacts.
This data release supports the paper titled, "Insights into the metamorphic history and origin of flake graphite mineralization at the Graphite Creek graphite deposit, Seward Peninsula, Alaska, USA", published in the journal Mineralium Deposita. The data release includes zircon and titanite U-Pb-Th isotope and age data, monazite U-Pb-Th isotope, trace element and age data, carbon and sulfur stable isotope data, and graphite Raman spectroscopy data, from samples collected at the Graphite Creek deposit, Alaska. Sample location information and descriptions are in table "sample_descriptions.csv". The raw numerical data are presented in tabular format. Additionally, plots of select zircon data – conventional discordia, U/Th vs age, U (ppm) vs age, and percent discordance vs age – are included in PDF format (zircon_plots.pdf), along with scanning electron microscope cathodoluminescence images of zircon in TIFF format. Folder "zircon_data" contains all zircon numerical data, plots, and images. Table "raman_peak_fits.csv" contains all raw Raman spectra peak fit data. Table "monazite_isotopic_data.csv" contains all monazite data. Table "carbon_sulfur_isotopic_data.csv" contains all carbon and sulfur data. Interpretations of the data are presented in the aforementioned journal paper.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Establishment of the nanoplastic dataset. The spectra included in the nanoplastic database were obtained directly from the plastic samples. To establish the internal Raman spectral dataset, a total of 1,000 individual nanoparticles were examined, encompassing five common plastic contaminants, namely polyethylene (PE), polytetrafluoroethylene (PTFE), polystyrene (PS), polymethyl methacrylate (PMMA) and polyvinyl chloride (PVC). For each specific plastic category, 200 nanoparticles were selected for subsequent analysis.
Content
In each txt file corresponding to a Raman spectrum, the first two columns are the X and Y coordinates, respectively: the X coordinate is the wavenumber and the Y coordinate is the Raman signal intensity.
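A two-column spectrum file of this kind can be read with the Python standard library alone. The sketch below is written under assumptions: the column separator is taken to be whitespace (tabs or spaces), any non-numeric line is treated as a header and skipped, and the sample text and the name `read_spectrum` are hypothetical.

```python
import io

# Hypothetical file content: wavenumber and intensity separated by a tab.
SAMPLE_TXT = """200.0\t12.5
201.5\t13.1
203.0\t40.2
"""


def read_spectrum(handle):
    """Return (wavenumbers, intensities) from a two-column text stream."""
    wavenumbers, intensities = [], []
    for line in handle:
        parts = line.split()  # splits on any run of whitespace
        if len(parts) < 2:
            continue  # blank or incomplete line
        try:
            x, y = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # non-numeric line, e.g. a header
        wavenumbers.append(x)
        intensities.append(y)
    return wavenumbers, intensities


wavenumbers, intensities = read_spectrum(io.StringIO(SAMPLE_TXT))
print(len(wavenumbers), max(intensities))
```

For a file on disk, the same function accepts the handle returned by `open(path)`.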
More data are available upon request for research purposes only. Please send an email to zhanglw@fudan.edu.cn with a brief description of the purpose of use and your request for more data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Raman spectra of graphene oxide with different degrees of oxidation were measured at 20 °C with a Horiba Scientific LabRAM HR Evolution spectrometer, using a laser wavelength of 514.5 nm, in the range from 700 to 3500 cm-1.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Datasets Mineral and Organic for replicating the paper "Raman spectrum matching with contrastive learning".
For detailed instructions on how to use the datasets, please visit our GitHub repository.
To use these two datasets, please cite:
@inproceedings{Lafuente2016ThePO, title={The power of databases: The RRUFF project}, author={B. Lafuente and R. Downs and Hexiong Yang and N. Stone}, booktitle = {Highlights in Mineralogical Crystallography}, year={2016} }
@article{organic_dataset, author = {Zhang, Rui and Xie, Huimin and Cai, Shuning and Hu, Yong and Liu, Guo-kun and Hong, Wenjing and Tian, Zhong-qun}, title = {Transfer-learning-based Raman spectra identification}, journal = {Journal of Raman Spectroscopy}, volume = {51}, number = {1}, pages = {176-186}, keywords = {deep learning, Raman spectroscopy, transfer learning}, year = {2020} }
@Article{D2AN00403H, author ="Li, Bo and Schmidt, Mikkel N. and Alstrøm, Tommy S.", title ="Raman spectrum matching with contrastive representation learning", journal ="Analyst", year ="2022", volume ="147", issue ="10", pages ="2238-2246", publisher ="The Royal Society of Chemistry", doi ="https://doi.org/10.1039/d2an00403h", url ="http://dx.doi.org/10.1039/D2AN00403H", }
In biochemical systems, enzymes catalyze the endergonic phosphorylation of adenosine diphosphate (ADP) to adenosine triphosphate (ATP) by different pathways, e.g., oxidative phosphorylation catalyzed by membrane-bound ATP synthase or substrate-level phosphorylation. The stored energy is released by the enzymatically controlled exergonic hydrolysis of ATP to power other vital endergonic reactions; therefore, ATP is widely known as the universal energy currency. Rapid abiotic ATP hydrolysis kinetics thus means higher maintenance energy costs for cells, and it has been suggested that this is an important factor in setting the limits to the functioning of living organisms (Bains et al. 2015). Evaluating the running conditions of the in-situ procedure by Moeller et al. (2022) using Raman spectroscopy opened up an efficient way of obtaining further insights into the effects of P-T and ionic composition on the kinetics of ATP-ADP hydrolysis. Raman spectroscopy can be combined with a hydrothermal diamond anvil cell, which provides an isochoric system for measurements up to pressures of 2000 MPa. Another system for in-situ Raman spectroscopy at elevated pressures and temperatures is based on an autoclave fitted with optical high-pressure windows, as shown by Louvel et al. (2015), and works up to 200 MPa. In this system, pressure and temperature can be controlled independently, so that isobaric temperature series are possible. This data publication comprises all Raman spectra of N2H2ATP solutions measured in situ at 80, 100 and 120 °C and up to 1666 MPa to determine the rate constants of the hydrolysis of adenosine triphosphate (ATP) to adenosine diphosphate (ADP) at 48 different P-T conditions. Furthermore, an assignment of peaks in the fitted range, the initial fit parameters and the fit results are provided. Besides the kinetic data, the pH of the ATP solutions was calculated at experimental temperature and pressure conditions.
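As an illustration of how a rate constant can be extracted from a time series at one P-T condition, the sketch below fits a first-order decay, c(t) = c0 · exp(−k·t), by linear least squares on ln c versus t. The first-order rate law is an assumption made here, not something stated in the description, and the time points, the value of `TRUE_K`, and the function name are hypothetical. In practice the relative ATP concentration would presumably be derived from fitted Raman peak areas.

```python
import math


def fit_first_order(times, concentrations):
    """Least-squares line through ln(c) versus t; returns (k, c0)."""
    ys = [math.log(c) for c in concentrations]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)  # k is the negative slope


# Synthetic, noise-free data generated from a known rate constant.
TRUE_K = 0.002  # 1/s, hypothetical value
times = [0.0, 300.0, 600.0, 900.0, 1200.0]  # seconds
conc = [1.0 * math.exp(-TRUE_K * t) for t in times]

k, c0 = fit_first_order(times, conc)
half_life = math.log(2) / k
print(k, c0, half_life)
```

Repeating such a fit for each pressure and temperature would give one rate constant per condition, which is the kind of table the 48 P-T conditions imply.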
The traditional definitive diagnosis of brain tumors is performed by needle biopsy under the guidance of imaging-based exams. This paradigm relies on the experience of radiologists, and accuracy can be affected by uncertainty in imaging interpretation and needle placement. Raman spectroscopy has the potential to improve needle biopsy by providing fingerprints of different materials and performing in situ tissue identification. In this paper, we present the development of a supervised machine learning algorithm using random forest to distinguish the Raman spectra of different types of tissue. An integral process from raw data collection and preprocessing to model training and evaluation is presented. To illustrate the feasibility of this approach, viable animal tissues were used, including ectocinerea (grey matter), alba (white matter) and blood vessels. Raman spectra were acquired using a custom-built Raman spectrometer. The hyperparamet...