Tidal Datum GIS outputs
Shapefiles are provided that present the approximate shore-parallel extent of tidal datums across coastal Massachusetts. These shapefiles are provided for 2030, 2050, and 2070 sea level rise scenarios. Individual shapefiles are provided for the north and south model domains for a total of 6 tidal datum shapefiles (2 model domains, 3 sea level rise scenarios). The results presented within these polygons are based upon tidal model simulations conducted using the MC-FRM, with north shapefiles created using the north model domain, and south using the south model domain. Separate polygons (zones) are provided for approximate location where MHW values vary to the nearest 0.1 ft interval. These zones are derived based on the variation in the MHW datum, and as such other datums (MHHW, MTL, MLW, and MLLW) may vary within each segmented polygon, especially in areas of varied bathymetry. Data are presented in units of feet relative to the NAVD88 datum.
These shapefiles contain the following fields: FID, Shape, Hatch, MHHW, MHW, MTL, MLW, and MLLW. The MHHW, MHW, MTL, MLW, and MLLW fields contain float type values representing the tidal datums calculated for each polygon rounded to the nearest tenth of a foot. The Hatch field contains a binary value (0 or 1), with 1 representing zones of uncertainty for tidal datums. These uncertain zones are either dynamic in terms of geomorphology or are restricted by smaller anthropogenic features (culverts, tide gates, etc.) that were not fully resolved in the MC-FRM. Zones with a 1 Hatch value may or may not contain tidal datum information. It is recommended that care be taken when utilizing the tidal benchmark information in these hatched zones and site-specific data observations (tide data) are recommended to verify the values in these areas. If datum information is not available, 9999 values are located in the datum fields for that polygon. The FID and Shape fields contain an ID number and shape type contained in each polygon.
The shapefiles provided are not intended to represent a spatial extent of the tidal benchmark (i.e., they do not present the geospatial location of water level). Rather, these shapefiles provide the tidal benchmark values that should be applied over each of the geospatial zones.
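The field conventions described above (a 9999 placeholder for missing datums and a Hatch value of 1 for uncertain zones) can be handled with a small cleaning step once the attribute table has been read. The following is a minimal sketch only: it assumes the attribute rows have already been loaded into plain Python dictionaries by whatever shapefile reader is in use, and the function name `clean_zone`, the `uncertain` key, and the sample values are hypothetical and not part of the dataset.

```python
NO_DATA = 9999  # placeholder the description says marks unavailable datums
DATUM_FIELDS = ("MHHW", "MHW", "MTL", "MLW", "MLLW")


def clean_zone(record):
    """Return a copy of one attribute row with placeholders replaced by None.

    Adds a boolean 'uncertain' key (hypothetical name) that is True when
    the Hatch field equals 1.
    """
    cleaned = dict(record)
    for field in DATUM_FIELDS:
        value = record.get(field)
        # The placeholder may be stored as 9999 or 9999.0, so compare as float.
        if value is None or float(value) == NO_DATA:
            cleaned[field] = None
    cleaned["uncertain"] = int(record.get("Hatch", 0)) == 1
    return cleaned


# Hypothetical rows; the numbers are invented for illustration only.
zones = [
    {"FID": 0, "Hatch": 0, "MHHW": 5.1, "MHW": 4.7, "MTL": 0.3, "MLW": -4.1, "MLLW": -4.5},
    {"FID": 1, "Hatch": 1, "MHHW": 9999.0, "MHW": 9999.0, "MTL": 9999.0, "MLW": 9999.0, "MLLW": 9999.0},
]

cleaned_zones = [clean_zone(z) for z in zones]

# Keep only zones that have an MHW value and are not flagged as uncertain.
usable = [z["FID"] for z in cleaned_zones if not z["uncertain"] and z["MHW"] is not None]
print(usable)
```

Replacing the placeholder with None before any averaging or plotting avoids treating 9999 as a real elevation in feet; hatched zones that do carry values are kept in `cleaned_zones` but flagged, since the description recommends verifying them against site-specific tide data.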
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
Since the late 1950s, the USGS has maintained a long-term glacier mass-balance program at three North American glaciers. Measurements began on South Cascade Glacier, WA in 1958, expanding to Gulkana and Wolverine glaciers, AK in 1966, and later Sperry Glacier, MT in 2005. Additional measurements have been made on Lemon Creek Glacier, AK to complement data collected by the Juneau Icefield Research Program (JIRP; Pelto and others, 2013). Direct field measurements of point glaciological data are combined with weather and geodetic data to estimate the seasonal and annual mass balance at each glacier in both a conventional and reference surface format (Cogley and others, 2011). The analysis framework (O'Neel, 2019; prior to v 3.0 van Beusekom and others, 2010) is identical at each glacier to enable cross-comparison between output time series. Vocabulary used follows Cogley and others (2011) Glossary of Glacier Mass Balance.
In 2024, most U.S. customers surveyed were satisfied with the driving performance of cars from mass-market nameplates. This benchmark had an ACSI® score of ** in 2024, which remained stable compared to 2023. Vehicle safety was tied as the benchmark recording the highest level of customer satisfaction in 2024.
Benchmark data acquired on a Thermo ISQ MS system, following methoximation and trimethylsilylation. Several datasets are derived from a common sample set. Samples are derived from NIST 1950 plasma, Cambridge Isotope Labs yeast extract (12C and 13C), a complex mixture of 96 pure authentic standards, and isotopically labelled internal standards. Data were acquired using GC-MS, reverse phase and HILIC LC-MS, on Orbitrap and Waters Q-ToF instruments in both positive and negative ionization mode.
Three-dimensional imaging mass spectrometry (3D imaging MS) is a technique of analytical chemistry for 3D molecular analysis of a tissue specimen, entire organ, or microbial colonies on an agar plate. 3D imaging MS has unique advantages over existing 3D imaging techniques, offers novel perspectives for understanding the spatial organization of biological processes, and has growing potential to be introduced into routine use in both biology and medicine. Due to the sheer quantity of data generated, visualization, analysis, and interpretation of 3D imaging MS data remain a significant challenge. Bioinformatics research in this field is hampered by the lack of publicly available benchmark datasets needed for evaluation and comparison of algorithms.
Findings: We acquired high-quality 3D imaging MS datasets from different biological systems at several labs, supplied them with overview images and scripts demonstrating how to read them, and deposited them into MetaboLights, an open repository for metabolomics data. 3D imaging MS data was collected from five samples using two types of 3D imaging MS. 3D Matrix-Assisted Laser Desorption/Ionization (MALDI) imaging MS data was collected from murine pancreas, murine kidney, human oral squamous cell carcinoma, and interacting microbial colonies cultured in Petri dishes. 3D Desorption Electrospray Ionization (DESI) imaging MS data was collected from a human colorectal adenocarcinoma.
Conclusions: With the aim to stimulate computational research in the field of computational 3D imaging MS, we provided selected high-quality 3D imaging MS datasets which can be used by algorithm developers as benchmark datasets.
Label Free Quantification (LFQ) of shotgun proteomics data is a popular and robust method for the characterization of relative protein abundance between samples. Many analytical pipelines exist for the automation of this analysis and some tools exist for the subsequent representation and inspection of the results of these pipelines. Mass Dynamics 1.0 (MD 1.0) is a web-based analysis environment that can analyse and visualize LFQ data produced by software such as MaxQuant. Unlike other tools, MD 1.0 utilizes cloud-based architecture to enable researchers to store their data, enabling researchers to not only automatically process and visualize their LFQ data but annotate and share their findings with collaborators and, if chosen, to easily publish results to the community. With a view toward increased reproducibility and standardisation in proteomics data analysis and streamlining collaboration between researchers, MD 1.0 requires minimal parameter choices and automatically generates quality control reports to verify experiment integrity. Here, we demonstrate that MD 1.0 provides reliable results for protein expression quantification, emulating Perseus on benchmark datasets over a wide dynamic range.
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
This dataset contains a bundle of 5 mass-market receivers (ublox Neo M8T) and three geodetic-grade receivers (Leica GNSS1200+GNSS, Septentrio PolaRx 5TR, Javad Delta TRE_G3T) combined in a zero baseline. The dataset captures 7 days of measurements with carrier phase, code phase, Doppler, and carrier-to-noise ratio (C/N0) for GPS/GLONASS C/A code on frequency L1 for the mass-market receivers and GPS/GLONASS/GALILEO L1/L2/L5 for the geodetic receivers. All geodetic receivers were fed by an external rubidium clock (SRS FS725 Benchmark).
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
Since the late 1950s, the USGS has maintained a long-term glacier mass-balance program at key North American glaciers. Measurements began on South Cascade Glacier, WA in 1958, expanding to Gulkana and Wolverine glaciers, AK in 1966, and later Sperry Glacier, MT in 2005. The Juneau Icefield Research Program has measured glacier mass balance on Lemon Creek since the mid-1940s, with USGS providing complementary seasonal measurements of Lemon Creek beginning in 2014 (JIRP; McNeil et al., 2020). Direct field measurements of point glaciological data are combined with weather and geodetic data to estimate the seasonal and annual mass balance at each glacier in both a conventional and reference surface format (Cogley and others, 2011). The analysis framework (O'Neel and others, 2019; Florentine and others, 2024; prior to v 3.0 van Beusekom and others, 2010) is identical at each glacier to enable cross-comparison between output time series. Vocabulary used follows Cogley and others (2011) ...
The dataset contains information on the statutory and plan-bid components of the regional and county Medicare Advantage (MA) benchmarks for the year 2023. The patient population that chooses MA includes individuals with a wide variation in health and disease status.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a de novo sequencing benchmark dataset derived from nine
publicly available mass spectrometry datasets. There are two versions
of the benchmark: main and balanced. The balanced version randomly
eliminates some spectra associated with some species in order to
create a smaller, more evenly balanced dataset. Also provided are two
zip files containing the raw data as well as intermediate results.
Details about how the benchmark was created are provided in an
associated zenodo release, which contains the source code as well as a
manuscript describing the benchmark.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
N represents the number of observations. D represents the number of model parameters.
To assess the variability of low-abundance oligonucleotide detection across sample matrices, we spiked DNA reference standards (meta sequins) into replicate wastewater DNA extracts at logarithmically decreasing mass-to-mass percentages (m/m%) and deeply sequenced them on the Illumina platform. This dataset summarizes the experimental conditions and results of the detection frequencies of those oligonucleotides as well as detailed descriptions of the DNA reference standards used. This dataset is associated with the following publication: Davis, B., P. Vikesland, and A. Pruden. Evaluating Quantitative Metagenomics for Environmental Monitoring of Antibiotic Resistance and Establishing Detection Limits. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, USA, 59(12): 6192-6202, (2025).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains all raw unprocessed quantum Monte Carlo data utilized in testing and benchmarking the pigsfli code available at https://github.com/DelMaestroGroup/pigsfli.
Directory Names
Directory names are encoded according to the following rule:
{dimension}D_{linear_size}_{total_particles}_{partition size}_{interaction potential}_{tunneling parameter}_{beta}_{number of bins}
File Names
Contains the state of the RNG:
1D_16_16_8_7.071100_1.000000_12.000000_10001_rng-state_0_square_2.dat
{dimension}D_{linear_size}_{total_particles}_{partition size}_{interaction potential}_{tunneling parameter}_{beta}_{number of bins}_rng-state_{seed}_{geometry of subregion}_{number_of_replicas}.dat
Contains the state of the system:
1D_16_16_8_7.071100_1.000000_12.000000_10001_system-state_0_square_2.dat
{dimension}D_{linear_size}_{total_particles}_{partition size}_{interaction potential}_{tunneling parameter}_{beta}_{number of bins}_system-state_{seed}_{geometry of subregion}_{number_of_replicas}.dat
Number of times each possible number of swapped sites was measured (each column is a number of swaps ranging from 0 to ℓ):
1D_16_16_8_7.071100_1.000000_12.000000_10001_SWAP_137_square.dat
{dimension}D_{linear_size}_{total_particles}_{partition size}_{interaction potential}_{tunneling parameter}_{beta}_{number of bins}_SWAP_{seed}_{geometry of subregion}.dat
For fixed number of swapped sites (mA), how many times each possible local particle number was measured (columns range from n=0,...,N):
1D_8_8_4_3.300000_1.000000_0.600000_10000_SWAPn-mA4_42_square.dat
{dimension}D_{linear_size}_{total_particles}_{partition size}_{interaction potential}_{tunneling parameter}_{beta}_{number of bins}_SWAPn-mA{number of swapped sites}_{seed}_{geometry of subregion}.dat
For fixed number of subregion sites (mA), how many times each possible local particle number was measured (columns range from n=0,...,N):
1D_8_8_4_3.300000_1.000000_0.600000_10000_Pn-mA4_42_square.dat
{dimension}D_{linear_size}_{total_particles}_{partition size}_{interaction potential}_{tunneling parameter}_{beta}_{number of bins}_Pn-mA{number of swapped sites}_{seed}_{geometry of subregion}.dat
For fixed number of subregion sites (mA), how many times each possible local particle number was measured simultaneously on both replicas when there were no swapped sites (columns range from n=0,...,N):
1D_8_8_4_3.300000_1.000000_0.600000_10000_PnSquared-mA4_42_square.dat
{dimension}D_{linear_size}_{total_particles}_{partition size}_{interaction potential}_{tunneling parameter}_{beta}_{number of bins}_PnSquared-mA{number of swapped sites}_{seed}_{geometry of subregion}.dat
Kinetic energy:
1D_8_8_4_3.300000_1.000000_2.000000_10000_K_93_square.dat
{dimension}D_{linear_size}_{total_particles}_{partition size}_{interaction potential}_{tunneling parameter}_{beta}_{number of bins}_K_{seed}_{geometry of subregion}.dat
Potential energy:
1D_8_8_4_3.300000_1.000000_2.000000_10000_V_93_square.dat
{dimension}D_{linear_size}_{total_particles}_{partition size}_{interaction potential}_{tunneling parameter}_{beta}_{number of bins}_V_{seed}_{geometry of subregion}.dat
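The file naming rules above can be decoded mechanically, which may help when sorting the raw files by simulation parameters. The following is a rough sketch rather than anything shipped with pigsfli: the function `parse_filename` and the dictionary keys are hypothetical names chosen for illustration, and the parser assumes that no individual field (in particular the subregion geometry) contains an underscore.

```python
def parse_filename(name):
    """Split a data file name into its encoded fields (illustrative only)."""
    stem = name[:-4] if name.endswith(".dat") else name
    parts = stem.split("_")
    # 11 fields for estimator files, 12 when a replica count is appended.
    if len(parts) not in (11, 12):
        raise ValueError(f"unexpected number of fields in {name!r}")
    if not parts[0].endswith("D"):
        raise ValueError(f"first field should look like '1D', got {parts[0]!r}")

    info = {
        "dimension": int(parts[0][:-1]),
        "linear_size": int(parts[1]),
        "total_particles": int(parts[2]),
        "partition_size": int(parts[3]),
        "interaction_potential": float(parts[4]),
        "tunneling_parameter": float(parts[5]),
        "beta": float(parts[6]),
        "number_of_bins": int(parts[7]),
        "observable": parts[8],
        "mA": None,
        "seed": int(parts[9]),
        "subregion_geometry": parts[10],
        "number_of_replicas": int(parts[11]) if len(parts) == 12 else None,
    }

    # Labels such as "SWAPn-mA4" carry a site count after "-mA".
    if "-mA" in info["observable"]:
        label, sites = info["observable"].split("-mA")
        info["observable"] = label
        info["mA"] = int(sites)
    return info


example = parse_filename(
    "1D_8_8_4_3.300000_1.000000_0.600000_10000_SWAPn-mA4_42_square.dat"
)
print(example["observable"], example["mA"], example["seed"])
```

The first eight fields follow the same order as the directory-name rule, so the same indices could be reused to decode a directory name if needed.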
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Advocacy around mass treatment for the elimination of selected Neglected Tropical Diseases (NTDs) has typically put the cost per person treated at less than US$ 0.50. Whilst useful for advocacy, the focus on a single number misrepresents the complexity of delivering “free” donated medicines to about a billion people across the world. We perform a literature review and meta-regression of the cost per person per round of mass treatment against NTDs. We develop a web-based software application (https://healthy.shinyapps.io/benchmark/) to calculate setting-specific unit costs against which programme budgets and expenditures or results-based pay-outs can be benchmarked.
Methods: We reviewed costing studies of mass treatment for the control, elimination or eradication of lymphatic filariasis, schistosomiasis, soil-transmitted helminthiasis, onchocerciasis, trachoma and yaws. These are the main 6 NTDs for which mass treatment is recommended. We extracted financial and economic unit costs, adjusted to a standard definition and base year. We regressed unit costs on the number of people treated and other explanatory variables. Regression results were used to “predict” country-specific unit cost benchmarks.
Results: We reviewed 56 costing studies and included in the meta-regression 34 studies from 23 countries and 91 sites. Unit costs were found to be very sensitive to economies of scale, and the decision of whether or not to use local volunteers. Financial unit costs are expected to be less than 2015 US$ 0.50 in most countries for programmes that treat 100 thousand people or more. However, for smaller programmes, including those in the “last mile”, or those that cannot rely on local volunteers, both economic and financial unit costs are expected to be higher.
Discussion: The available evidence confirms that mass treatment offers a low cost public health intervention on the path towards universal health coverage. However, more costing studies focussed on elimination are needed. Unit cost benchmarks can help in monitoring value for money in programme plans, budgets and accounts, or in setting a reasonable pay-out for results-based financing mechanisms.
Mountain glaciers are closely coupled to climate processes, ecosystems, and regional water resources. To enhance physical understanding of these connections, the USGS maintains a collection of glacier mass balance and climate data across the western United States and Alaska. In some cases, records of glacier mass balance extend back to the mid-1940s. These data have been incorporated from various sources, primarily original USGS studies, but also including work from the University of Alaska and the Juneau Icefield Research Program (JIRP). The core of this collection is composed of mass balance data from the USGS Benchmark Glaciers. These five glaciers are Lemon Creek Glacier, AK (1953 - Present), South Cascade Glacier, WA (1958 - Present), Gulkana and Wolverine glaciers, AK (1966 - Present), and Sperry Glacier, MT (2005 - Present). Datasets from each benchmark glacier are composed of, at a minimum, point mass balances, glacier hypsometry, daily temperature and precipitation, geodetic mass balances, and glacier-wide mass balances. Data from other glaciers within this collection may be less complete, continuous, or representative than data from the benchmark glaciers. In these cases, we urge users to carefully inspect the associated metadata of each specific data release for further details.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mass spectrometry-based metaproteomics, the identification and quantification of thousands of proteins expressed by complex microbial communities, has become pivotal for unraveling functional interactions within microbiomes. However, metaproteomics data analysis encounters many challenges, including the search of tandem mass spectra against a protein sequence database using proteomics database search algorithms. We used a ground-truth dataset to assess a spectral library searching method against established database searching approaches. Mass spectrometry data collected by data-dependent acquisition (DDA-MS) was analyzed using database searching approaches (MaxQuant and FragPipe), as well as using Scribe with Prosit predicted spectral libraries. We used FASTA databases that included protein sequences from microbial species present in the ground-truth dataset along with background protein sequences, to estimate error rates and assess the effects on detection, peptide-spectral match quality, and quantification. Using the Scribe search engine resulted in more proteins detected at a 1% false discovery rate (FDR) compared to MaxQuant or FragPipe, while FragPipe detected more peptides verified by PepQuery. Scribe was able to detect more low-abundance proteins in the microbiome dataset and was more accurate in quantifying the microbial community composition. This research provides insights and guidance for metaproteomics researchers aiming to optimize results in their analysis of DDA-MS data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Two sets of compounds used to benchmark various molecular fingerprint and molecular similarity methods.
compounds_ms2structures.csv
Contains 37811 different compounds for which public MS/MS spectra also exist (spectra not included here).
The data contains the SMILES, as well as inchikey, mass, classyfire class, subclass and superclass, formula, and for many entries also the npc class, pathway and superclass.
biostructures_combined.csv
This is a set of 730464 unique organic compounds and contains the SMILES. Most of these compounds are taken from the biostructures dataset used in Kretschmer et al. (2025), https://doi.org/10.1038/s41467-024-55462-w. In addition, we merged in all compounds from the compounds_ms2structures dataset that were missing.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These files are associated with the publication "Benchmarking the in vitro toxicity and chemical composition of plastic consumer products" published in Environmental Science & Technology at https://doi.org/10.1021/acs.est.9b02293.
Raw data files for the non-target chemical analysis performed on consumer plastics using GC-QTOF-MS (Agilent .D format).
FTIR spectra of the plastic products (pdf format).
Multiphysics Bench
Dataset: huggingface.co/datasets/Indulge-Bai/Multiphysics_Bench Paper: Multiphysics Bench: Benchmarking and Investigating Scientific Machine Learning for Multiphysics PDEs We propose the first general multiphysics benchmark dataset that encompasses six canonical coupled scenarios across domains such as electromagnetics, heat transfer, fluid flow, solid mechanics, pressure acoustics, and mass transport. This benchmark features the most comprehensive coupling types… See the full description on the dataset page: https://huggingface.co/datasets/Indulge-Bai/Multiphysics_Bench.