17 datasets found

WLCI - Important Agricultural Lands Assessment (Input Raster: Normalized...
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). WLCI - Important Agricultural Lands Assessment (Input Raster: Normalized Antelope Damage Claims) [Dataset]. https://catalog.data.gov/dataset/wlci-important-agricultural-lands-assessment-input-raster-normalized-antelope-damage-claim
Dataset updated
Jul 6, 2024
Dataset provided by
U.S. Geological Survey
Description
The values in this raster are unit-less scores ranging from 0 to 1 that represent normalized dollars per acre damage claims from antelope on Wyoming lands. This raster is one of 9 inputs used to calculate the "Normalized Importance Index."
Identification of parameters in normal error component logit-mixture (NECLM)...
journaldata.zbw.eu
Updated Nov 15, 2022
Cite
Joan L. Walker; Moshe Ben-Akiva; Denis Bolduc (2022). Identification of parameters in normal error component logit-mixture (NECLM) models (replication data) [Dataset]. https://journaldata.zbw.eu/dataset/identification-of-parameters-in-normal-error-component-logitmixture-neclm-models?activity_id=59ef31c7-ad1c-4bcd-8e84-9ca56dc00bf2
Dataset updated
Nov 15, 2022
Dataset provided by
ZBW - Leibniz Informationszentrum Wirtschaft
Authors
Joan L. Walker; Moshe Ben-Akiva; Denis Bolduc
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Although the basic structure of logit-mixture models is well understood, important identification and normalization issues often get overlooked. This paper addresses issues related to the identification of parameters in logit-mixture models containing normally distributed error components associated with alternatives or nests of alternatives (normal error component logit mixture, or NECLM, models). NECLM models include special cases such as unrestricted, fixed covariance matrices; alternative-specific variances; nesting and cross-nesting structures; and some applications to panel data. A general framework is presented for determining which parameters are identified as well as what normalization to impose when specifying NECLM models. It is generally necessary to specify and estimate NECLM models at the levels, or structural, form. This precludes working with utility differences, which would otherwise greatly simplify the identification and normalization process. Our results show that identification is not always intuitive; for example, normalization issues present in logit-mixture models are not present in analogous probit models. To identify and properly normalize the NECLM, we introduce the equality condition, an addition to the standard order and rank conditions. The identifying conditions are worked through for a number of special cases, and our findings are demonstrated with empirical examples using both synthetic and real data.
Data from: Targeted Workflow Investigating Variations in the Tear Proteome...
acs.figshare.com
bin
Updated Aug 14, 2023
Cite
Maggy Lépine; Oriana Zambito; Lekha Sleno (2023). Targeted Workflow Investigating Variations in the Tear Proteome by Liquid Chromatography Tandem Mass Spectrometry [Dataset]. http://doi.org/10.1021/acsomega.3c03186.s001
Available download formats: bin
Unique identifier
https://doi.org/10.1021/acsomega.3c03186.s001
Dataset updated
Aug 14, 2023
Dataset provided by
ACS Publications
Authors
Maggy Lépine; Oriana Zambito; Lekha Sleno
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Proteins in tears have an important role in eye health and have been shown as a promising source of disease biomarkers. The goal of this study was to develop a robust, sensitive, and targeted method for profiling tear proteins to examine the variability within a group of healthy volunteers over three days. Inter-individual and inter-day variabilities were examined to contribute to understanding the normal variations in the tear proteome, as well as to establish which proteins may be better candidates as eventual biomarkers of specific diseases. Tear samples collected on Schirmer strips were subjected to bottom-up proteomics, and resulting peptides were analyzed using an optimized targeted method measuring 226 proteins by liquid chromatography-scheduled multiple reaction monitoring. This method was developed using an in-house database of identified proteins from tears compiled from high-resolution data-dependent liquid chromatography tandem mass spectrometry data. The measurement of unique peptide signals can help better understand the dynamics of each of these proteins in tears. Some interesting trends were seen in specific pathways or protein classes, including higher variabilities for those involved in glycolysis, glutathione metabolism, and cytoskeleton proteins and lower variation for those involving the degradation of the extracellular matrix. The overall aim of this study was to contribute to the field of tear proteomics with the development of a novel and targeted method that is highly amenable to the clinical laboratory using high flow LC and commonly used triple quadrupole mass spectrometry while ensuring that protein quantitation was reported based on unique peptides for each protein and robust peak areas with data normalization. These results report on variabilities on over 200 proteins that are robustly detected in tear samples from healthy volunteers with a simple sample preparation procedure.
Transformer Network trained on Simulated X-ray photoelectron spectroscopy...
researchdata.tuwien.at
Updated Jul 1, 2025
Cite
Florian Simperl (2025). Transformer Network trained on Simulated X-ray photoelectron spectroscopy data for organic and inorganic compounds [Dataset]. http://doi.org/10.48436/mvrkc-dz146
Unique identifier
https://doi.org/10.48436/mvrkc-dz146
Dataset updated
Jul 1, 2025
Dataset provided by
TU Wien
Authors
Florian Simperl
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset Description

This data repository provides the underlying data and neural network training scripts associated with the manuscript titled "A Transformer Network for High-Throughput Material Characterisation with X-ray Photoelectron Spectroscopy" by Simperl and Werner.

All data files are released under the Creative Commons Attribution 4.0 International (CC-BY) license, while all code files are distributed under the MIT license.

The repository contains simulated X-ray photoelectron spectroscopy (XPS) spectra stored as hdf5 files in the zipped (h5_files.zip) folder, which was generated using the software developed by the authors. The NIST Standard Reference Database 100 – Simulation of Electron Spectra for Surface Analysis (SESSA) is freely available at https://www.nist.gov/srd/nist-standard-reference-database-100.

The neural network architecture is implemented using the PyTorch Lightning framework and is fully available within the attached materials as Transformer_SimulatedSpectra.py contained in the python_scripts.zip.

The trained model and the list of materials for the train, test and validation sets are contained in the models.zip folder.

The repository contains all the data necessary to replot the figures from the manuscript. These data are available as .csv files, or as .h5 files for the spectra. The repository also contains a Jupyter notebook (Plot_Data_Manuscript.ipynb), which is included in the python_scripts.zip file.

Context and methodology

The dataset and accompanying Python code files included in this repository were used to train a transformer-based neural network capable of directly inferring chemical concentrations from simulated survey X-ray photoelectron spectroscopy (XPS) spectra of bulk compounds.

The spectral dataset provided here represents the raw output from the SESSA software (version 2.2.2), prior to the normalization procedure described in the associated manuscript. This step of normalisation is of paramount importance for the effective training of the neural network.

The repository contains the Python scripts utilised to execute the spectral simulations and the neural network training on the Vienna Scientific Cluster (VSC5). In order to obtain guidance on the proper configuration of the Command Line Interface (CLI) tools required for SESSA, users are advised to consult the official SESSA manual, which is available at the following address: https://nvlpubs.nist.gov/nistpubs/NSRDS/NIST.NSRDS.100-2024.pdf.

For the neural network training, we provide the requirements_nn_training.txt file, which lists all the necessary Python packages and version numbers. All other Python scripts can be run locally with the Python libraries listed in requirements_data_analysis.txt.

Data details

HDF5 (in zip folder): As described in the manuscript, we simulate X-ray photoelectron spectra for each of the 7,587 inorganic [1] and organic [2] materials in our dataset. To reflect realistic experimental conditions, each simulated spectrum was augmented by systematically varying parameters such as peak width, peak shift, and peak type (all configurable within the SESSA software), as well as by applying statistical Poisson noise to simulate varying signal-to-noise ratios. These modifications account for experimentally observed and material-specific spectral broadening, peak shifts, and detector-induced noise. Each material is represented by an individual HDF5 (.h5) file, named according to its chemical formula and mass density (in g/cm³). For example, the file for SiO2 with a density of 2.196 g/cm³ is named SiO2_2.196.h5. For more complex chemical formulas, such as Co(ClO4)2 with a density of 3.33 g/cm³, the file is named Co_ClO4_2_3.33.h5. Within each HDF5 file, the metadata for each spectrum is stored alongside a fixed energy axis and the corresponding intensity values. The spectral data are organized hierarchically by augmentation parameters; for Ac_10.0.h5, for example, the group path is SNR_0/WIDTH_0.3/SHIFT_-3.0/PEAK_gauss/Ac_10.0/. These files can be inspected with H5Web in Visual Studio Code, with h5py in Python, or with any other program that can read HDF5 files.
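
The naming and grouping conventions described above can be sketched in a few lines of Python. This is only an illustration: the rule "replace parentheses with underscores" is inferred from the two file names given in the description, and the helper names h5_filename and parse_group_path are hypothetical, not part of the repository.

```python
import re


def h5_filename(formula: str, density: str) -> str:
    """Build a file name from a chemical formula and a density string.

    Assumption: parentheses become underscores, as suggested by the
    examples SiO2_2.196.h5 and Co_ClO4_2_3.33.h5.
    """
    safe = re.sub(r"[()]", "_", formula).strip("_")
    return f"{safe}_{density}.h5"


def parse_group_path(path: str) -> dict:
    """Split a group path such as SNR_0/WIDTH_0.3/.../Ac_10.0/ into its parts."""
    parts = [p for p in path.split("/") if p]
    params = {}
    # Every component except the last is read as KEY_value.
    for part in parts[:-1]:
        key, _, value = part.partition("_")
        params[key] = value
    # The last component names the material (formula and density).
    params["material"] = parts[-1]
    return params


name = h5_filename("Co(ClO4)2", "3.33")
info = parse_group_path("SNR_0/WIDTH_0.3/SHIFT_-3.0/PEAK_gauss/Ac_10.0/")
```

The parsed values are kept as strings so that no assumption is made about their numeric meaning.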

Session Files: The .ses files are SESSA-specific input files that can be loaded directly into SESSA to specify input parameters for the initialization (ini), the geometry (geo) and the simulation parameters (sim_para). They are required by the Python script Simulation_Script_VSC_json.py to run the simulation on the cluster.

Json Files: The two json files (MaterialsListVSC_gauss.json, MaterialsListVSC_lorentz.json) are used as the input files to the Python script Simulation_Script_VSC_json.py. These files contain all the material specific information for the SESSA simulation.

csv files: The csv files are used to generate the plots from the manuscript described in the section "Plotting Scripts".

npz files: The two .npz files (element_counts.npz, single_elements.npz) contain NumPy arrays that are needed by the Transformer_SimulatedSpectra.py script. They hold the number of occurrences of each single element in the dataset and an array of each single element present, respectively.

SESSA Simulation Script

One Python file handles the communication with SESSA:

Simulation_Script_VSC_json.py: This script is the heart of the simulation. It controls the communication with SESSA through the CLI, using the input parameters specified in the .json and .ses files together with external functions defined in VSC_function.py.

Technical Details

Simulation_Script_VSC_json.py: This script uses functions from VSC_function.py, which must therefore be placed in the same directory. It can be called with the following command:

python3 Simulation_Script_VSC_json.py MaterialsListVSC_gauss.json 0

It simulates the spectrum for the material at index 0 in the .json file and with the corresponding parameters specified in the .json file.
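
As a rough illustration of the index argument in the command above, the following sketch selects one entry from a JSON document by position. It assumes, without confirmation from the repository, that the .json file holds a top-level list with one object per material. The function name select_material and the field names in the sample are hypothetical, and the real Simulation_Script_VSC_json.py may work differently.

```python
import json


def select_material(json_text: str, index: int):
    """Return the entry at position `index` of a JSON list (hypothetical helper)."""
    materials = json.loads(json_text)
    # Assumption: the top level is a list with one entry per material.
    if not isinstance(materials, list):
        raise TypeError("this sketch assumes a top-level JSON list")
    if not 0 <= index < len(materials):
        raise IndexError(f"index {index} is outside 0..{len(materials) - 1}")
    return materials[index]


# Made-up sample data; the real files contain SESSA-specific fields.
SAMPLE = '[{"formula": "SiO2", "density": 2.196}, {"formula": "Ac", "density": 10.0}]'
first = select_material(SAMPLE, 0)
```

Checking the index explicitly gives a clear error when a cluster job array requests more materials than the file contains.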

Before running this script, the following paths must be specified:

sessa_path: The path to the SESSA installation.

folder_path: The path to the .ses session files. An output folder is created in this directory, and all output files, including the simulated spectra, are written to it.

To run SESSA on a computing cluster, a working Xvfb (X virtual framebuffer) or a similar tool must be available to receive any graphical output from SESSA.

Neural Network Training Script

Before running the training script, the data must be normalized such that the squared integral of the spectrum is 1, as described in the manuscript and implemented in normalize_spectra.py.
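
One way to read the normalization condition above is that the integral of the squared intensity over the energy axis equals 1. The sketch below implements that reading with a plain trapezoidal rule. Both the interpretation and the integration rule are assumptions, the function names are hypothetical, and normalize_spectra.py in the repository remains the authoritative implementation.

```python
import math


def trapezoid(y, x):
    """Integrate y over x with the trapezoidal rule."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def normalize_squared_integral(energy, intensity):
    """Scale `intensity` so that the integral of intensity**2 over `energy` is 1."""
    norm = trapezoid([v * v for v in intensity], energy)
    if norm <= 0:
        raise ValueError("spectrum has no positive squared integral")
    scale = 1.0 / math.sqrt(norm)
    return [v * scale for v in intensity]


# Made-up toy spectrum on a four-point energy axis.
energy = [0.0, 1.0, 2.0, 3.0]
intensity = [1.0, 3.0, 2.0, 4.0]
normalized = normalize_squared_integral(energy, intensity)
```

Dividing by the square root of the squared integral keeps the spectral shape unchanged and only rescales the amplitude.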

For the neural network training we use Transformer_SimulatedSpectra.py; the external functions it uses are defined in external_functions.py. This script contains the full description of the neural network architecture, the hyperparameter tuning and the Wandb logging.

The models.zip folder contains the fully trained network presented in the manuscript (final_trained_model.ckpt) as well as the lists of training, validation and testing materials (train_materials_list.pt, val_materials_list.pt, test_materials_list.pt), for which the corresponding spectra are extracted from the hdf5 files. The file types .ckpt and .pt can be read with the PyTorch-specific load functions in Python, e.g.

torch.load("train_materials_list.pt")

Technical Details

normalize_spectra.py: To run this script properly it is important to set up a python environment with the necessary libraries specified in the requirements_data_analysis.txt file. Then it can be called with

python3 normalize_spectra.py

The path to the .h5 files containing the unnormalized spectra must be specified before running it.

Transformer_SimulatedSpectra.py: To run this script properly on the cluster, set up a Python environment with the necessary libraries specified in the requirements_nn_training.txt file. This script also relies on the files external_functions.py, single_elements.npz and element_counts.npz, which should be placed in the same directory as the Python script. These files are needed to create the datasets for training, validation and testing, and they ensure that all the single elements appear in the testing set. You can call this script (on the cluster) within a Slurm script to start the GPU training:

python3 Transformer_SimulatedSpectra.py

Before running this script, the following paths must be specified:

data_path: General path where all the data is stored

neural_network_data: The location where you keep your normalized hdf5 files

wandb_api_key: The api key to use wandb

ray_tesults: The location where you want to save your tuning results

checkpoints: The location where you want to save your ray
Residential Existing Homes (One to Four Units) Energy Efficiency Meter...
catalog.data.gov
datasets.ai
Updated Sep 15, 2023
Cite
data.ny.gov (2023). Residential Existing Homes (One to Four Units) Energy Efficiency Meter Evaluated Project Data: 2007 – 2012 [Dataset]. https://catalog.data.gov/dataset/residential-existing-homes-one-to-four-units-energy-efficiency-meter-evaluated-projec-2007
Dataset updated
Sep 15, 2023
Dataset provided by
data.ny.gov
Description
IMPORTANT! PLEASE READ DISCLAIMER BEFORE USING DATA. This dataset backcasts estimated modeled savings for a subset of 2007-2012 completed projects in the Home Performance with ENERGY STAR® Program against normalized savings calculated by an open source energy efficiency meter available at https://www.openee.io/. Open source code uses utility-grade metered consumption to weather-normalize the pre- and post-consumption data using standard methods with no discretionary independent variables. The open source energy efficiency meter allows private companies, utilities, and regulators to calculate energy savings from energy efficiency retrofits with increased confidence and replicability of results. This dataset is intended to lay a foundation for future innovation and deployment of the open source energy efficiency meter across the residential energy sector, and to help inform stakeholders interested in pay for performance programs, where providers are paid for realizing measurable weather-normalized results. To download the open source code, please visit the website at https://github.com/openeemeter/eemeter/releases

DISCLAIMER: Normalized Savings using open source OEE meter. Several data elements, including Evaluated Annual Electric Savings (kWh), Evaluated Annual Gas Savings (MMBtu), Pre-retrofit Baseline Electric (kWh), Pre-retrofit Baseline Gas (MMBtu), Post-retrofit Usage Electric (kWh), and Post-retrofit Usage Gas (MMBtu), are direct outputs from the open source OEE meter.

Home Performance with ENERGY STAR® Estimated Savings. Several data elements, including Estimated Annual kWh Savings, Estimated Annual MMBtu Savings, and Estimated First Year Energy Savings, represent contractor-reported savings derived from energy modeling software calculations and not actual realized energy savings. The accuracy of the Estimated Annual kWh Savings and Estimated Annual MMBtu Savings for projects has been evaluated by an independent third party. The results of the Home Performance with ENERGY STAR impact analysis indicate that, on average, actual savings amount to 35 percent of the Estimated Annual kWh Savings and 65 percent of the Estimated Annual MMBtu Savings. For more information, please refer to the Evaluation Report published on NYSERDA’s website at: http://www.nyserda.ny.gov/-/media/Files/Publications/PPSER/Program-Evaluation/2012ContractorReports/2012-HPwES-Impact-Report-with-Appendices.pdf.

This dataset includes the following data points for a subset of projects completed in 2007-2012: Contractor ID, Project County, Project City, Project ZIP, Climate Zone, Weather Station, Weather Station-Normalization, Project Completion Date, Customer Type, Size of Home, Volume of Home, Number of Units, Year Home Built, Total Project Cost, Contractor Incentive, Total Incentives, Amount Financed through Program, Estimated Annual kWh Savings, Estimated Annual MMBtu Savings, Estimated First Year Energy Savings, Evaluated Annual Electric Savings (kWh), Evaluated Annual Gas Savings (MMBtu), Pre-retrofit Baseline Electric (kWh), Pre-retrofit Baseline Gas (MMBtu), Post-retrofit Usage Electric (kWh), Post-retrofit Usage Gas (MMBtu), Central Hudson, Consolidated Edison, LIPA, National Grid, National Fuel Gas, New York State Electric and Gas, Orange and Rockland, Rochester Gas and Electric.

How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.
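
The 35 percent and 65 percent figures quoted in the disclaimer are program-wide averages, and the short sketch below only shows the arithmetic they imply for a contractor estimate. The function name and the sample numbers are made up for illustration, and an individual project may deviate substantially from these averages.

```python
# Average realization rates quoted in the dataset description.
KWH_REALIZATION = 0.35
MMBTU_REALIZATION = 0.65


def expected_realized_savings(estimated_kwh: float, estimated_mmbtu: float):
    """Scale contractor-estimated savings by the average realization rates."""
    return (estimated_kwh * KWH_REALIZATION, estimated_mmbtu * MMBTU_REALIZATION)


# Hypothetical project: 1000 kWh and 20 MMBtu of estimated annual savings.
realized_kwh, realized_mmbtu = expected_realized_savings(1000.0, 20.0)
```

For this hypothetical project the averages would suggest roughly 350 kWh and 13 MMBtu of realized annual savings.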
‘Residential Existing Homes (One to Four Units) Energy Efficiency Meter...
analyst-2.ai
Updated Sep 20, 2018
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2018). ‘Residential Existing Homes (One to Four Units) Energy Efficiency Meter Evaluated Project Data: 2007 – 2012’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-residential-existing-homes-one-to-four-units-energy-efficiency-meter-evaluated-project-data-2007-2012-25d8/e044eb3c/?iid=033-758&v=presentation
Dataset updated
Sep 20, 2018
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Residential Existing Homes (One to Four Units) Energy Efficiency Meter Evaluated Project Data: 2007 – 2012’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/f0c9c585-5788-4b49-83b9-3733ea7b5e30 on 12 February 2022.

--- Dataset description provided by original source is as follows ---

IMPORTANT! PLEASE READ DISCLAIMER BEFORE USING DATA. This dataset backcasts estimated modeled savings for a subset of 2007-2012 completed projects in the Home Performance with ENERGY STAR® Program against normalized savings calculated by an open source energy efficiency meter available at https://www.openee.io/. Open source code uses utility-grade metered consumption to weather-normalize the pre- and post-consumption data using standard methods with no discretionary independent variables. The open source energy efficiency meter allows private companies, utilities, and regulators to calculate energy savings from energy efficiency retrofits with increased confidence and replicability of results. This dataset is intended to lay a foundation for future innovation and deployment of the open source energy efficiency meter across the residential energy sector, and to help inform stakeholders interested in pay for performance programs, where providers are paid for realizing measurable weather-normalized results. To download the open source code, please visit the website at https://github.com/openeemeter/eemeter/releases

DISCLAIMER: Normalized Savings using open source OEE meter. Several data elements, including Evaluated Annual Electric Savings (kWh), Evaluated Annual Gas Savings (MMBtu), Pre-retrofit Baseline Electric (kWh), Pre-retrofit Baseline Gas (MMBtu), Post-retrofit Usage Electric (kWh), and Post-retrofit Usage Gas (MMBtu), are direct outputs from the open source OEE meter.

Home Performance with ENERGY STAR® Estimated Savings. Several data elements, including, Estimated Annual kWh Savings, Estimated Annual MMBtu Savings, and Estimated First Year Energy Savings represent contractor-reported savings derived from energy modeling software calculations and not actual realized energy savings. The accuracy of the Estimated Annual kWh Savings and Estimated Annual MMBtu Savings for projects has been evaluated by an independent third party. The results of the Home Performance with ENERGY STAR impact analysis indicate that, on average, actual savings amount to 35 percent of the Estimated Annual kWh Savings and 65 percent of the Estimated Annual MMBtu Savings. For more information, please refer to the Evaluation Report published on NYSERDA’s website at: http://www.nyserda.ny.gov/-/media/Files/Publications/PPSER/Program-Evaluation/2012ContractorReports/2012-HPwES-Impact-Report-with-Appendices.pdf.

This dataset includes the following data points for a subset of projects completed in 2007-2012: Contractor ID, Project County, Project City, Project ZIP, Climate Zone, Weather Station, Weather Station-Normalization, Project Completion Date, Customer Type, Size of Home, Volume of Home, Number of Units, Year Home Built, Total Project Cost, Contractor Incentive, Total Incentives, Amount Financed through Program, Estimated Annual kWh Savings, Estimated Annual MMBtu Savings, Estimated First Year Energy Savings, Evaluated Annual Electric Savings (kWh), Evaluated Annual Gas Savings (MMBtu), Pre-retrofit Baseline Electric (kWh), Pre-retrofit Baseline Gas (MMBtu), Post-retrofit Usage Electric (kWh), Post-retrofit Usage Gas (MMBtu), Central Hudson, Consolidated Edison, LIPA, National Grid, National Fuel Gas, New York State Electric and Gas, Orange and Rockland, Rochester Gas and Electric.

How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.

--- Original source retains full ownership of the source dataset ---
Extracting Clinical Significance for Drug-Gene Interactions using FDA Label...
zenodo.org
bin, csv, png
Updated Jan 28, 2025
Cite
Matthew Cannon (2025). Extracting Clinical Significance for Drug-Gene Interactions using FDA Label Packages [Dataset]. http://doi.org/10.5281/zenodo.14757436
Available download formats: bin, png, csv
Unique identifier
https://doi.org/10.5281/zenodo.14757436
Dataset updated
Jan 28, 2025
Dataset provided by
Zenodo http://zenodo.org/
Authors
Matthew Cannon
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The drug-gene interaction database (DGIdb) is a resource that aggregates interaction data from over 40 different resources into one platform with the primary goal of making the druggable genome accessible to clinicians and researchers. By providing a public, computationally accessible database, DGIdb enables therapeutic insights through broad aggregation of drug-gene interaction data.

As part of our aggregation process, DGIdb preserves data regarding interaction types, directionality, and other attributes that enable filtering or biochemical insight. However, source data are often incomplete and may not contain the therapeutic relevance of the interaction. In this report, we address these missing data and demonstrate a pipeline for extracting physiological context from free-text sources. We apply existing large language models (LLMs) to tag and extract indications, cancer types, and relevant pharmacogenomics from free-text, FDA approved labels. We are then able to utilize the Variant Interpretation for Cancer Consortium (VICC) normalization services to ground extracted data back to formally grouped concepts.

In a preliminary test set of 355 FDA labels, we were able to normalize 86.5% of extracted chemical entities back to ontologically-grounded therapeutic concepts. We can link this therapeutic context data back to interaction records already searchable within DGIdb. By using LLMs to extract this data set, we can supplement our existing interaction data with relevant indications, pharmacogenomic data and mutational statuses that may inform the therapeutic relevance of a particular interaction. Inclusion of these data will be invaluable for variant interpretation pipelines where mutational status can lead to the identification of a lifesaving therapeutic.
DataSheet1_Validation of Reference Genes for Quantitative Real-Time PCR...
frontiersin.figshare.com
zip
Updated Jun 5, 2023
Cite
Meiqin Mao; Yanbin Xue; Yehua He; Xuzixing Zhou; Hao Hu; Jiawen Liu; Lijun Feng; Wei Yang; Jiaheng Luo; Huiling Zhang; Xi Li; Jun Ma (2023). DataSheet1_Validation of Reference Genes for Quantitative Real-Time PCR Normalization in Ananas comosus var. bracteatus During Chimeric Leaf Development and Response to Hormone Stimuli.ZIP [Dataset]. http://doi.org/10.3389/fgene.2021.716137.s001
Available download formats: zip
Unique identifier
https://doi.org/10.3389/fgene.2021.716137.s001
Dataset updated
Jun 5, 2023
Dataset provided by
Frontiers
Authors
Meiqin Mao; Yanbin Xue; Yehua He; Xuzixing Zhou; Hao Hu; Jiawen Liu; Lijun Feng; Wei Yang; Jiaheng Luo; Huiling Zhang; Xi Li; Jun Ma
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Reverse transcription quantitative real-time PCR (RT-qPCR) is a common way to study gene regulation at the transcriptional level due to its sensitivity and specificity, but it needs appropriate reference genes to normalize data. Ananas comosus var. bracteatus, with white-green chimeric leaves, is an important pantropical ornamental plant. To date, no reference genes have been evaluated in Ananas comosus var. bracteatus. In this work, we used five common statistical tools (geNorm, NormFinder, BestKeeper, ΔCt method, RefFinder) to evaluate 10 candidate reference genes. The results showed that Unigene.16454 and Unigene.16459 were the optimal reference genes for different tissues, Unigene.16454 and zinc finger ran-binding domain-containing protein 2 (ZRANB2) for chimeric leaves at different developmental stages, and isocitrate dehydrogenase NADP (IDH) and triacylglycerol lipase SDP1-like (SDP) for seedlings under different hormone treatments. The comprehensive results showed that IDH, pentatricopeptide repeat-containing protein (PPRC), Unigene.16454, and caffeoyl-CoA O methyltransferase 5-like (CCOAOMT) are the top-ranked stable genes across all the samples. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was the least stable gene in all experiments. Furthermore, the reliability of the recommended reference genes was validated by detecting porphobilinogen deaminase (HEMC) expression levels in chimeric leaves. Overall, this study provides appropriate reference genes under three specific experimental conditions and will be useful for future research on spatial and temporal regulation of gene expression and multiple hormone regulation pathways in Ananas comosus var. bracteatus.
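
To illustrate what normalizing RT-qPCR data to a reference gene means in practice, here is a minimal sketch of the widely used 2^-ΔCt and 2^-ΔΔCt calculations. It assumes roughly 100% amplification efficiency, and the Ct values and function names are made up for illustration. It is not the stability analysis performed by geNorm, NormFinder, BestKeeper, or RefFinder.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Return 2**-(Ct_target - Ct_reference), assuming ~100% efficiency."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Return the 2**-ddCt fold change of treated versus control samples."""
    treated = relative_expression(ct_target_treated, ct_ref_treated)
    control = relative_expression(ct_target_control, ct_ref_control)
    return treated / control


# Hypothetical Ct values: the target amplifies two cycles earlier after treatment.
change = fold_change(24.0, 20.0, 26.0, 20.0)
```

An unstable reference gene shifts ct_reference between conditions and distorts the fold change, which is why reference genes are validated before use.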
Train and validation AUC accuracy per class of all methods and alternative...
plos.figshare.com
xlsx
Updated Mar 26, 2025
Cite
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin (2025). Train and validation AUC accuracy per class of all methods and alternative normalisations on RAPIDS dataset, where models were fully trained on all data in the train partition. [Dataset]. http://doi.org/10.1371/journal.pdig.0000780.s015
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pdig.0000780.s015
Dataset updated
Mar 26, 2025
Dataset provided by
PLOS Digital Health
Authors
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Train and validation AUC accuracy per class of all methods and alternative normalisations on RAPIDS dataset, where models were fully trained on all data in the train partition.
Per-fold number of features on Álvez dataset, weighted and unweighted...
plos.figshare.com
xlsx
Updated Mar 26, 2025
Cite
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin (2025). Per-fold number of features on Álvez dataset, weighted and unweighted models. [Dataset]. http://doi.org/10.1371/journal.pdig.0000780.s010
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pdig.0000780.s010
Dataset updated
Mar 26, 2025
Dataset provided by
PLOS Digital Health
Authors
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Per-fold number of features on Álvez dataset, weighted and unweighted models.
Per-fold number of features on RAPIDS dataset, weighted and unweighted...
plos.figshare.com
xlsx
Updated Mar 26, 2025
Cite
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin (2025). Per-fold number of features on RAPIDS dataset, weighted and unweighted models. [Dataset]. http://doi.org/10.1371/journal.pdig.0000780.s008
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pdig.0000780.s008
Dataset updated
Mar 26, 2025
Dataset provided by
PLOS Digital Health
Authors
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Per-fold number of features on RAPIDS dataset, weighted and unweighted models.
Sensitivity, Specificity, and AUC accuracy per class of FS-PLS vs StepAIC on...
plos.figshare.com
xlsx
Updated Mar 26, 2025
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin (2025). Sensitivity, Specificity, and AUC accuracy per class of FS-PLS vs StepAIC on RAPIDS dataset. [Dataset]. http://doi.org/10.1371/journal.pdig.0000780.s013
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pdig.0000780.s013
Dataset updated
Mar 26, 2025
Dataset provided by
PLOS Digital Health
Authors
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Sensitivity, Specificity, and AUC accuracy per class of FS-PLS vs StepAIC on RAPIDS dataset.
Sensitivity, Specificity, and AUC accuracy per class of FS-PLS vs StepAIC on...
plos.figshare.com
xlsx
Updated Mar 26, 2025
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin (2025). Sensitivity, Specificity, and AUC accuracy per class of FS-PLS vs StepAIC on Álvez dataset. [Dataset]. http://doi.org/10.1371/journal.pdig.0000780.s014
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pdig.0000780.s014
Dataset updated
Mar 26, 2025
Dataset provided by
PLOS Digital Health
Authors
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Sensitivity, Specificity, and AUC accuracy per class of FS-PLS vs StepAIC on Álvez dataset.
Per-fold AUC accuracy of FS-PLS vs Stagewise-FS on binary datasets.
plos.figshare.com
figshare.com
xlsx
Updated Mar 26, 2025
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin (2025). Per-fold AUC accuracy of FS-PLS vs Stagewise-FS on binary datasets. [Dataset]. http://doi.org/10.1371/journal.pdig.0000780.s012
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pdig.0000780.s012
Dataset updated
Mar 26, 2025
Dataset provided by
PLOS Digital Health
Authors
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Per-fold AUC accuracy of FS-PLS vs Stagewise-FS on binary datasets.
Per-fold AUC accuracy of all models on binary datasets.
figshare.com
plos.figshare.com
xlsx
Updated Mar 26, 2025
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin (2025). Per-fold AUC accuracy of all models on binary datasets. [Dataset]. http://doi.org/10.1371/journal.pdig.0000780.s006
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pdig.0000780.s006
Dataset updated
Mar 26, 2025
Dataset provided by
PLOS Digital Health
Authors
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Per-fold AUC accuracy of all models on binary datasets.
Summary of the ten features selected by FS-PLS in the RAPIDS...
plos.figshare.com
xlsx
Updated Mar 26, 2025
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin (2025). Summary of the ten features selected by FS-PLS in the RAPIDS ordinary-normalisation signature, including their biological roles. [Dataset]. http://doi.org/10.1371/journal.pdig.0000780.s005
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pdig.0000780.s005
Dataset updated
Mar 26, 2025
Dataset provided by
PLOS Digital Health
Authors
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Summary of the ten features selected by FS-PLS in the RAPIDS ordinary-normalisation signature, including their biological roles.
Per-fold number of features selected by all models on binary datasets.
plos.figshare.com
xlsx
Updated Mar 26, 2025
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin (2025). Per-fold number of features selected by all models on binary datasets. [Dataset]. http://doi.org/10.1371/journal.pdig.0000780.s007
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pdig.0000780.s007
Dataset updated
Mar 26, 2025
Dataset provided by
PLOS Digital Health
Authors
Daniel Rawlinson; Chenxi Zhou; Myrsini Kaforou; Kim-Anh Lê Cao; Lachlan J. M. Coin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Per-fold number of features selected by all models on binary datasets.