21 datasets found

REMNet Tutorial, R Part 5: Normalizing Microbiome Data in R 5.2.19
qubeshub.org
Updated Aug 28, 2019
Cite
Jessica Joyner (2019). REMNet Tutorial, R Part 5: Normalizing Microbiome Data in R 5.2.19 [Dataset]. http://doi.org/10.25334/M13H-XT81
Unique identifier
https://doi.org/10.25334/M13H-XT81
Dataset updated
Aug 28, 2019
Dataset provided by
QUBES
Authors
Jessica Joyner
Description
Video on normalizing microbiome data from the Research Experiences in Microbiomes Network
Methods for normalizing microbiome data: an ecological perspective
data.niaid.nih.gov
search.dataone.org
zip
Updated Oct 30, 2018
Cite
Donald T. McKnight; Roger Huerlimann; Deborah S. Bower; Lin Schwarzkopf; Ross A. Alford; Kyall R. Zenger (2018). Methods for normalizing microbiome data: an ecological perspective [Dataset]. http://doi.org/10.5061/dryad.tn8qs35
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.tn8qs35
Dataset updated
Oct 30, 2018
Dataset provided by
James Cook University
University of New England
Authors
Donald T. McKnight; Roger Huerlimann; Deborah S. Bower; Lin Schwarzkopf; Ross A. Alford; Kyall R. Zenger
License
https://spdx.org/licenses/CC0-1.0.html
Description
1. Microbiome sequencing data often need to be normalized due to differences in read depths, and recommendations for microbiome analyses generally warn against using proportions or rarefying to normalize data and instead advocate alternatives, such as upper quartile, CSS, edgeR-TMM, or DESeq-VS. Those recommendations are, however, based on studies that focused on differential abundance testing and variance standardization, rather than community-level comparisons (i.e., beta diversity). Also, standardizing the within-sample variance across samples may suppress differences in species evenness, potentially distorting community-level patterns. Furthermore, the recommended methods use log transformations, which we expect to exaggerate the importance of differences among rare OTUs, while suppressing the importance of differences among common OTUs. 2. We tested these theoretical predictions via simulations and a real-world data set. 3. Proportions and rarefying produced more accurate comparisons among communities and were the only methods that fully normalized read depths across samples. Additionally, upper quartile, CSS, edgeR-TMM, and DESeq-VS often masked differences among communities when common OTUs differed, and they produced false positives when rare OTUs differed. 4. Based on our simulations, normalizing via proportions may be superior to other commonly used methods for comparing ecological communities.
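As a rough illustration of the two methods this abstract favours, the base-R sketch below converts a small OTU count table to proportions and, alternatively, rarefies each sample to a common depth. The count matrix, the object names, the helper `rarefy_sample` and the seed are all assumptions made up for illustration; none of them come from the dataset.

```r
# Hypothetical OTU table: rows are samples, columns are OTUs (made-up counts)
otu <- matrix(c(50, 30, 20,
                500, 300, 200,
                10, 80, 10),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("sample", 1:3), paste0("OTU", 1:3)))

# Proportions: divide each count by its sample's total read depth
props <- otu / rowSums(otu)

# Rarefying: subsample every sample, without replacement, to the smallest depth
rarefy_sample <- function(counts, depth) {
  reads <- rep(seq_along(counts), times = counts)  # one entry per read
  kept  <- sample(reads, size = depth)             # random subsample of reads
  tabulate(kept, nbins = length(counts))           # back to per-OTU counts
}

set.seed(1)
depth    <- min(rowSums(otu))
rarefied <- t(apply(otu, 1, rarefy_sample, depth = depth))
colnames(rarefied) <- colnames(otu)

print(round(props, 2))
print(rowSums(rarefied))  # every sample now has the same read depth
```

Samples 1 and 2 differ tenfold in read depth but have the same composition, so their proportions are identical after normalization.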
Additional file 3: of DBNorm: normalizing high-density oligonucleotide...
springernature.figshare.com
txt
Updated May 31, 2023
Cite
Qinxue Meng; Daniel Catchpoole; David Skillicorn; Paul Kennedy (2023). Additional file 3: of DBNorm: normalizing high-density oligonucleotide microarray data based on distributions [Dataset]. http://doi.org/10.6084/m9.figshare.5648932.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.5648932.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Qinxue Meng; Daniel Catchpoole; David Skillicorn; Paul Kennedy
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
DBNorm test script. Code of how we test DBNorm package. (TXT 2 kb)
Additional file 4: of DBNorm: normalizing high-density oligonucleotide...
springernature.figshare.com
txt
Updated May 30, 2023
Cite
Qinxue Meng; Daniel Catchpoole; David Skillicorn; Paul Kennedy (2023). Additional file 4: of DBNorm: normalizing high-density oligonucleotide microarray data based on distributions [Dataset]. http://doi.org/10.6084/m9.figshare.5648956.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.5648956.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Qinxue Meng; Daniel Catchpoole; David Skillicorn; Paul Kennedy
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
DBNorm installation. Describes how to install DBNorm via devtools in R. (TXT 4 kb)
Dataset supporting: Normalizing and denoising protein expression data from...
nih.figshare.com
figshare.com
zip
Updated May 30, 2023
Cite
Matthew P. Mulé; Andrew J. Martins; John Tsang (2023). Dataset supporting: Normalizing and denoising protein expression data from droplet-based single cell profiling [Dataset]. http://doi.org/10.35092/yhjc.13370915.v2
Available download formats: zip
Unique identifier
https://doi.org/10.35092/yhjc.13370915.v2
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Matthew P. Mulé; Andrew J. Martins; John Tsang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data for reproducing analysis in the manuscript: Normalizing and denoising protein expression data from droplet-based single cell profiling. Link to manuscript: https://www.biorxiv.org/content/10.1101/2020.02.24.963603v1

Data deposited here are for the purposes of reproducing the analysis results and figures reported in the manuscript above. These data are all publicly available and were downloaded and converted to R datasets prior to Dec 4, 2020. For a full description of all the data included in this repository and instructions for reproducing all analysis results and figures, please see the repository: https://github.com/niaid/dsb_manuscript.

For usage of the dsb R package for normalizing CITE-seq data please see the repository: https://github.com/niaid/dsb

If you use the dsb R package in your work please cite: Mulè MP, Martins AJ, Tsang JS. Normalizing and denoising protein expression data from droplet-based single cell profiling. bioRxiv. 2020;2020.02.24.963603.

General contact: John Tsang (john.tsang AT nih.gov)

Questions about software/code: Matt Mulè (mulemp AT nih.gov)
Single cell RNA-seq data of human hESCs to evaluate SCnorm: robust...
data.niaid.nih.gov
Updated May 15, 2019
Cite
Bacher R; Chu L; Kendziorski C; Swanson S (2019). Single cell RNA-seq data of human hESCs to evaluate SCnorm: robust normalization of single-cell rna-seq data [Dataset]. https://data.niaid.nih.gov/resources?id=gse85917
Dataset updated
May 15, 2019
Dataset provided by
University of Florida
Authors
Bacher R; Chu L; Kendziorski C; Swanson S
Description
Normalization of RNA-sequencing data is essential for accurate downstream inference, but the assumptions upon which most methods are based do not hold in the single-cell setting. Consequently, applying existing normalization methods to single-cell RNA-seq data introduces artifacts that bias downstream analyses. To address this, we introduce SCnorm for accurate and efficient normalization of scRNA-seq data. A total of 183 single cells (92 H1 cells, 91 H9 cells), sequenced twice, were used to evaluate SCnorm in normalizing single cell RNA-seq experiments. A total of 48 bulk H1 samples were used to compare bulk and single cell properties. For single-cell RNA-seq, the identical single-cell indexed and fragmented cDNA were pooled at 96 cells per lane or at 24 cells per lane to test the effects of sequencing depth, resulting in approximately 1 million and 4 million mapped reads per cell in the two pooling groups, respectively.
R script to reproduce "Improved normalization of species count data in...
search.dataone.org
Updated Mar 21, 2025
Cite
BonaRes Repository (2025). R script to reproduce "Improved normalization of species count data in ecology by scaling with ranked subsampling (SRS): application to microbial communities" [Dataset]. https://search.dataone.org/view/sha256%3Aa934b23425b0e7e7d9d4278f89745fc842e75fdfe8b47de25c797034dadc1f51
Dataset updated
Mar 21, 2025
Dataset provided by
BonaRes Repository
Description
R script to reproduce "Improved normalization of species count data in ecology by scaling with ranked subsampling (SRS): application to microbial communities".
Mitoplate S-1 analysis using R
data.mendeley.com
Updated Mar 5, 2020
Cite
Flavia Radogna (2020). Mitoplate S-1 analysis using R [Dataset]. http://doi.org/10.17632/b9mprfdvmv.1
Unique identifier
https://doi.org/10.17632/b9mprfdvmv.1
Dataset updated
Mar 5, 2020
Authors
Flavia Radogna
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This R script performs normalisation of data obtained with the MitoPlate S-1 commercialised by Biolog. In addition, it creates a scatterplot of initial rate values between conditions of interest. The script includes a first normalisation step using the "No substrate" well (A1) required for the rows A to H and a second normalisation step using the "L-Malic Acid 100 µM" (G1) only required for the rows G and H. Initial rate values are calculated as the slope of a linear regression fitted between 30 minutes and 2 hours.
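The initial-rate calculation described here can be sketched in base R as follows. The time points and signal values are made up, and the object names are hypothetical; the two normalisation steps are not shown because the description does not specify their arithmetic.

```r
# Hypothetical kinetic read-out for one well: time in minutes, signal in arbitrary units
time_min <- seq(0, 180, by = 15)
signal   <- 0.05 + 0.002 * time_min   # made-up, perfectly linear trace

# Keep only the window between 30 minutes and 2 hours
in_window <- time_min >= 30 & time_min <= 120

# Initial rate = slope of a linear regression fitted within that window
fit          <- lm(signal[in_window] ~ time_min[in_window])
initial_rate <- unname(coef(fit)[2])

print(initial_rate)
```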
Table_1_Comparison of Normalization Methods for Analysis of TempO-Seq...
figshare.com
frontiersin.figshare.com
xlsx
Updated Jun 3, 2023
Cite
Pierre R. Bushel; Stephen S. Ferguson; Sreenivasa C. Ramaiahgari; Richard S. Paules; Scott S. Auerbach (2023). Table_1_Comparison of Normalization Methods for Analysis of TempO-Seq Targeted RNA Sequencing Data.XLSX [Dataset]. http://doi.org/10.3389/fgene.2020.00594.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.3389/fgene.2020.00594.s001
Dataset updated
Jun 3, 2023
Dataset provided by
Frontiers
Authors
Pierre R. Bushel; Stephen S. Ferguson; Sreenivasa C. Ramaiahgari; Richard S. Paules; Scott S. Auerbach
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of bulk RNA sequencing (RNA-Seq) data is a valuable tool to understand transcription at the genome scale. Targeted sequencing of RNA has emerged as a practical means of assessing the majority of the transcriptomic space with less reliance on large resources for consumables and bioinformatics. TempO-Seq is a templated, multiplexed RNA-Seq platform that interrogates a panel of sentinel genes representative of genome-wide transcription. Nuances of the technology require proper preprocessing of the data. Various methods have been proposed and compared for normalizing bulk RNA-Seq data, but there has been little to no investigation of how the methods perform on TempO-Seq data. We simulated count data into two groups (treated vs. untreated) at seven-fold change (FC) levels (including no change) using control samples from human HepaRG cells run on TempO-Seq and normalized the data using seven normalization methods. Upper Quartile (UQ) performed the best with regard to maintaining FC levels as detected by a limma contrast between treated vs. untreated groups. For all FC levels, specificity of the UQ normalization was greater than 0.84 and sensitivity greater than 0.90 except for the no change and +1.5 levels. Furthermore, K-means clustering of the simulated genes normalized by UQ agreed the most with the FC assignments [adjusted Rand index (ARI) = 0.67]. Despite having an assumption of the majority of genes being unchanged, the DESeq2 scaling factors normalization method performed reasonably well as did simple normalization procedures counts per million (CPM) and total counts (TCs). These results suggest that for two class comparisons of TempO-Seq data, UQ, CPM, TC, or DESeq2 normalization should provide reasonably reliable results at absolute FC levels ≥2.0. These findings will help guide researchers to normalize TempO-Seq gene expression data for more reliable results.
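For readers unfamiliar with two of the methods named in this abstract, the base-R sketch below shows counts per million (CPM) and one common variant of upper-quartile (UQ) scaling on a made-up count matrix. The data, the object names and the choice to take the 75th percentile over non-zero counts are assumptions for illustration; the paper's own implementation may differ.

```r
# Hypothetical count matrix: rows are genes, columns are samples (made-up values)
counts <- matrix(c(10, 20,
                   0,  0,
                   30, 60,
                   60, 120),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), c("s1", "s2")))

# Counts per million: scale each sample by its total count
lib_size <- colSums(counts)
cpm <- sweep(counts, 2, lib_size, "/") * 1e6

# Upper-quartile scaling: divide each sample by the 75th percentile
# of its non-zero counts (one common variant of UQ normalization)
uq <- apply(counts, 2, function(x) quantile(x[x > 0], 0.75))
uq_norm <- sweep(counts, 2, uq, "/")

print(cpm)
print(uq_norm)
```

Sample s2 is simply s1 sequenced twice as deeply, so both methods return identical columns for the two samples.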
Species level size-normalised weight data for at depth analysis
doi.pangaea.de
html, tsv
Updated Jan 13, 2025
Cite
Ruby Barrett (2025). Species level size-normalised weight data for at depth analysis [Dataset]. http://doi.org/10.1594/PANGAEA.973594
Available download formats: tsv, html
Unique identifier
https://doi.org/10.1594/PANGAEA.973594
Dataset updated
Jan 13, 2025
Dataset provided by
PANGAEA
Authors
Ruby Barrett
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Basin, Ecogroup, LATITUDE, Salinity, Author(s), Data type, ELEVATION, LONGITUDE, Phosphate, Sample ID, and 9 more
Description
This dataset contains a compilation of published and new SNW data with corresponding environmental data extracted from CMIP6 that are used in the at depth species level Bayesian regression modelling. Environmental data for G. truncatulinoides comes from 200m depth, all other environmental data is from the sea surface (≤ 20 m).
Naturalistic Neuroimaging Database
openneuro.org
Updated Apr 20, 2021
Cite
Sarah Aliko; Jiawen Huang; Florin Gheorghiu; Stefanie Meliss; Jeremy I Skipper (2021). Naturalistic Neuroimaging Database [Dataset]. http://doi.org/10.18112/openneuro.ds002837.v2.0.0
Unique identifier
https://doi.org/10.18112/openneuro.ds002837.v2.0.0
Dataset updated
Apr 20, 2021
Dataset provided by
OpenNeuro (https://openneuro.org/)
Authors
Sarah Aliko; Jiawen Huang; Florin Gheorghiu; Stefanie Meliss; Jeremy I Skipper
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Overview

The Naturalistic Neuroimaging Database (NNDb v2.0) contains datasets from 86 human participants doing the NIH Toolbox and then watching one of 10 full-length movies during functional magnetic resonance imaging (fMRI). The participants were all right-handed, native English speakers, with no history of neurological/psychiatric illnesses, with no hearing impairments, unimpaired or corrected vision and taking no medication. Each movie was stopped in 40-50 minute intervals or when participants asked for a break, resulting in 2-6 runs of BOLD-fMRI. A 10 minute high-resolution defaced T1-weighted anatomical MRI scan (MPRAGE) is also provided.

The NNDb V2.0 is now on Neuroscout, a platform for fast and flexible re-analysis of (naturalistic) fMRI studies. See: https://neuroscout.org/

v2.0 Changes

Overview

We have replaced our own preprocessing pipeline with that implemented in AFNI’s afni_proc.py, thus changing only the derivative files. This introduces a fix for an issue with our normalization (i.e., scaling) step and modernizes and standardizes the preprocessing applied to the NNDb derivative files. We have done a bit of testing and have found that results in both pipelines are quite similar in terms of the resulting spatial patterns of activity but with the benefit that the afni_proc.py results are 'cleaner' and statistically more robust.

Normalization

Emily Finn and Clare Grall at Dartmouth and Rick Reynolds and Paul Taylor at AFNI, discovered and showed us that the normalization procedure we used for the derivative files was less than ideal for timeseries runs of varying lengths. Specifically, the 3dDetrend flag -normalize makes 'the sum-of-squares equal to 1'. We had not thought through that an implication of this is that the resulting normalized timeseries amplitudes will be affected by run length, increasing as run length decreases (and maybe this should go in 3dDetrend’s help text). To demonstrate this, I wrote a version of 3dDetrend’s -normalize for R so you can see for yourselves by running the following code:

# Generate a resting state (rs) timeseries (ts)
# Install / load package to make fake fMRI ts
# install.packages("neuRosim")
library(neuRosim)
# Generate a ts
ts.rs <- simTSrestingstate(nscan=2000, TR=1, SNR=1)
# 3dDetrend -normalize
# R command version for 3dDetrend -normalize -polort 0 which normalizes by making "the sum-of-squares equal to 1"
# Do for the full timeseries
ts.normalised.long <- (ts.rs-mean(ts.rs))/sqrt(sum((ts.rs-mean(ts.rs))^2));
# Do this again for a shorter version of the same timeseries
ts.shorter.length <- length(ts.normalised.long)/4
ts.normalised.short <- (ts.rs[1:ts.shorter.length]- mean(ts.rs[1:ts.shorter.length]))/sqrt(sum((ts.rs[1:ts.shorter.length]- mean(ts.rs[1:ts.shorter.length]))^2));
# By looking at the summaries, it can be seen that the median values become larger
summary(ts.normalised.long)
summary(ts.normalised.short)
# Plot results for the long and short ts
# Truncate the longer ts for plotting only
ts.normalised.long.made.shorter <- ts.normalised.long[1:ts.shorter.length]
# Give the plot a title
title <- "3dDetrend -normalize for long (blue) and short (red) timeseries";
plot(x=0, y=0, main=title, xlab="", ylab="", xaxs='i', xlim=c(1,length(ts.normalised.short)), ylim=c(min(ts.normalised.short),max(ts.normalised.short)));
# Add zero line
lines(x=c(-1,ts.shorter.length), y=rep(0,2), col='grey');
# 3dDetrend -normalize -polort 0 for long timeseries
lines(ts.normalised.long.made.shorter, col='blue');
# 3dDetrend -normalize -polort 0 for short timeseries
lines(ts.normalised.short, col='red');
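The run-length dependence follows directly from the constraint itself: if a demeaned series of N points has a sum of squares of 1, its standard deviation is exactly 1/sqrt(N - 1), so a quarter-length run has roughly twice the amplitude. The base-R check below is a minimal sketch that uses white noise as a stand-in for an fMRI timeseries and needs no extra packages; the helper name `sos_normalise` and the seed are hypothetical.

```r
# Demean and scale so that the sum of squares equals 1
# (an R analogue of 3dDetrend -normalize -polort 0, as described above)
sos_normalise <- function(x) {
  centred <- x - mean(x)
  centred / sqrt(sum(centred^2))
}

set.seed(42)
ts_full  <- rnorm(2000)        # white noise standing in for a long run
ts_short <- ts_full[1:500]     # the first quarter of the same run

norm_full  <- sos_normalise(ts_full)
norm_short <- sos_normalise(ts_short)

# The standard deviation depends only on the number of time points
print(c(sd(norm_full),  1 / sqrt(2000 - 1)))
print(c(sd(norm_short), 1 / sqrt(500 - 1)))
print(sd(norm_short) / sd(norm_full))   # close to 2
```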

Standardization/modernization

The above individuals also encouraged us to implement the afni_proc.py script over our own pipeline. It introduces at least three additional improvements: First, we now use Bob’s @SSwarper to align our anatomical files with an MNI template (now MNI152_2009_template_SSW.nii.gz) and this, in turn, integrates nicely into the afni_proc.py pipeline. This seems to result in a generally better or more consistent alignment, though this is only a qualitative observation. Second, all the transformations / interpolations and detrending are now done in fewer steps compared to our pipeline. This is preferable because, e.g., there is less chance of inadvertently reintroducing noise back into the timeseries (see Lindquist, Geuter, Wager, & Caffo 2019). Finally, many groups are advocating using tools like fMRIPrep or afni_proc.py to increase standardization of analysis practices in our neuroimaging community. This presumably results in less error, less heterogeneity and more interpretability of results across studies. Along these lines, the quality control (‘QC’) html pages generated by afni_proc.py are a real help in assessing data quality and almost a joy to use.

New afni_proc.py command line

The following is the afni_proc.py command line that we used to generate blurred and censored timeseries files. The afni_proc.py tool comes with extensive help and examples. As such, you can quickly understand our preprocessing decisions by scrutinising the below. Specifically, the following command is most similar to Example 11 for ‘Resting state analysis’ in the help file (see https://afni.nimh.nih.gov/pub/dist/doc/program_help/afni_proc.py.html):

afni_proc.py \
    -subj_id "$sub_id_name_1" \
    -blocks despike tshift align tlrc volreg mask blur scale regress \
    -radial_correlate_blocks tcat volreg \
    -copy_anat anatomical_warped/anatSS.1.nii.gz \
    -anat_has_skull no \
    -anat_follower anat_w_skull anat anatomical_warped/anatU.1.nii.gz \
    -anat_follower_ROI aaseg anat freesurfer/SUMA/aparc.a2009s+aseg.nii.gz \
    -anat_follower_ROI aeseg epi freesurfer/SUMA/aparc.a2009s+aseg.nii.gz \
    -anat_follower_ROI fsvent epi freesurfer/SUMA/fs_ap_latvent.nii.gz \
    -anat_follower_ROI fswm epi freesurfer/SUMA/fs_ap_wm.nii.gz \
    -anat_follower_ROI fsgm epi freesurfer/SUMA/fs_ap_gm.nii.gz \
    -anat_follower_erode fsvent fswm \
    -dsets media_?.nii.gz \
    -tcat_remove_first_trs 8 \
    -tshift_opts_ts -tpattern alt+z2 \
    -align_opts_aea -cost lpc+ZZ -giant_move -check_flip \
    -tlrc_base "$basedset" \
    -tlrc_NL_warp \
    -tlrc_NL_warped_dsets \
        anatomical_warped/anatQQ.1.nii.gz \
        anatomical_warped/anatQQ.1.aff12.1D \
        anatomical_warped/anatQQ.1_WARP.nii.gz \
    -volreg_align_to MIN_OUTLIER \
    -volreg_post_vr_allin yes \
    -volreg_pvra_base_index MIN_OUTLIER \
    -volreg_align_e2a \
    -volreg_tlrc_warp \
    -mask_opts_automask -clfrac 0.10 \
    -mask_epi_anat yes \
    -blur_to_fwhm -blur_size $blur \
    -regress_motion_per_run \
    -regress_ROI_PC fsvent 3 \
    -regress_ROI_PC_per_run fsvent \
    -regress_make_corr_vols aeseg fsvent \
    -regress_anaticor_fast \
    -regress_anaticor_label fswm \
    -regress_censor_motion 0.3 \
    -regress_censor_outliers 0.1 \
    -regress_apply_mot_types demean deriv \
    -regress_est_blur_epits \
    -regress_est_blur_errts \
    -regress_run_clustsim no \
    -regress_polort 2 \
    -regress_bandpass 0.01 1 \
    -html_review_style pythonic

We used similar command lines to generate ‘blurred and not censored’ and the ‘not blurred and not censored’ timeseries files (described more fully below). We will provide the code used to make all derivative files available on our github site (https://github.com/lab-lab/nndb).

We made one choice above that is different enough from our original pipeline that it is worth mentioning here. Specifically, we have quite long runs, with the average being ~40 minutes but this number can be variable (thus leading to the above issue with 3dDetrend’s -normalize). A discussion on the AFNI message board with one of our team (starting here, https://afni.nimh.nih.gov/afni/community/board/read.php?1,165243,165256#msg-165256), led to the suggestion that '-regress_polort 2' with '-regress_bandpass 0.01 1' be used for long runs. We had previously used only a variable polort with the suggested 1 + int(D/150) approach. Our new polort 2 + bandpass approach has the added benefit of working well with afni_proc.py.
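To make the earlier variable-polort rule concrete, here is a small sketch of 1 + int(D/150) in R, taking D as the run duration in seconds. The function name `polort_for_duration` is hypothetical and is not part of AFNI or of the authors' pipeline.

```r
# Polynomial order from run duration D (in seconds): 1 + int(D / 150)
polort_for_duration <- function(duration_s) {
  1 + floor(duration_s / 150)
}

polort_for_duration(300)       # a 5-minute run
polort_for_duration(40 * 60)   # a ~40-minute run, as in this dataset
```

Under this rule a 40-minute run would get order 17, compared with the fixed order 2 used in the new pipeline.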

Which timeseries file you use is up to you but I have been encouraged by Rick and Paul to include a sort of PSA about this. In Paul’s own words:

* Blurred data should not be used for ROI-based analyses (and potentially not for ICA? I am not certain about standard practice).
* Unblurred data for ISC might be pretty noisy for voxelwise analyses, since blurring should effectively boost the SNR of active regions (and even good alignment won't be perfect everywhere).
* For uncensored data, one should be concerned about motion effects being left in the data (e.g., spikes in the data).
* For censored data:
  * Performing ISC requires the users to unionize the censoring patterns during the correlation calculation.
  * If wanting to calculate power spectra or spectral parameters like ALFF/fALFF/RSFA etc. (which some people might do for naturalistic tasks still), then standard FT-based methods can't be used because sampling is no longer uniform. Instead, people could use something like 3dLombScargle+3dAmpToRSFC, which calculates power spectra (and RSFC params) based on a generalization of the FT that can handle non-uniform sampling, as long as the censoring pattern is mostly random and, say, only up to about 10-15% of the data.

In sum, think very carefully about which files you use. If you find you need a file we have not provided, we can happily generate different versions of the timeseries upon request and can generally do so in a week or less.

Effect on results

From numerous tests on our own analyses, we have qualitatively found that results using our old vs the new afni_proc.py preprocessing pipeline do not change all that much in terms of general spatial patterns. There is, however, an
New size-normalised weight (SNW) data
doi.pangaea.de
html, tsv
Updated Jan 13, 2025
Cite
Ruby Barrett (2025). New size-normalised weight (SNW) data [Dataset]. http://doi.org/10.1594/PANGAEA.973571
Available download formats: tsv, html
Unique identifier
https://doi.org/10.1594/PANGAEA.973571
Dataset updated
Jan 13, 2025
Dataset provided by
PANGAEA
Authors
Ruby Barrett
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Sep 1, 1984 - Oct 19, 1999
Variables measured
LATITUDE, Data type, ELEVATION, LONGITUDE, Sample ID, Event label, Size fraction, Sieve-based weight, Number of specimens, DEPTH, sediment/rock, and 5 more
Description
This table includes the new SNW data produced for this manuscript. The foraminiferal weight data is normalized using the measurement-based weight (MBW) method of Barker (2002). SNW measurements were collected from Atlantic core-tops and sediment cores for G. truncatulinoides, G. ruber, O. universa, N. pachyderma, N. incompta and G. bulloides.
(high-temp) No 8. Metadata Analysis (16S rRNA/ITS) Output
search.dataone.org
Updated Aug 15, 2024
Cite
Jarrod Scott (2024). (high-temp) No 8. Metadata Analysis (16S rRNA/ITS) Output [Dataset]. https://search.dataone.org/view/urn%3Auuid%3A718e0794-b5ff-4919-95ef-4a90a7890a5b
Dataset updated
Aug 15, 2024
Dataset provided by
Smithsonian Research Data Repository
Authors
Jarrod Scott
Description
Output files from the 8. Metadata Analysis Workflow page of the SWELTR high-temp study. In this workflow, we compared environmental metadata with microbial communities. The workflow is split into two parts.

metadata_ssu18_wf.rdata : Part 1 contains all variables and objects for the 16S rRNA analysis. To see the Objects, in R run _load("metadata_ssu18_wf.rdata", verbose=TRUE)_

metadata_its18_wf.rdata : Part 2 contains all variables and objects for the ITS analysis. To see the Objects, in R run _load("metadata_its18_wf.rdata", verbose=TRUE)_
Additional files:

In both workflows, we run the following steps:

1) Metadata Normality Tests: Shapiro-Wilk Normality Test to test whether each metadata parameter is normally distributed.
2) Normalize Parameters: R package bestNormalize to find and execute the best normalizing transformation.
3) Split Metadata parameters into groups: a) Environmental and edaphic properties, b) Microbial functional responses, and c) Temperature adaptation properties.
4) Autocorrelation Tests: Test all possible pair-wise comparisons, on both normalized and non-normalized data sets, for each group.
5) Remove autocorrelated parameters from each group.
6) Dissimilarity Correlation Tests: Use Mantel Tests to see if any of the metadata groups are significantly correlated with the community data.
7) Best Subset of Variables: Determine which of the metadata parameters from each group are the most strongly correlated with the community data. For this we use the bioenv function from the vegan package.
8) Distance-based Redundancy Analysis: Ordination analysis of samples and metadata vector overlays using capscale, also from the vegan package.
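
The first steps of this workflow can be sketched in base R as shown below. This is only an illustration under stated assumptions: the metadata table is simulated, a plain log transform stands in for bestNormalize, the 0.7 correlation cutoff is hypothetical, and the Mantel, bioenv and capscale steps are omitted because they depend on the vegan package and on the community data.

```r
set.seed(7)
# Hypothetical metadata table: 30 samples, three made-up parameters
meta <- data.frame(
  pH       = rnorm(30, mean = 6.5, sd = 0.3),
  moisture = rlnorm(30, meanlog = 3, sdlog = 0.8)
)
meta$moisture_dup <- meta$moisture * 2 + rnorm(30, sd = 0.01)  # near-duplicate

# Step 1: Shapiro-Wilk p-value for each parameter
shapiro_p <- sapply(meta, function(x) shapiro.test(x)$p.value)

# Step 2 (stand-in): log-transform parameters that fail the test.
# The workflow uses bestNormalize; a plain log is an assumption here.
skewed <- names(shapiro_p)[shapiro_p < 0.05]
meta_norm <- meta
meta_norm[skewed] <- lapply(meta[skewed], log)

# Steps 4-5: flag highly correlated pairs and drop the second member of each
cor_mat <- abs(cor(meta_norm))
cor_mat[lower.tri(cor_mat, diag = TRUE)] <- 0
to_drop <- colnames(cor_mat)[apply(cor_mat > 0.7, 2, any)]   # 0.7 is a hypothetical cutoff
meta_reduced <- meta_norm[, setdiff(names(meta_norm), to_drop), drop = FALSE]

print(round(shapiro_p, 3))
print(names(meta_reduced))
```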

Source code for the workflow can be found here:
https://github.com/sweltr/high-temp/blob/master/metadata.Rmd
Microbial counts, Picophytoplankton from the R/V Melville IronEx II cruise...
datacart.bco-dmo.org
bco-dmo.org
csv
Updated Mar 10, 2011
Cite
Ken Johnson; Kenneth H. Coale (2011). Microbial counts, Picophytoplankton from the R/V Melville IronEx II cruise in the Equatorial Pacific Ocean in 1995 (IronEx II project) [Dataset]. https://datacart.bco-dmo.org/dataset/3446
Available download formats: csv (17.59 KB)
Dataset updated
Mar 10, 2011
Dataset provided by
Biological and Chemical Data Management Office
Authors
Ken Johnson; Kenneth H. Coale
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
lat, lon, cast, date, time, Patch, depth, redFL, yrday, cruise, and 53 more
Measurement technique
CTD - profiler
Description
Microbial Counts - Picophytoplankton

Data were normalized with the following values:

# values used for normalizing from "out" by group
# group   fals(rel)  redFL(rel)  FL/fals ratio
# group1  0.09       0.62        7.19
# group2  0.92       0.61        6.84
#
Data from: Quantitative proteomics reveals rapid divergence in the...
datadryad.org
data.niaid.nih.gov
zip
Updated Jun 24, 2020
Cite
Erin McCullough; Caitlin McDonough; Scott Pitnick; Steve Dorus (2020). Quantitative proteomics reveals rapid divergence in the postmating response of female reproductive tracts among sibling species [Dataset]. http://doi.org/10.5061/dryad.8cz8w9gm8
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.8cz8w9gm8
Dataset updated
Jun 24, 2020
Dataset provided by
Dryad
Authors
Erin McCullough; Caitlin McDonough; Scott Pitnick; Steve Dorus
Time period covered
2020
Description
Fertility depends, in part, on interactions between male and female reproductive proteins inside the female reproductive tract (FRT) that mediate postmating changes in female behavior, morphology, and physiology. Coevolution between interacting proteins within species may drive reproductive incompatibilities between species, yet the mechanisms underlying postmating-prezygotic isolating barriers remain poorly resolved. Here, we used quantitative proteomics in sibling Drosophila species to investigate the molecular composition of the FRT environment and its role in mediating species-specific postmating responses. We found that (1) FRT proteomes in D. simulans and D. mauritiana virgin females express unique combinations of secreted proteins and are enriched for distinct functional categories, (2) mating induces substantial changes to the FRT proteome in D. mauritiana but not in D. simulans, and (3) the D. simulans FRT pr...
Sila National Park - 3D Point cloud data
data.niaid.nih.gov
Updated Feb 2, 2020
Cite
Puletti, Nicola (2020). Sila National Park - 3D Point cloud data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3633628
Dataset updated
Feb 2, 2020
Dataset authored and provided by
Puletti, Nicola
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains 3 types of data.

GPS data (the ones starting with "GPS") of sampling plot centers collected with a Trimble GPS and post processed to ensure positioning errors lower than 2 meters.

TLS data (the ones starting with "ID_"): such data were collected at the end of August 2019 with a mobile terrestrial laser scanner (mobile ZEB TLS) in a square area of approximately 30x30m. Data have been normalized using TreeLS package in R.

ALS data collected in the end of July 2019. For the entire study area, we upload 2 different ALS data: "merged.las" is the original point cloud; "myLas_norm_lt22.las" is the normalised point cloud, cut at 22 meters from the ground in order to perform specific analysis (i.e. paper under submission).

Data collection was funded by the AGRIDIGIT Selvicoltura project.
Microbial counts, Eukaryotes from the R/V Melville IronEx II cruise in the...
search.dataone.org
bco-dmo.org
Updated Dec 5, 2021
Cite
Kenneth H. Coale; Ken Johnson; Evelyn Armstrong (2021). Microbial counts, Eukaryotes from the R/V Melville IronEx II cruise in the Equatorial Pacific Ocean in 1995 (IronEx II project) [Dataset]. https://search.dataone.org/view/sha256%3A0a421420bf7715f2ca68243b90eae214c895d22c62e5bd9240695e49ce3dbf0e
Dataset updated
Dec 5, 2021
Dataset provided by
Biological and Chemical Oceanography Data Management Office (BCO-DMO)
Authors
Kenneth H. Coale; Ken Johnson; Evelyn Armstrong
Description
Microbial counts - Eukaryote

Data were normalized with the following values:

# values used for normalizing from "out" by group
#
# group    fals(rel)   redFL(rel)   FL/fals ratio
# group1   0.45        0.64         1.58
# group2   2.55        9.01         3.57
# group3   0.33        7.94         27.74
# group4   nd          nd           nd
#
Group level size-normalised weight data
doi.pangaea.de
html, tsv
Updated Jan 13, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ruby Barrett (2025). Group level size-normalised weight data [Dataset]. http://doi.org/10.1594/PANGAEA.973592
Available download formats: html, tsv
Unique identifier
https://doi.org/10.1594/PANGAEA.973592
Dataset updated
Jan 13, 2025
Dataset provided by
PANGAEA
Authors
Ruby Barrett
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Basin, Ecogroup, LATITUDE, Salinity, Author(s), Data type, ELEVATION, LONGITUDE, Phosphate, Sample ID, and 9 more
Description
This dataset contains a compilation of published and new SNW data with corresponding sea surface (≤ 20 m) environmental data extracted from CMIP6 that are used in the group level Bayesian regression modelling.
Example subjects for Mobilise-D data standardization
data.niaid.nih.gov
Updated Oct 11, 2022
Chiari, Lorenzo (2022). Example subjects for Mobilise-D data standardization [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7185428
Dataset updated
Oct 11, 2022
Dataset provided by
Bertuletti, Stefano
Micó-Amigo, Encarna
Mazzà, Claudia
Gazit, Eran
Rochester, Lynn
Paraschiv-Ionescu, Anisoara
Salis, Francesca
Hansen, Clint
Ullrich, Martin
Palmerini, Luca
Bonci, Tecla
Cereatti, Andrea
Caruso, Marco
Hiden, Hugo
Chiari, Lorenzo
Küderle, Arne
on behalf of the Mobilise-D consortium
D'Ascanio, Ilaria
Reggi, Luca
Soltani, Abolfazl
Kluge, Felix
Kirk, Cameron
Del Din, Silvia
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Standardized data from Mobilise-D participants (YAR dataset) and pre-existing datasets (ICICLE, MSIPC2, Gait in Lab and real-life settings, MS project, UNISS-UNIGE) are provided in the shared folder as an example of the procedures proposed in the publication "Mobility recorded by wearable devices and gold standards: the Mobilise-D procedure for data standardization", which is currently under review in Scientific Data. Please refer to that publication for further information, and cite it if you use these data.

The code to standardize an example subject (for the ICICLE dataset) and to open the standardized Matlab files in other languages (Python, R) is available on GitHub (https://github.com/luca-palmerini/Procedure-wearable-data-standardization-Mobilise-D).
Data from: Ethanol's Energy Return on Investment: A Survey of the Literature...
acs.figshare.com
xls
Updated May 30, 2023
Roel Hammerschlag (2023). Ethanol's Energy Return on Investment: A Survey of the Literature 1990−Present [Dataset]. http://doi.org/10.1021/es052024h.s002
Available download formats: xls
Unique identifier
https://doi.org/10.1021/es052024h.s002
Dataset updated
May 30, 2023
Dataset provided by
ACS Publications
Authors
Roel Hammerschlag
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Various authors have reported conflicting values for the energy return on investment (rE) of ethanol manufacture. Energy policy analysts predisposed to or against ethanol frequently cite selections from these studies to support their positions. This literature review takes an objective look at the disagreement by normalizing and comparing the data sets from ten such studies. Six of the reviewed studies treat starch ethanol from corn, and four treat cellulosic ethanol. Each normalized data set is also submitted to a uniform calculation of rE defined as the total product energy divided by nonrenewable energy input to its manufacture. Defined this way rE > 1 indicates that the ethanol product has nominally captured at least some renewable energy, and rE > 0.76 indicates that it consumes less nonrenewable energy in its manufacture than gasoline. The reviewed corn ethanol studies imply 0.84 ≤ rE ≤ 1.65; three of the cellulosic ethanol studies imply 4.40 ≤ rE ≤ 6.61. The fourth cellulosic ethanol study reports rE = 0.69 and may reasonably be considered an outlier.
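
The definition of rE given above (total product energy divided by nonrenewable energy input) and its two thresholds can be written out as a short Python sketch. The function names and the sample inputs are hypothetical, chosen only to illustrate the arithmetic; the thresholds 1 and 0.76 and the rE values 0.84 and 0.69 are the ones quoted in the description. The review's own normalization of the ten studies is not reproduced here.

```python
RENEWABLE_CAPTURE_THRESHOLD = 1.0   # rE > 1: some renewable energy captured
GASOLINE_THRESHOLD = 0.76           # rE > 0.76: less nonrenewable energy than gasoline

def energy_return(product_energy, nonrenewable_input):
    """rE = total product energy / nonrenewable energy input (same units)."""
    if nonrenewable_input <= 0:
        raise ValueError("nonrenewable_input must be positive")
    return product_energy / nonrenewable_input

def interpret(r_e):
    """Apply the two thresholds stated in the description."""
    return {
        "captures_renewable_energy": r_e > RENEWABLE_CAPTURE_THRESHOLD,
        "less_nonrenewable_than_gasoline": r_e > GASOLINE_THRESHOLD,
    }

# Hypothetical inputs (arbitrary energy units) giving rE = 0.84,
# the lower bound reported for the corn ethanol studies.
r_e_corn_low = energy_return(84.0, 100.0)
corn_low = interpret(r_e_corn_low)

# The outlier cellulosic study reports rE = 0.69 directly.
outlier = interpret(0.69)
```

Under these thresholds, rE = 0.84 falls short of capturing net renewable energy but still uses less nonrenewable energy than gasoline, while rE = 0.69 meets neither condition.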