7 datasets found

Intermediate data for TE calculation
zenodo.org
bin, csv
Updated May 9, 2025
Cite
Yue Liu; Yue Liu (2025). Intermediate data for TE calculation [Dataset]. http://doi.org/10.5281/zenodo.10373032
Available download formats: csv, bin
Unique identifier
https://doi.org/10.5281/zenodo.10373032
Dataset updated
May 9, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Yue Liu; Yue Liu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains intermediate data from RiboBase that are used to calculate translation efficiency (TE). The code to generate the files can be found at https://github.com/CenikLab/TE_model.

Demo HeLa .ribo files are included. Because the full dataset requires a large amount of storage, please contact Dr. Can Cenik directly to request access to the complete version of RiboBase if you need the original data.

The detailed explanation for each file:

human_flatten_ribo_clr.rda: ribosome profiling clr normalized data with GEO GSM ids in columns and genes in rows in human.

human_flatten_rna_clr.rda: matched RNA-seq clr normalized data with GEO GSM ids in columns and genes in rows in human.

human_flatten_te_clr.rda: TE clr data with GEO GSM ids in columns and genes in rows in human.

human_TE_cellline_all_plain.csv: TE clr data with genes in rows and cell lines in columns in human.

human_RNA_rho_new.rda: matched RNA-seq proportional similarity data as genes by genes matrix in human.

human_TE_rho.rda: TE proportional similarity data as genes by genes matrix in human.

mouse_flatten_ribo_clr.rda: ribosome profiling clr normalized data with GEO GSM ids in columns and genes in rows in mouse.

mouse_flatten_rna_clr.rda: matched RNA-seq clr normalized data with GEO GSM ids in columns and genes in rows in mouse.

mouse_flatten_te_clr.rda: TE clr data with GEO GSM ids in columns and genes in rows in mouse.

mouse_TE_cellline_all_plain.csv: TE clr data with genes in rows and cell lines in columns in mouse.

mouse_RNA_rho_new.rda: matched RNA-seq proportional similarity data as genes by genes matrix in mouse.

mouse_TE_rho.rda: TE proportional similarity data as genes by genes matrix in mouse.

All data passed quality control. There are 1054 human samples and 835 mouse samples, each meeting the following criteria:
* coverage > 0.1 X
* CDS percentage > 70%
* R^2 between RNA and RIBO >= 0.188 (removes outliers)
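
The three thresholds above can be applied as a simple row filter. The following is a minimal sketch in base R, assuming a per-sample QC table exists; the table, its column names (coverage, cds_pct, r2), the sample labels, and all values are hypothetical and are not taken from the dataset.

```r
# Hypothetical per-sample QC table; names and values are illustrative only.
qc <- data.frame(
  sample   = c("GSM_A", "GSM_B", "GSM_C", "GSM_D"),
  coverage = c(0.25, 0.08, 0.40, 0.15),  # coverage in X
  cds_pct  = c(82, 91, 65, 75),          # percentage of reads in CDS
  r2       = c(0.45, 0.60, 0.30, 0.10),  # R^2 between RNA and RIBO
  stringsAsFactors = FALSE
)

# Keep a sample only if it meets all three thresholds from the list above.
passes_qc <- qc$coverage > 0.1 & qc$cds_pct > 70 & qc$r2 >= 0.188
kept <- qc$sample[passes_qc]
print(kept)
```

In this toy table only the first sample passes, because each of the other three fails exactly one criterion.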

All ribosome profiling data here are non-deduplicated and winsorized, paired with RNA-seq data that are deduplicated and not winsorized (the file names include "flatten" only because they follow the same naming format).

#### Code
To read the .rda files in R, use load("rdaname.rda").

If you need to calculate proportional similarity from clr data:
library(propr)
human_TE_homo_rho <- propr:::lr2rho(as.matrix(clr_data))
rownames(human_TE_homo_rho) <- colnames(human_TE_homo_rho) <- rownames(clr_data)
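
As a rough illustration of the two steps above, the sketch below uses only base R. It makes several assumptions: the matrix is a made-up stand-in rather than real clr data, the object name toy_clr and the helper rho_from_clr are hypothetical, and the pairwise formula rho(x, y) = 1 - var(x - y) / (var(x) + var(y)) is assumed to correspond to the proportional similarity computed in the snippet above. Because load() restores objects under the names they were saved with, loading into a separate environment is one way to find out what a file contains.

```r
# Stand-in matrix with made-up values (not real clr data).
# Samples are in rows and genes in columns here, which is an assumption.
set.seed(1)
toy_clr <- matrix(rnorm(5 * 3), nrow = 5,
                  dimnames = list(paste0("sample", 1:5),
                                  c("geneA", "geneB", "geneC")))

# Round-trip through an .rda file to show how load() behaves.
rda_path <- tempfile(fileext = ".rda")
save(toy_clr, file = rda_path)

env <- new.env()
loaded_names <- load(rda_path, envir = env)  # returns the restored object names
clr_mat <- get(loaded_names[1], envir = env)

# Pairwise rho between columns: 1 - var(x - y) / (var(x) + var(y)).
rho_from_clr <- function(m) {
  v <- apply(m, 2, var)
  p <- ncol(m)
  rho <- matrix(1, p, p, dimnames = list(colnames(m), colnames(m)))
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      rho[i, j] <- 1 - var(m[, i] - m[, j]) / (v[i] + v[j])
    }
  }
  rho
}

rho <- rho_from_clr(clr_mat)
print(round(rho, 3))
```

The helper compares columns, so a matrix stored with genes in rows would need t() first to give a gene-by-gene result.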
Data_Sheet_3_Compositional Data Analysis of Periodontal Disease Microbial...
frontiersin.figshare.com
datasetcatalog.nlm.nih.gov
zip
Updated May 31, 2023
Cite
Laura Sisk-Hackworth; Adrian Ortiz-Velez; Micheal B. Reed; Scott T. Kelley (2023). Data_Sheet_3_Compositional Data Analysis of Periodontal Disease Microbial Communities.ZIP [Dataset]. http://doi.org/10.3389/fmicb.2021.617949.s003
Available download formats: zip
Unique identifier
https://doi.org/10.3389/fmicb.2021.617949.s003
Dataset updated
May 31, 2023
Dataset provided by
Frontiers Media (http://www.frontiersin.org/)
Authors
Laura Sisk-Hackworth; Adrian Ortiz-Velez; Micheal B. Reed; Scott T. Kelley
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Periodontal disease (PD) is a chronic, progressive polymicrobial disease that induces a strong host immune response. Culture-independent methods, such as next-generation sequencing (NGS) of bacterial 16S amplicon and shotgun metagenomic libraries, have greatly expanded our understanding of PD biodiversity, identified novel PD microbial associations, and shown that PD biodiversity increases with pocket depth. NGS studies have also found PD communities to be highly host-specific in terms of both biodiversity and the response of microbial communities to periodontal treatment. As with most microbiome work, the majority of PD microbiome studies use standard data normalization procedures that do not account for the compositional nature of NGS microbiome data. Here, we apply recently developed compositional data analysis (CoDA) approaches and software tools to reanalyze multiomics (16S, metagenomics, and metabolomics) data generated from previously published periodontal disease studies. CoDA methods, such as centered log-ratio (clr) transformation, compensate for the compositional nature of these data, which can not only remove spurious correlations but also allow for the identification of novel associations between microbial features and disease conditions. We validated many of the studies’ original findings, but also identified new features associated with periodontal disease, including the genera Schwartzia and Aerococcus and the cytokine C-reactive protein (CRP). Furthermore, our network analysis revealed a lower connectivity among taxa in deeper periodontal pockets, potentially indicative of a more “random” microbiome. Our findings illustrate the utility of CoDA techniques in multiomics compositional data analysis of the oral microbiome.
DataSheet_1_Optimising high-throughput sequencing data analysis, from gene...
frontiersin.figshare.com
pdf
Updated Mar 4, 2024
Cite
Simin Wang; Dominik Schneider; Tamara R. Hartke; Johannes Ballauff; Carina Carneiro de Melo Moura; Garvin Schulz; Zhipeng Li; Andrea Polle; Rolf Daniel; Oliver Gailing; Bambang Irawan; Stefan Scheu; Valentyna Krashevska (2024). DataSheet_1_Optimising high-throughput sequencing data analysis, from gene database selection to the analysis of compositional data: a case study on tropical soil nematodes.pdf [Dataset]. http://doi.org/10.3389/fevo.2024.1168288.s001
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/fevo.2024.1168288.s001
Dataset updated
Mar 4, 2024
Dataset provided by
Frontiers
Authors
Simin Wang; Dominik Schneider; Tamara R. Hartke; Johannes Ballauff; Carina Carneiro de Melo Moura; Garvin Schulz; Zhipeng Li; Andrea Polle; Rolf Daniel; Oliver Gailing; Bambang Irawan; Stefan Scheu; Valentyna Krashevska
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction: High-throughput sequencing (HTS) provides an efficient and cost-effective way to generate large amounts of sequence data, providing a very powerful tool to analyze biodiversity of soil organisms. However, marker-based methods and the resulting datasets come with a range of challenges and disputes, including incomplete reference databases, controversial sequence similarity thresholds for delimitating taxa, and downstream compositional data analysis.

Methods: Here, we use HTS data from a soil nematode biodiversity experiment to explore standardized HTS data processing procedures. We compared the taxonomic assignment performance of two main rDNA reference databases (SILVA and PR2). We tested whether the same ecological patterns are detected with Amplicon Sequence Variants (ASV; 100% similarity) versus classical Operational Taxonomic Units (OTU; 97% similarity). Further, we tested how different HTS data normalization methods affect the recovery of beta diversity patterns and the identification of differentially abundant taxa.

Results: At this time, the SILVA 138 eukaryotic database performed better than the PR2 4.12 database, assigning more reads to family level and providing higher phylogenetic resolution. ASV- and OTU-based alpha and beta diversity of nematodes correlated closely, indicating that OTU-based studies represent useful reference points. For downstream data analyses, our results indicate that loss of data during subsampling under rarefaction-based methods might reduce the sensitivity of the method, e.g. underestimate the differences between nematode communities under different treatments, while the clr-transformation-based methods may overestimate effects. The Analysis of Compositions of Microbiome with Bias Correction approach (ANCOM-BC) retains all data and accounts for uneven sampling fractions for each sample, suggesting that this is currently the optimal method to analyze compositional data.

Discussion: Overall, our study highlights the importance of comparing and selecting taxonomic reference databases before data analyses, and provides solid evidence for the similarity and comparability between OTU- and ASV-based nematode studies. Further, the results highlight the potential weakness of rarefaction-based and clr-transformation-based methods. We recommend future studies use ASV and that both the taxonomic reference databases and normalization strategies are carefully tested and selected before analyzing the data.
Normalized element ratios of sediment core LVL15-1 from Lake Vouliagmeni,...
doi.pangaea.de
html, tsv
Updated Nov 20, 2023
Cite
Andreas Koutsodendris; Achim Brauer; Oliver Friedrich; Rik Tjallingii; Victoria Putyrskaya; Barbara Hennrich; Robert Kühn; Eckehard Klemt; Jörg Pross (2023). Normalized element ratios of sediment core LVL15-1 from Lake Vouliagmeni, Greece [Dataset]. http://doi.org/10.1594/PANGAEA.963433
Available download formats: tsv, html
Unique identifier
https://doi.org/10.1594/PANGAEA.963433
Dataset updated
Nov 20, 2023
Dataset provided by
PANGAEA
Authors
Andreas Koutsodendris; Achim Brauer; Oliver Friedrich; Rik Tjallingii; Victoria Putyrskaya; Barbara Hennrich; Robert Kühn; Eckehard Klemt; Jörg Pross
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jul 7, 2015
Variables measured
Core, Section, File name, Sample ID, Replicates, Logger voltage, Logger Amperage, Iron, normalized, Position, length, Total count rate, and 16 more
Description
The XRF core scanning for the elements Si, S, Cl, K, Ca, Ti, Mn, Fe, Br, and Sr was performed with an ITRAX core scanner equipped with a chromium X-ray tube at 200 μm step size, 30 kV tube voltage, 30 mA tube current, and a counting time of 10 s. To minimize sample-geometry effects related to differences in water content, surface irregularities, and sediment density, raw-element intensities (cps) were normalized by center-log-ratio (CLR) transformation.
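
To make the normalization step concrete, here is a minimal base R sketch of a centered log-ratio transformation applied row by row. The intensities, depth labels, and the choice of three elements are made up for illustration and are not values from core LVL15-1; the helper name clr is also an assumption, and the sketch requires strictly positive counts.

```r
# Made-up XRF intensities (cps) for three elements at four depths.
cps <- matrix(c(1200, 300, 50,
                1500, 280, 70,
                 900, 350, 40,
                1100, 320, 60),
              ncol = 3, byrow = TRUE,
              dimnames = list(paste0("depth", 1:4), c("Ca", "Fe", "Ti")))

# clr: log of each part minus the mean log across parts in the same row.
clr <- function(x) {
  lx <- log(x)
  sweep(lx, 1, rowMeans(lx))
}

cps_clr <- clr(cps)
print(round(cps_clr, 3))
print(rowSums(cps_clr))  # each row sums to zero up to floating-point error
```

Multiplying every intensity in a row by the same factor leaves the clr values unchanged, which is why the transformation can reduce effects that scale all element counts together.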
Figure 4 from manuscript Sparsely-Connected Autoencoder (SCA) for single...
figshare.com
zip
Updated Aug 26, 2020
Cite
Raffaele Calogero (2020). Figure 4 from manuscript Sparsely-Connected Autoencoder (SCA) for single cell RNAseq data mining [Dataset]. http://doi.org/10.6084/m9.figshare.12866717.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.12866717.v1
Dataset updated
Aug 26, 2020
Dataset provided by
Figshare (http://figshare.com/)
Authors
Raffaele Calogero
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset used to generate figure 4: QCM/QCC plots using different normalizations for the SCA input counts table. A) Log10 transformed (figure4/setA/Results/setAMIRNA_SIMLR/5/setA_StabilitySignificativityJittered.pdf), B) Centred log-ratio normalization (CLR) (figure4/setA/Results/CLR_FNMIRNA_SIMLR/5/normalized_CLR_FN_StabilitySignificativityJittered.pdf), C) relative log-expression (RLE) (figure4/setA/Results/DESEQ_FNMIRNA_SIMLR/5/normalized_DESEQ_FN_StabilitySignificativityJittered.pdf), D) full-quantile normalization (FQ) (figure4/setA/Results/FQ_FNMIRNA_SIMLR/5/normalized_FQ_FN_StabilitySignificativityJittered.pdf), E) sum scaling normalization (SUM) (/figure4/setA/Results/SUM_FNMIRNA_SIMLR/5/normalized_SUM_FN_StabilitySignificativityJittered.pdf), F) weighted trimmed mean of M-values (TMM) (figure4/setA/Results/TMM_FNMIRNA_SIMLR/5/normalized_TMM_FN_StabilitySignificativityJittered.pdf).
Normalized variation matrix of data in table (3).
figshare.com
xls
Updated Jun 11, 2023
Cite
Asghar Khan; Muhammad Saleem Khan; Juan José Egozcue; Munib Ahmed Shafique; Sidra Nadeem; Ghulam Saddiq (2023). Normalized variation matrix of data in table (3). [Dataset]. http://doi.org/10.1371/journal.pone.0279083.t006
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0279083.t006
Dataset updated
Jun 11, 2023
Dataset provided by
PLOS ONE
Authors
Asghar Khan; Muhammad Saleem Khan; Juan José Egozcue; Munib Ahmed Shafique; Sidra Nadeem; Ghulam Saddiq
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Normalized variation matrix of data in table (3).
Beta weights during imagined standing, normalized to rest.
plos.figshare.com
xls
Updated May 31, 2023
Cite
Daniel S. Peterson; Kristen A. Pickett; Ryan Duncan; Joel Perlmutter; Gammon M. Earhart (2023). Beta weights during imagined standing, normalized to rest. [Dataset]. http://doi.org/10.1371/journal.pone.0090634.t003
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0090634.t003
Dataset updated
May 31, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Daniel S. Peterson; Kristen A. Pickett; Ryan Duncan; Joel Perlmutter; Gammon M. Earhart
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Paired sample t-test comparing stand and rest beta weights.
$ Univariate ANCOVA with UPDRS as covariate.
* Significantly different from rest at the 0.05 level.
** Significantly different from rest at the 0.005 level.
Abbreviations: SMA: supplementary motor area, GP: globus pallidus, MLR: mesencephalic locomotor region, CLR: cerebellar locomotor region.