74 datasets found

COVID-19 DGE (GSE152641) edgeR Galaxy Server
kaggle.com
zip
Updated Dec 2, 2025
Cite
Dr. Nagendra (2025). COVID-19 DGE (GSE152641) edgeR Galaxy Server [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/covid-19-dge-gse152641-edger-galaxy-server
Available download formats: zip (11577100 bytes)
Dataset updated
Dec 2, 2025
Authors
Dr. Nagendra
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This dataset corresponds to GSE152641, a whole-blood RNA-seq study of COVID-19 patients and healthy controls.

It includes expression data processed through edgeR on a Galaxy server — hence the title “COVID-19 DGE (GSE152641) edgeR Galaxy Server”.

The original GSE152641 study profiled peripheral blood from 62 SARS-CoV-2 (COVID-19) patients and 24 healthy controls, for a total of 86 samples.

The dataset captures host transcriptomic (gene expression) responses to SARS-CoV-2 infection, enabling analysis of differentially expressed genes (DEGs) in COVID-19 vs healthy individuals.

This resource can be used to: identify DEGs, perform immune-cell deconvolution / infiltration analysis, compare COVID-19 transcriptomic signatures with other viral infections, perform downstream pathway analysis, co-expression analysis, or machine learning / biomarker discovery.

Because the original study also compared COVID-19 responses to other viral infections (six viruses: influenza, RSV, HRV, Ebola, Dengue, SARS), the dataset is useful for comparative transcriptomic studies of immune response across infections, though here only the COVID-19 whole-blood data from GSE152641 are included.

The data are human (Homo sapiens) whole-blood bulk RNA-seq.

The underlying gene expression matrix is a count matrix (digital gene expression), suitable for downstream normalization, differential expression (edgeR, DESeq2, limma-voom, etc.), and other transcriptomics analyses.

This dataset enables reproducible computational analyses — for example, detection of DEGs, immune cell composition estimation, pathway enrichment, classifier / signature building for COVID-19.

As such, it can serve as a resource for researchers interested in COVID-19 immunology, biomarker discovery, host response profiling, comparative viral transcriptomics, or meta-analysis with other publicly available datasets.

All required data files (metadata, counts or processed tables as uploaded) are made available to facilitate reanalysis and transparent computational workflows.
Critical Assessment of RNA-Seq Differential Expression
zenodo.org
zip
Updated Feb 12, 2024
Cite
Charles Warden (2024). Critical Assessment of RNA-Seq Differential Expression [Dataset]. http://doi.org/10.5281/zenodo.3378055
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.3378055
Dataset updated
Feb 12, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Charles Warden
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Warden and Wu Preprint: v1

This dataset focuses primarily on the following types of comparisons:

Cell line experiments with over-expression or knock-down to define a known causal gene, with processing starting with public reads.

Processed TCGA (The Cancer Genome Atlas) data for breast cancer (BRCA) to compare gene expression by immunohistochemistry status (ER/ESR1, PR/PGR, or HER2/ERBB2).

Differential expression methods include the following:

edgeR (GLM)

edgeR-robust (GLM)

edgeR (QL)

edgeR-robust (QL)

DESeq1

DESeq2

limma-voom

limma-trend (CPM)

limma-trend (FPKM/RPKM)

ANOVA (log2 FPKM/RPKM)

The most common preprocessing strategies include STAR, TopHat2, and Salmon. However, a limited amount of additional processing with HISAT2, kallisto, Bowtie2 (+eXpress), and Bowtie1 (+RSEM) is also provided.

Most STAR and TopHat2 alignments are quantified with htseq-count, and cuffdiff is also run on them (for single-variable, two-group comparisons). However, a limited amount of additional processing with featureCounts is also provided.

Most STAR and TopHat2 alignments start with the public forward reads, even if paired-end data was available.
Additional file 10: Table S6. of Errors in RNA-Seq quantification affect...
springernature.figshare.com
xlsx
Updated Jun 1, 2023
Cite
Christelle Robert; Mick Watson (2023). Additional file 10: Table S6. of Errors in RNA-Seq quantification affect genes of relevance to human disease [Dataset]. http://doi.org/10.6084/m9.figshare.c.3643154_D1.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.c.3643154_D1.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Christelle Robert; Mick Watson
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
EdgeR results from MMGs. Differential expression results calculated by edgeR for MMG counts produced by the stage 2 analysis. Can be downloaded from [43]. (XLSX 428 kb)
Additional file 9: Table S5. of Errors in RNA-Seq quantification affect...
springernature.figshare.com
figshare.com
xlsx
Updated May 30, 2023
Cite
Christelle Robert; Mick Watson (2023). Additional file 9: Table S5. of Errors in RNA-Seq quantification affect genes of relevance to human disease [Dataset]. http://doi.org/10.6084/m9.figshare.c.3643154_D4.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.c.3643154_D4.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Christelle Robert; Mick Watson
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
EdgeR results from unique counts. Differential expression results calculated by edgeR for gene counts produced by the stage 1 analysis. Can be downloaded from [43]. (XLSX 2159 kb)
Data and code for "Differential methylation analysis of reduced...
data.niaid.nih.gov
data-staging.niaid.nih.gov
+1 more
Updated Jan 24, 2020
Cite
Chen, Yunshun; Pal, Bhupinder; Visvader, Jane E; Smyth, Gordon K (2020). Data and code for "Differential methylation analysis of reduced representation bisulfite sequencing experiments using edgeR" [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_1052870
Dataset updated
Jan 24, 2020
Dataset provided by
Walter and Eliza Hall Institute of Medical Research
Authors
Chen, Yunshun; Pal, Bhupinder; Visvader, Jane E; Smyth, Gordon K
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data set provides data files and R code to accompany the article Differential methylation analysis of reduced representation bisulfite sequencing experiments using edgeR published by F1000Research.

The data consists of Reduced Representation BS-seq methylation profiles of epithelial populations from the mouse mammary gland, with n=2 biological replicates for each of three cell populations.

RNA-seq expression profiles of luminal and basal mammary epithelial populations are also provided.

The R code undertakes a differential methylation analysis of the BS-seq profiles and demonstrates a strong negative correlation between the differential methylation and differential expression results.
ETI.edgeR.Results
figshare.com
txt
Updated Jun 5, 2024
Cite
Thomas Hampton; Bruce A. Stanton (2024). ETI.edgeR.Results [Dataset]. http://doi.org/10.6084/m9.figshare.25975324.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.25975324.v1
Dataset updated
Jun 5, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
Thomas Hampton; Bruce A. Stanton
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
ENSEMBL gene identifiers, log base 2 fold change responses to ETI compared to DMSO, and unadjusted P values from edgeR analysis of RNA-seq aligned reads.
Table S3 edgeR results.xlsx
datasetcatalog.nlm.nih.gov
Updated Feb 2, 2022
Cite
Hemingway, Janet; Ford, Louise; Turner, Joseph D.; Johnston, Kelly L.; Wu, Yang; Quek, Shannon; Cook, Darren A. N.; Archer, John; Taylor, Mark J.; Marriott, Amy E.; Wagstaff, Simon C. (2022). Table S3 edgeR results.xlsx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000310096
Dataset updated
Feb 2, 2022
Authors
Hemingway, Janet; Ford, Louise; Turner, Joseph D.; Johnston, Kelly L.; Wu, Yang; Quek, Shannon; Cook, Darren A. N.; Archer, John; Taylor, Mark J.; Marriott, Amy E.; Wagstaff, Simon C.
Description
Supplementary table 2, containing full results from edgeR RNA-sequencing differential expression analysis of B. malayi microfilariae controls compared to treatment with tetracyclines, used in the associated manuscript for drawing conclusions. Full details on data generation are available in the related manuscript.
Data from: Conserved regulation of RNA processing in somatic cell...
data.europa.eu
data-staging.niaid.nih.gov
+2 more
unknown
Updated Jan 23, 2020
Cite
Zenodo (2020). Conserved regulation of RNA processing in somatic cell reprogramming [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-1303234?locale=cs
Available download formats: unknown (62917)
Dataset updated
Jan 23, 2020
Dataset authored and provided by
Zenodo (http://zenodo.org/)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data set 1. Transcript expression across human RNA-Seq samples: estimated read counts. The file contains estimated read counts, generated by kallisto (https://pachterlab.github.io/kallisto/), for human transcripts and RNA-Seq samples used in this study (see Additional file 2 of the accompanying publication). The format is a compressed (GZIP) tab-separated transcript-by-sample matrix. Ensembl transcript identifiers and a combined Sequence Read Archive study/sample name identifier serve as row and column names, respectively.

Data set 2. Transcript expression across murine RNA-Seq samples: estimated read counts. As in Data set 1, but for mouse transcripts.

Data set 3. Transcript expression across simian RNA-Seq samples: estimated read counts. As in Data set 1, but for chimpanzee transcripts.

Data set 4. Transcript expression across human RNA-Seq samples: estimated transcript abundances. As in Data set 1, but instead of read counts, transcript abundances in transcripts per million (TPM), as estimated by kallisto (https://pachterlab.github.io/kallisto/), are listed. Format, column and row names as in Data set 1.

Data set 5. Transcript expression across murine RNA-Seq samples: estimated transcript abundances. As in Data set 4, but for mouse transcripts.

Data set 6. Transcript expression across simian RNA-Seq samples: estimated transcript abundances. As in Data set 4, but for chimpanzee transcripts.

Data set 7. Differential expression analyses across human RNA-Seq sample groups: log fold changes. The file contains log fold changes, inferred by edgeR (http://bioconductor.org/packages/release/bioc/html/edgeR.html), for human genes and the RNA-Seq sample group contrasts listed in Additional file 3 of the accompanying publication in a compressed (GZIP) TSV gene-by-comparison matrix. Ensembl gene identifiers and a descriptive contrast identifier serve as row and column names, respectively.

Data set 8. Differential expression analyses across murine RNA-Seq sample groups: log fold changes. As in Data set 7, but for mouse genes.

Data set 9. Differential expression analyses across simian RNA-Seq sample groups: log fold changes. As in Data set 7, but for chimpanzee genes.

Data set 10. Differential expression analyses across human RNA-Seq sample groups: false discovery rates. The file contains false discovery rates (FDR) for the differential expression analyses summarized in Data set 7. Format, column and row names as in Data set 7.

Data set 11. Differential expression analyses across murine RNA-Seq sample groups: false discovery rates. As in Data set 10, but for mouse genes.

Data set 12. Differential expression analyses across simian RNA-Seq sample groups: false discovery rates. As in Data set 10, but for chimpanzee genes.

Data set 13. Quantification of alternative splicing events across human RNA-Seq samples. The file contains ‘percent spliced in’ (PSI) values computed by SUPPA (https://github.com/comprna/SUPPA) for annotated alternative splicing events (inferred from the transcript annotation of the human genome, Ensembl release 84; http://www.ensembl.org/). The format is a compressed (GZIP) tab-separated transcript-by-sample matrix. SUPPA-provided event identifiers and a combined Sequence Read Archive study/sample name identifier serve as row and column names, respectively.

Data set 14. Quantification of alternative splicing events across murine RNA-Seq samples. As in Data set 13, but for mouse alternative splicing events.

Data set 15. Differential splicing analyses across human RNA-Seq sample groups: differences in ‘percent spliced in’ (ΔPSI). The file contains ΔPSI values for human alternative splicing events (as in Data set 13). The RNA-Seq sample group contrasts are listed in Additional file 3 of the accompanying publication. Values were inferred by SUPPA’s diffSplice functionality (https://github.com/comprna/SUPPA). The format is a compressed (GZIP) tab-separated gene-by-comparison matrix. SUPPA event identifiers and a descriptive contrast identifier serve as row and column names, respectively.

Data set 16. Differential splicing analyses across murine RNA-Seq sample groups: differences in ‘percent spliced in’ (ΔPSI). As in Data set 15, but for mouse alternative splicing events.

Data set 17. Differential splicing analyses across human RNA-Seq sample groups: P values. The file contains P values for the differential splicing analysis of human alternative splicing events summarized in Data set 15. Format, column and row names as in Data set 15.

Data set 18. Differential splicing analyses across murine RNA-Seq sample groups: P values. The file contains P values for the differential splicing analysis of mouse alternative splicing events summarized in Data set 16. Format, column and row names as in Data set 15.

Data set 19. Transcript expression across murine RNA-Seq time course data: estimated read counts. As in Data set 2, but for the time course data generated for the accompanying publication.

Data set 20. Trans
Identification of prognostic burns-related indicators and microRNA...
esango.cput.ac.za
Updated Dec 5, 2025
Cite
Tarryn Prinsloo (2025). Identification of prognostic burns-related indicators and microRNA biosignatures in burns patients with inhalation injury [Dataset]. http://doi.org/10.25381/cput.30619340.v1
Unique identifier
https://doi.org/10.25381/cput.30619340.v1
Dataset updated
Dec 5, 2025
Dataset provided by
Cape Peninsula University of Technology
Authors
Tarryn Prinsloo
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
The study was approved by the Health and Wellness Research Ethics Committee (HW-REC), Cape Peninsula University of Technology (CPUT), and the Stellenbosch University (SU) Health Research Ethics Committee (reference: CPUT/HW-REC 2015/H15; CPUT/HW-REC 2017/H20). Student ethics approval was also granted by the CPUT REC (CPUT/HW-REC 2017/H20). Site approval was provided by the Chief Executive Office and Medical Services Manager/Research Coordinator to conduct research at TBH, Tygerberg, CPT, WC, SA in accordance with the Provincial Policy and TBH Notice No 40/2009 (reference: CPUT/HW-REC 2015/H15). The study was also registered with the WC Government National Health Research Database (reference: 2016RP18364).

The study involved (1) medical-file-based data to (i) observe epidemiological alignment of inhalation injury with similar clinical settings and other LMICs (comparative tests between parameter subgroups), (ii) determine clinical markers for mortality and the significance of inhalation injury in relation to mortality (using Fisher's exact test, Spearman and/or Pearson's correlation coefficient and partial least squares regression) and (iii) determine clinical markers for inhalation injury (using Fisher's exact test, Spearman and/or Pearson's correlation coefficient and partial least squares regression). These were performed on the data set named 'Demographic, injury, and clinical data of all samples' in Data set 1.

In addition, (2) human whole blood was used for RNA sequencing to determine predictive miRNAs for inhalation injury by using (i) the Illumina platform for RNA sequencing, (ii) sRNAbench and Bowtie for sequence alignment to the human genome, (iii) edgeR and DESeq2 pipelines for differential expression analysis, and (iv) Fisher's exact test for comparison of DE miRNAs between mild and severe inhalation injury. The data sets named 'Demographic, injury, clinical, and total RNA-related data of all exemplar samples' and 'Demographic, injury, clinical, and total RNA-related of all exemplar samples that passed QC for Sequencing' were the samples used for RNA sequencing.

Finally, (3) DE miRNA meeting the threshold criteria, i.e., overlapped between edgeR and DESeq2, fold change ≤1.5 and Padj value
Shape-Dependent Interactions of Gold Nanoparticles with Microalgae: Distinct...
data.mendeley.com
Updated Nov 14, 2025
Cite
Can Wang (2025). Shape-Dependent Interactions of Gold Nanoparticles with Microalgae: Distinct Cellular and Molecular Responses [Dataset]. http://doi.org/10.17632/x3gyskppmp.1
Unique identifier
https://doi.org/10.17632/x3gyskppmp.1
Dataset updated
Nov 14, 2025
Authors
Can Wang
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
mRNA-sequencing raw data. Method: We collected algae after 72 h exposure to 10 mg/L AuNP and AuNS for RNA-seq to analyze mRNA expression. Chlamydomonas_reinhardtii (Version: CC-503 cw92 mt+). The differentially expressed transcripts and genes were selected with log2 (fold change) ≥ 1 or log2 (fold change) ≤ -1 and p value < 0.05 criteria with the R package edgeR(https://bioconductor.org/packages/edgeR). Results: RNA sequencing identified 9 upregulated and 38 downregulated differentially expressed genes (DEGs) in the 10 mg/L AuNP treated cells, impairing photosynthesis and energy storage via the photosystem II subunit S1 (PSBS1)/ early light-inducible protein (ELI3) pathway. In contrast, the AuNS group exhibits 246 upregulated and 145 downregulated DEGs, affecting membrane integrity and nitrogen metabolism through the nitrate reductase (NIT1)/ aminomethyl transferase (AMT1)/ protein kinase domain-containing protein (A0A2K3CRU5) pathway.
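The selection rule quoted in this description (log2 fold change ≥ 1 or ≤ -1, and p value < 0.05) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name `select_degs`, the row layout and the example gene names and values are all hypothetical.

```python
def select_degs(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Split (gene, log2fc, pvalue) rows into up- and downregulated gene lists.

    A gene is kept when pvalue < p_cutoff and |log2fc| >= lfc_cutoff.
    """
    up, down = [], []
    for gene, log2fc, pvalue in results:
        if pvalue >= p_cutoff:
            continue  # fails the significance criterion
        if log2fc >= lfc_cutoff:
            up.append(gene)
        elif log2fc <= -lfc_cutoff:
            down.append(gene)
    return up, down


# Hypothetical rows, made up for illustration.
example = [
    ("geneA", 2.3, 0.001),   # up: passes both criteria
    ("geneB", -1.0, 0.04),   # down: exactly at the fold-change boundary
    ("geneC", 0.4, 0.001),   # significant, but fold change too small
    ("geneD", 3.0, 0.2),     # large fold change, but not significant
]
up, down = select_degs(example)
print(up, down)
```

Note that the description filters on the unadjusted p value; a multiple-testing-adjusted value could be passed in the same position if preferred.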
Supporting Information S1 - A Comparative Study of Techniques for...
datasetcatalog.nlm.nih.gov
Updated Aug 13, 2014
Cite
Lundberg, Andreas E.; Edson, Janette; Bartlett, Perry F.; Narayanan, Ramesh K.; Marshall, Vikki M.; Wray, Naomi R.; Jhaveri, Dhanisha J.; Zhang, Zong Hong; Robinson, Gregory J.; Bauer, Denis C.; Zhao, Qiong-Yi (2014). Supporting Information S1 - A Comparative Study of Techniques for Differential Expression Analysis on RNA-Seq Data [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001175228
Dataset updated
Aug 13, 2014
Authors
Lundberg, Andreas E.; Edson, Janette; Bartlett, Perry F.; Narayanan, Ramesh K.; Marshall, Vikki M.; Wray, Naomi R.; Jhaveri, Dhanisha J.; Zhang, Zong Hong; Robinson, Gregory J.; Bauer, Denis C.; Zhao, Qiong-Yi
Description
Figure S1, Venn diagram showing the number of differentially expressed genes identified by two versions of Cuffdiff2. Figure S2, The effects of biological replicates on the differential expression analysis for Cuffdiff v2.0.2. Figure S3, The detected fold changes of all the differentially expressed genes identified by three tools were compared and shown, including DESeq vs. edgeR (top panel), DESeq vs. Cuffdiff2 (middle panel) and edgeR vs. Cuffdiff2 (bottom panel). File S1, Analysis pipelines, methods and examples of commands for differential expression analysis, subsampling fastq files and generating SAM/BAM files based on simulated count values. File S2, The raw count values for genes with high fold changes were picked up by edgeR but not by DESeq. Genes with high fold changes (the absolute value of log2 fold changes larger than 2) identified as DEGs by edgeR but not by DESeq are listed in the file. The gene ID, the log2 fold changes (logFC) and FDR from DESeq, the logFC and FDR from edgeR, the raw count values for the four replicates of sample K (K1–K4) and sample N (N1–N4) are shown in each of the columns. Table S1, Numbers of reads for the human hbr and uhr samples from the MAQC dataset. Table S2, Numbers of reads for the mouse neurosphere samples for treatment groups of K and N (the K_N dataset). Table S3, The number of reads for each individual sample of the LCL3 dataset. Table S4, The definition for TP, FP, TN, FN, TPR and FPR. Table S5, The false positive rate for Cuffdiff2, DESeq and edgeR based on the LCL1 dataset. (ZIP)
RNA-Sequencing Part 3 Generation and characterization of a novel mouse model...
zenodo.org
application/gzip, tsv
Updated Sep 24, 2025
Cite
Lucie Perillat (2025). RNA-Sequencing Part 3 Generation and characterization of a novel mouse model of Becker Muscular Dystrophy with a deletion of exons 52 to 55 [Dataset]. http://doi.org/10.5281/zenodo.17095147
Available download formats: application/gzip, tsv
Unique identifier
https://doi.org/10.5281/zenodo.17095147
Dataset updated
Sep 24, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Lucie Perillat
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Becker muscular dystrophy (BMD) is a rare X-linked recessive neuromuscular disorder, frequently caused by in-frame deletions in the DMD gene that result in the production of a truncated, yet functional, dystrophin protein. The consequences of BMD-causing in-frame deletions on the organism are difficult to predict, especially in regard to long-term prognosis. Here, we used CRISPR-Cas9 to generate a new Dmd Δ52-55 mouse model by deleting exons 52-55 in the Dmd gene, resulting in a BMD-like in-frame deletion. To delineate the long-term effects of this deletion, we studied these mice over 52 weeks by performing histology and echocardiography analyses and assessing motor functions. To further delineate the effects of the exons 52-55 in-frame deletion, we performed RNA-Seq pre- and post-exercise and identified several differentially expressed pathways that could explain the abnormal muscle phenotype observed at 52 weeks in the BMD model.

This dataset shows the results and raw data of the RNA-sequencing and transcriptomic analysis for 52-week-old exercised and non-exercised mice (4 BMD, 4 WT and 4 DMD, as mentioned on the names of each file).

Due to size restrictions, this RNA-Seq dataset will be published on Zenodo in 3 parts. This third part contains the data for the non-exercised mice, including the fastq files (R1 and R2) that were extracted from the alignment (bam) files (see below), and the differentially expressed genes (tsv files). Fastq files were extracted by our team from the alignment (bam) files, as follows:

1. Starting with the original file (Number.Aligned.sortedByCoord.out.bam), using samtools, we sorted by name:

samtools sort -n Number.Aligned.sortedByCoord.out.bam -o Number.Aligned.namesorted.bam

2. We extracted the paired reads into 2 separate files for R1 and R2, and any singleton or orphaned reads into additional RS and R0 files, respectively (many of the RS and R0 files were empty and not added here due to size constraints):

samtools fastq -1 Number_R1.fastq -2 Number_R2.fastq -0 Number_R0.fastq -s Number_RS.fastq Number.Aligned.namesorted.bam

3. We compressed all of the files into ‘.gz’ extension using gzip:

gzip -9 Number_R1.fastq

.bam and RS/R0 files were not added due to size constraints but were available upon request.

Upstream workflow performed by TCAG (SickKids):

2. RNA-Seq Library and Reference Genome Information

Type of library: stranded, paired end

Genome reference sequence: GRCm39, M31 Gencode gene models.

3. Read Pre-processing, Alignment and Obtaining Gene Counts

3.1 Read Pre-processing

The sequencing data is in FASTQ format. The quality of the data is assessed using FastQC v.0.11.5 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/).

Adaptors are trimmed using Trim Galore (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) v. 0.5.0. Trim Galore is running Cutadapt (https://cutadapt.readthedocs.org/en/stable/) v. 1.10. Trim Galore is run with the following parameters:

-q 25 – the reads are trimmed from the 3' end base by base, trimming stops if the quality of the base is greater than 25;

--clip_R1 6, --clip_R2 6 – clip the first 6 nucleotides from the 5' ends of read 1 and read 2;

--stringency 5 – at least 5 nucleotides overlap with the Illumina primer sequence are needed for trimming;

--length 40 – any read that is shorter than 40 nucleotides as a result of trimming is discarded;

--paired – only pairs of reads are retained (for paired-end reads only, not for single reads).
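The parameter list above could be assembled into a single Trim Galore invocation roughly as follows. This is a minimal sketch in Python: the executable name `trim_galore`, the helper `trim_galore_command` and the two FASTQ file names are assumptions, and only the flags quoted in the description are used. The command is printed rather than executed.

```python
import shlex


def trim_galore_command(read1, read2):
    """Build the argument list for a paired-end Trim Galore run."""
    return [
        "trim_galore",         # assumed executable name on PATH
        "-q", "25",            # quality threshold for 3' end trimming
        "--clip_R1", "6",      # clip 6 nt from the 5' end of read 1
        "--clip_R2", "6",      # clip 6 nt from the 5' end of read 2
        "--stringency", "5",   # minimum adapter overlap for trimming
        "--length", "40",      # discard reads shorter than 40 nt
        "--paired",            # keep only intact read pairs
        read1,
        read2,
    ]


# Hypothetical input file names.
cmd = trim_galore_command("sample_R1.fastq.gz", "sample_R2.fastq.gz")
print(shlex.join(cmd))
```

Keeping the arguments as a list makes it straightforward to hand them to `subprocess.run(cmd, check=True)` without shell quoting issues.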

The type of adaptor is automatically detected by screening the first 1 million sequences of the first specified file for the first 12/13 nucleotides of the standard Illumina or Nextera primers and the sequence from the start of the primer to the 3' end of the read is trimmed.

The quality of the trimmed reads is re-assessed with FastQC.

The trimmed reads are also screened for presence of rRNA and mtRNA sequences using FastQ-Screen v.0.10.0 (http://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/).

To assess the read distribution, positional read duplication and to confirm the strandedness of the alignments we use the RSeQC package (http://rseqc.sourceforge.net/), v. 2.6.2. The distribution of reads across exonic, intronic and intergenic sequences is assessed by the read_distribution.py program, infer_experiment.py is used for confirming strandedness, and read_duplication.py is used to obtain the positional read duplication (percentage of reads mapping to exactly the same genomic location). Sufficient proportion of reads should map to the exonic sequences (ideally > 70-80%). Large amounts of reads mapping to intronic sequences in a poly-A mRNA library will suggest significant presence of pre-mRNA or other issues with RNA preparation. For stranded RNA-seq experiments the majority of the reads should map exclusively to one strand, same or opposite to the transcript, depending on the library preparation method. For non-stranded experiments the reads should be equally distributed to both strands.

3.2. Read Alignment

The raw trimmed reads are aligned to the reference genome using the STAR aligner, v.2.6.0c. (https://github.com/alexdobin/STAR, https://academic.oup.com/bioinformatics/article/29/1/15/272537). The alignments are contained in the .bam files. The “.bam” together with the “.bai” files can be used for viewing of the alignments in the Integrative Genomics Viewer (IGV, http://software.broadinstitute.org/software/igv/).

3.3. Obtaining Gene Counts

The filtered STAR alignments are processed to extract raw read counts for genes using htseq-count v.0.6.1p2 (HTSeq, http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html). Assigning reads to genes by htseq-count is done in the mode “intersection-nonempty”, i.e. if a read overlaps with two overlapping genes and the overlap to gene A is greater than the overlap to gene B, the read is counted towards gene A, while if a read overlaps equally with gene A and gene B, then it is not counted towards either gene. htseq-count does not count reads with multiple alignments to avoid introducing bias in the expression results. Only uniquely mapping reads are counted.

4. Pre-processing, Alignment and Gene Counts QC

MultiQC (https://multiqc.info/) is a reporting tool that aggregates statistics generated by bioinformatics analyses across multiple samples. MultiQC v. 1.14 was used to generate a consolidated report from FastQC screening of both untrimmed and trimmed reads, and from RSeQC, FastQ Screen, STAR and htseq-count results. The MultiQC report is contained in MultiQC_Report_*.html file.

5. DGE Analysis with edgeR

Differential expression was done with the edgeR R package v.3.28.1, using R v.3.6.1 (http://www.bioconductor.org/packages/release/bioc/html/edgeR.html). The data set was filtered to retain only genes whose gene counts were >50 in at least 3 samples. This is intended to remove genes that are not expressed, or expressed at a very low level.
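The expression filter described here (keep a gene only if its count exceeds 50 in at least 3 samples) can be illustrated with a short Python sketch. The workflow itself ran in R with edgeR; the function name `filter_low_counts` and the toy count table below are hypothetical.

```python
def filter_low_counts(counts, min_count=50, min_samples=3):
    """Keep genes whose count is > min_count in at least min_samples samples.

    counts maps a gene ID to a list of raw counts, one per sample.
    """
    kept = {}
    for gene, row in counts.items():
        n_passing = sum(1 for c in row if c > min_count)
        if n_passing >= min_samples:
            kept[gene] = row
    return kept


# Toy table with four samples per gene (values made up for illustration).
counts = {
    "gene1": [120, 80, 60, 10],  # 3 samples above 50: kept
    "gene2": [55, 51, 3, 0],     # only 2 samples above 50: dropped
    "gene3": [0, 0, 0, 0],       # not expressed: dropped
}
kept = filter_low_counts(counts)
print(sorted(kept))
```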

The method used for normalizing the data was TMM, implemented by the calcNormFactors(y) function. All samples were normalized and filtered together. The glmLRT functionality in edgeR was used for the differential expression tests, with sample group taken into account.

EdgeR Results Legend:

· GeneID – Ensembl Gene ID;

· Chr.Start.End - gene coordinates;

· GeneName, GeneType, etc. – Gene attributes, derived from the genome annotation;

· logFC - Log2 Fold Change (use this column for selection of DEGs);

· logCPM - Log2 Counts Per Million, average for all libraries;

· LR – Statistic calculated by the LR-Test;

· PValue - Differential expression P value;

· FDR – Differential expression False Discovery Rate, calculated by the Benjamini-Hochberg method (use this column for selection of DEGs);

· (columns labeled with sample names) – Fragments Per Kilobase of transcript per Million mapped reads (FPKMs) for the given samples.
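The legend states that the FDR column is computed with the Benjamini-Hochberg method. As a rough illustration of what that adjustment does, here is a minimal plain-Python sketch. It is not the code used to produce the table, and the function name and example P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P values for four genes.
pvals = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(pvals)
print([round(q, 4) for q in fdr])
```

Genes would then be selected by combining a threshold on this adjusted value with a threshold on logFC, as the legend suggests.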
A comparative study of RNA-Seq and microarray data analysis on the two...
plos.figshare.com
datasetcatalog.nlm.nih.gov
application/gzip
Updated Jun 1, 2023
Cite
Alexander Wolff; Michaela Bayerlová; Jochen Gaedcke; Dieter Kube; Tim Beißbarth (2023). A comparative study of RNA-Seq and microarray data analysis on the two examples of rectal-cancer patients and Burkitt Lymphoma cells [Dataset]. http://doi.org/10.1371/journal.pone.0197162
Available download formats: application/gzip
Unique identifier
https://doi.org/10.1371/journal.pone.0197162
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Alexander Wolff; Michaela Bayerlová; Jochen Gaedcke; Dieter Kube; Tim Beißbarth
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background: Pipeline comparisons for gene expression data are highly valuable for applied real data analyses, as they enable the selection of suitable analysis strategies for the dataset at hand. Such pipelines for RNA-Seq data should include mapping of reads, counting and differential gene expression analysis or preprocessing, normalization and differential gene expression in case of microarray analysis, in order to give a global insight into pipeline performances.

Methods: Four commonly used RNA-Seq pipelines (STAR/HTSeq-Count/edgeR, STAR/RSEM/edgeR, Sailfish/edgeR, TopHat2/Cufflinks/CuffDiff) were investigated on multiple levels (alignment and counting) and cross-compared with the microarray counterpart on the level of gene expression and gene ontology enrichment. For these comparisons we generated two matched microarray and RNA-Seq datasets: Burkitt Lymphoma cell line data and rectal cancer patient data.

Results: The overall mapping rate of STAR was 98.98% for the cell line dataset and 98.49% for the patient dataset. Tophat’s overall mapping rate was 97.02% and 96.73%, respectively, while Sailfish had only an overall mapping rate of 84.81% and 54.44%. The correlation of gene expression in microarray and RNA-Seq data was moderately worse for the patient dataset (ρ = 0.67–0.69) than for the cell line dataset (ρ = 0.87–0.88). An exception were the correlation results of Cufflinks, which were substantially lower (ρ = 0.21–0.29 and 0.34–0.53). For both datasets we identified very low numbers of differentially expressed genes using the microarray platform. For RNA-Seq we checked the agreement of differentially expressed genes identified in the different pipelines and of GO-term enrichment results.

Conclusion: In conclusion the combination of STAR aligner with HTSeq-Count followed by STAR aligner with RSEM and Sailfish generated differentially expressed genes best suited for the dataset at hand and in agreement with most of the other transcriptomics pipelines.
Data and code for: Dihydrothiazolo ring-fused 2-pyridone antimicrobial...
nde-dev.biothings.io
data-staging.niaid.nih.gov
+4 more
zip
Updated Apr 26, 2024
Cite
Zongsen Zou (2024). Data and code for: Dihydrothiazolo ring-fused 2-pyridone antimicrobial compounds effectively treat Streptococcus pyogenes skin and soft tissue infection [Dataset]. http://doi.org/10.5061/dryad.pvmcvdntj
Explore at:
zip (available download formats)
Unique identifier
https://doi.org/10.5061/dryad.pvmcvdntj
Dataset updated
Apr 26, 2024
Dataset provided by
University of Washington School of Medicine
Authors
Zongsen Zou
License
https://spdx.org/licenses/CC0-1.0.html
Description
We have developed GmPcides from a peptidomimetic dihydrothiazolo ring-fused 2-pyridone scaffold that have antimicrobial activities against a broad spectrum of Gram-positive pathogens. Here we examine the treatment efficacy of GmPcides using Streptococcus pyogenes skin and soft tissue infection (SSTI) and biofilm formation models. Screening our compound library for minimal inhibitory (MIC) and minimal bactericidal (MBC) concentrations identified GmPcide PS757 as highly active against S. pyogenes. Treatment of S. pyogenes biofilm with PS757 revealed robust efficacy against all phases of biofilm formation by preventing initial biofilm development, halting biofilm maturation and eradicating mature biofilm. In a murine model of S. pyogenes SSTI, subcutaneous delivery of PS757 resulted in reduced levels of tissue damage, decreased bacterial burdens and accelerated rates of wound-healing, which were associated with down-regulation of key virulence factors, including M protein and the SpeB cysteine protease. These data demonstrate that GmPcides show considerable promise for treating S. pyogenes infections.
Methods
RNA Sequencing. Microplate (96-well) culture in C medium was conducted as described above with the addition of 0.4 µM PS757 or vehicle (DMSO). At 24 hrs, multiple wells were harvested and pooled for further processing, with the experiment repeated in triplicate. Extraction of RNA utilized the Direct-zol RNA Miniprep Plus Kit (Zymo Research, R2072) with the quality of the purified RNA determined by spectroscopy (NanoDrop 2000, Thermo Fisher). Libraries for Illumina sequencing were prepared using the FastSelect RNA kit (Qiagen, 334222), according to the manufacturer’s protocol, and sequences were determined using an Illumina NovaSeq 6000. Basecalls and demultiplexing were performed with Illumina’s bcl2fastq software and a custom Python demultiplexing program with a maximum of one mismatch in the indexing read.
RNA-seq reads were then aligned to the Ensembl release 101 primary assembly with STAR version 2.7.9a (1). Gene counts were derived from the number of uniquely aligned unambiguous reads by Subread:featureCounts version 2.0.3 (2). Isoform expression of known Ensembl transcripts was quantified with Salmon version 1.5.2 (3) and assessed for the total number of aligned reads, total number of uniquely aligned reads, and features detected. The ribosomal fraction, known junction saturation, and read distribution over known gene models were quantified with RSeQC version 4.0 (4). Comparative Transcriptomic Analysis. All gene counts obtained from RNA-seq were then imported into the R/Bioconductor package edgeR (5) and TMM normalization size factors were calculated to adjust for differences in library size. Ribosomal genes, and genes not expressed at greater than one count-per-million in at least the smallest group size minus one samples, were excluded from further analysis. The TMM size factors and the matrix of counts were then imported into the R/Bioconductor package limma (6). Weighted likelihoods based on the observed mean-variance relationship of every gene and sample were calculated for all samples and the count matrix was transformed to moderated log2-counts-per-million with limma’s voomWithQualityWeights (7). The performance of all genes was assessed with plots of the residual standard deviation of every gene against its average log-count, with a robustly fitted trend line of the residuals. Differential expression analysis was then performed to analyze for differences between conditions, with results filtered for only those genes with Benjamini-Hochberg false-discovery rate adjusted p-values less than or equal to 0.05. A principal component analysis (PCA) was performed on differential expression data to distinguish differences between conditions (8).
To find the significantly regulated genes, the limma voomWithQualityWeights transformed log2-counts-per-million expression data was then analyzed via weighted gene correlation network analysis with the R/Bioconductor package WGCNA (9). Briefly, all genes were correlated with each other by Pearson correlations and clustered by expression similarity into unsigned modules using a power threshold empirically determined from the data. An eigengene was then created for each de novo cluster and its expression profile was then correlated across all coefficients of the model matrix. Because these clusters of genes were created by expression profile rather than known functional similarity, the clustered modules were given the names of random colors, where grey is the only module that has any pre-existing definition: it contains genes that do not cluster well with others. The information for all clustered genes for each module was then combined with their respective statistical significance results from limma to determine whether or not those features were also found to be significantly differentially expressed.
References
1. A. Dobin, C. A. Davis, F. Schlesinger, J. Drenkow, C. Zaleski, S. Jha, P. Batut, M. Chaisson, T. R. Gingeras, STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21 (2013).
2. Y. Liao, G. K. Smyth, W. Shi, featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923-930 (2014).
3. R. Patro, G. Duggal, M. I. Love, R. A. Irizarry, C. Kingsford, Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods 14, 417-419 (2017).
4. L. Wang, S. Wang, W. Li, RSeQC: quality control of RNA-seq experiments. Bioinformatics 28, 2184-2185 (2012).
5. M. D. Robinson, D. J. McCarthy, G. K. Smyth, edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140 (2010).
6. M. E. Ritchie, B. Phipson, D. Wu, Y. Hu, C. W. Law, W. Shi, G. K. Smyth, limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 43, e47 (2015).
7. R. Liu, A. Z. Holik, S. Su, N. Jansz, K. Chen, H. S. Leong, M. E. Blewitt, M. L. Asselin-Labat, G. K. Smyth, M. E. Ritchie, Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses. Nucleic Acids Res 43, e97 (2015).
8. Z. Zou, R. F. Potter, W. H. t. McCoy, J. A. Wildenthal, G. L. Katumba, P. J. Mucha, G. Dantas, J. P. Henderson, E. coli catheter-associated urinary tract infections are associated with distinctive virulence and biofilm gene determinants. JCI Insight 8, (2023).
9. P. Langfelder, S. Horvath, WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559 (2008).
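The Benjamini-Hochberg filtering step described in the methods above can be illustrated with a short Python sketch. This is not the authors' code (their analysis used limma in R); the function name and the gene-level p-values are hypothetical and serve only to show how adjusted p-values are derived and then filtered at 0.05.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical gene-level p-values, for illustration only.
pvals = {"geneA": 0.001, "geneB": 0.02, "geneC": 0.03, "geneD": 0.5}
adj = dict(zip(pvals, benjamini_hochberg(list(pvals.values()))))

# Keep genes whose adjusted p-value is less than or equal to 0.05.
significant = [gene for gene, q in adj.items() if q <= 0.05]
print(significant)
```

The adjustment is computed over all tested genes at once, so the same raw p-value can pass or fail the cutoff depending on how many genes were tested.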
Data from: Annotations of Unigenes Assembled from Schizaphis graminum and...
agdatacommons.nal.usda.gov
txt
Updated Feb 15, 2024
Cite
Erin Scully; Kyle G. Koch; Nathan A. Palmer; Scott M. Geib; Gautam Sarath; Tiffany M. Heng-Moss; Jeffery D. Bradshaw (2024). Annotations of Unigenes Assembled from Schizaphis graminum and Sipha flava [Dataset]. http://doi.org/10.15482/USDA.ADC/1471686
Explore at:
txt (available download formats)
Unique identifier
https://doi.org/10.15482/USDA.ADC/1471686
Dataset updated
Feb 15, 2024
Dataset provided by
Ag Data Commons
Authors
Erin Scully; Kyle G. Koch; Nathan A. Palmer; Scott M. Geib; Gautam Sarath; Tiffany M. Heng-Moss; Jeffery D. Bradshaw
License
U.S. Government Works (https://www.usa.gov/government-works)
License information was derived automatically
Description
Transcriptomes were assembled de novo from pools of adult aphids that were feeding on sorghum and switchgrass. Reads from all replicates were pooled, normalized in silico to 25X coverage, and assembled using Trinity. Only the most abundant isoform for each unigene was retained for annotation and unigenes with transcripts per million mapped reads (TPM) less than 0.5 were removed from the dataset. The remaining unigenes were annotated using Trinotate with BLASTP comparisons against the Swiss-Prot/UniProt database. In addition, Pfam-A assignments were computed using HMMER, signal peptide predictions were performed using SignalP, and transmembrane domain predictions were performed using TMHMM. Gene ontology (GO) assignments were retrieved from Trinotate using the highest scoring BLASTP matches as queries. [Note: Supplemental files 1-6 added 2/5/2019]
Resources in this dataset:
Resource Title: Trinotate annotations for unigenes assembled from Schizaphis graminum (greenbug). File Name: GB_annotation_trinotatefinal.xls. Resource Description: Note: Data file is large and may take time to load. Resource Software Recommended: Excel, url: https://products.office.com/en-us/excel
Resource Title: Trinotate annotations of unigenes assembled from Sipha flava (yellow sugarcane aphid). File Name: YSA_annotation_trinotatefinal.xls. Resource Description: Note: Data file is large and may take time to load. Resource Software Recommended: Excel, url: https://products.office.com/en-us/excel
Resource Title: Supplemental Data 1: Differential expression analysis of starvation and BCK60 sorghum treatments in S. graminum at 12 and 24 hours. File Name: Supplemental Data 1.xlsx. Resource Description: Reads were mapped back to the transcriptome assembly using bowtie2 and RSEM and differential expression analysis was performed using edgeR as described in the Materials and Methods. Unigenes with log fold-change values > 0 are upregulated in the BCK60 treatment relative to the starvation treatment while unigenes with log fold-change values < 0 are downregulated in this comparison. Log fold-change and FDR corrected-p-value thresholds were set at 0.25 and 0.05, respectively. Tab labeled 12 represents 12 hr post infestation and tab labeled 24 represents 24 hr post infestation. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
Resource Title: Supplemental Data 2. Differential expression analysis of starvation and BCK60 sorghum treatments in S. flava at 12 and 24 hours. File Name: Supplemental Data 2.xlsx. Resource Description: Reads were mapped back to the transcriptome assembly using bowtie2 and RSEM and differential expression analysis was performed using edgeR as described in the Materials and Methods. Unigenes with log fold-change values > 0 are upregulated in the BCK60 treatment relative to the starvation treatment while unigenes with log fold-change values < 0 are downregulated in this comparison. Log fold-change and FDR corrected-p-value thresholds were set at 0.25 and 0.05, respectively. Tab labeled 12 represents 12 hr post infestation and tab labeled 24 represents 24 hr post infestation. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
Resource Title: Supplemental Data 3. Differential expression analysis of Summer switchgrass and BCK60 sorghum treatments in S. graminum at 12 and 24 hours. File Name: Supplemental Data 3.xlsx. Resource Description: Reads were mapped back to the transcriptome assembly using bowtie2 and RSEM and differential expression analysis was performed using edgeR as described in the Materials and Methods. Unigenes with log fold-change values > 0 are upregulated in the Summer treatment relative to the BCK60 treatment while unigenes with log fold-change values < 0 are downregulated in this comparison. Log fold-change and FDR corrected-p-value thresholds were set at 0.25 and 0.05, respectively. Tab labeled 12 represents 12 hr post infestation and tab labeled 24 represents 24 hr post infestation. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
Resource Title: Supplemental Data 4. Differential expression analysis of Summer switchgrass and BCK60 sorghum treatments in S. flava at 12 and 24 hours. File Name: Supplemental Data 4.xlsx. Resource Description: Reads were mapped back to the transcriptome assembly using bowtie2 and RSEM and differential expression analysis was performed using edgeR as described in the Materials and Methods. Unigenes with log fold-change values > 0 are upregulated in the Summer treatment relative to the BCK60 treatment while unigenes with log fold-change values < 0 are downregulated in this comparison. Log fold-change and FDR corrected-p-value thresholds were set at 0.25 and 0.05, respectively. Tab labeled 12 represents 12 hr post infestation and tab labeled 24 represents 24 hr post infestation. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
Resource Title: Supplemental Data 5. Differential expression analysis of Kanlow switchgrass and BCK60 sorghum treatments in S. graminum at 12 and 24 hours. File Name: Supplemental Data 5.xlsx. Resource Description: Reads were mapped back to the transcriptome assembly using bowtie2 and RSEM and differential expression analysis was performed using edgeR as described in the Materials and Methods. Unigenes with log fold-change values > 0 are upregulated in the Kanlow treatment relative to the BCK60 treatment while unigenes with log fold-change values < 0 are downregulated in this comparison. Log fold-change and FDR corrected-p-value thresholds were set at 0.25 and 0.05, respectively. Tab labeled 12 represents 12 hr post infestation and tab labeled 24 represents 24 hr post infestation. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
Resource Title: Supplemental Data 6. Differential expression analysis of Kanlow switchgrass and BCK60 sorghum treatments in S. flava at 12 and 24 hours. File Name: Supplemental Data 6.xlsx. Resource Description: Reads were mapped back to the transcriptome assembly using bowtie2 and RSEM and differential expression analysis was performed using edgeR as described in the Materials and Methods. Unigenes with log fold-change values > 0 are upregulated in the Kanlow treatment relative to the BCK60 treatment while unigenes with log fold-change values < 0 are downregulated in this comparison. Log fold-change and FDR corrected-p-value thresholds were set at 0.25 and 0.05, respectively. Tab labeled 12 represents 12 hr post infestation and tab labeled 24 represents 24 hr post infestation. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
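The sign convention and thresholds used in the supplemental tables can be sketched in Python as follows. This is a minimal illustration, not the authors' workflow: it assumes the 0.25 threshold applies to the absolute log fold-change and that both cutoffs are inclusive, and the function name, unigene names and values are hypothetical.

```python
def classify_unigenes(rows, lfc_threshold=0.25, fdr_threshold=0.05):
    """Split (name, log_fc, fdr) rows into up- and downregulated lists.

    Assumption: the log fold-change threshold is applied to the absolute
    value, and the FDR cutoff is inclusive.
    """
    up, down = [], []
    for name, log_fc, fdr in rows:
        # Skip rows that fail either the FDR or the effect-size threshold.
        if fdr > fdr_threshold or abs(log_fc) < lfc_threshold:
            continue
        # Positive values are upregulated in the first-named treatment.
        (up if log_fc > 0 else down).append(name)
    return up, down


# Hypothetical rows as they might appear in one tab of a supplemental file.
rows = [
    ("unigene_1", 1.8, 0.001),   # passes both thresholds, upregulated
    ("unigene_2", -0.9, 0.03),   # passes both thresholds, downregulated
    ("unigene_3", 0.1, 0.0005),  # effect size too small
    ("unigene_4", 2.5, 0.2),     # FDR too high
]
up, down = classify_unigenes(rows)
print(up, down)
```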
Dataset and R script - Non-linear transcriptomic responses to compounded...
zenodo.org
Updated Dec 13, 2024
Cite
Beth McCaw; Aoife Leonard; Lesley Lancaster (2024). Dataset and R script - Non-linear transcriptomic responses to compounded environmental changes across temperature and resources in a pest beetle, Callosobruchus maculatus [Dataset]. http://doi.org/10.5281/zenodo.13835524
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.13835524
Dataset updated
Dec 13, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Beth McCaw; Aoife Leonard; Lesley Lancaster
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This dataset contains data and R script for analysis on life history and transcriptomic responses to single dimensional changes in resource (chickpea-27°C) and temperature (cowpea-35°C) and multi-dimensional environmental changes in resource and temperature (chickpea-35°C) in a pest beetle, Callosobruchus maculatus (control treatment = cowpea-27°C). Dataset contains life history data collected in laboratory conditions (tab 1), logFC data (RNA-sequencing; Novogene Co. Ltd.) for Spearman rank correlation tests between treatments (tabs 2-4), read count data (RNA-sequencing; Novogene Co. Ltd.) for differential expression analysis using edgeR (R1-5 = Four samples at cowpea-27°C; R7-12 = Four samples at cowpea-35°C; R13-17 = Four samples at chickpea-27°C; R25-28 = Four samples at chickpea-35°C; tab 5) and edgeR output data for plotting in R (tab 6).
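The Spearman rank correlation between logFC values from two treatments can be sketched in plain Python. This is only an illustration of the statistic (the dataset's own analysis is an R script); the function names and logFC vectors below are hypothetical, and no significance test is included.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical per-gene logFC values for two treatments versus control.
logfc_resource = [1.2, -0.5, 0.3, 2.0, -1.1]
logfc_temperature = [0.9, -0.2, 0.1, 1.5, -0.8]
rho = spearman_rho(logfc_resource, logfc_temperature)
print(round(rho, 3))
```

Because the statistic depends only on ranks, it is insensitive to the scale of the logFC values and captures any monotone relationship between treatments.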
RNA-seq of Acidiphilium sp. C61 culture supplemented with 10 µM...
ebi.ac.uk
Updated Feb 29, 2020
Cite
Qianqian Li (2020). RNA-seq of Acidiphilium sp. C61 culture supplemented with 10 µM 2-Phenethylamine (PEA) against controls [Dataset]. https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-8619
Explore at:
Dataset updated
Feb 29, 2020
Authors
Qianqian Li
Description
Acidiphilium sp. C61 cultures were cultivated in APPW+YE+Glucose medium with 0 µM or 10 µM PEA. RNA was extracted and library preparation was done using the NEBNext Ultra II directional RNA library prep kit for Illumina. Data were demultiplexed by the sequencing company GATC and adaptors were trimmed with Trim Galore. After trimming, reads were quality-controlled with sickle and mRNA reads were sorted with SortMeRNA. mRNA transcripts were mapped to the assembled genome of Acidiphilium sp. C61 and a read counts table was produced with featureCounts. Differential gene expression analysis was done with the edgeR package.
C3_R3_48h_IL4_Mature macrophage_mouse3
researchdata.edu.au
Updated 2020
+ more versions
Cite
Fiona Pixley; David Joyce; Julie Proudfoot; Eloise Greenland; James Steer; Michael Murrey; School of Biomedical Sciences (2020). C3_R3_48h_IL4_Mature macrophage_mouse3 [Dataset]. https://researchdata.edu.au/c3r348hil4mature-macrophagemouse3/1445927
Explore at:
Dataset updated
2020
Dataset provided by
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/)
The University of Western Australia
Authors
Fiona Pixley; David Joyce; Julie Proudfoot; Eloise Greenland; James Steer; Michael Murrey; School of Biomedical Sciences
Description
Sample type: SRA

Source name: bone marrow derived
Organism: Mus musculus
Characteristics: strain: C57BL/6; Sex: male; age: 8 to 10 weeks
Growth protocol: After 1 day in 0.6 ng/ml CSF1 in alpha+ MEM/15% FCS, non-adherent cells were incubated for two days in a fresh dish containing 12 ng/ml CSF1 in alpha+ MEM/10% FCS and then for 7 days in 120 ng/ml CSF1 in alpha+ MEM/10% FCS. Cells were incubated for a further two days in fresh alpha+ MEM/10% FCS containing 120 ng/ml CSF1 and 20 ng/ml IL4.
Extracted molecule: total RNA
Extraction protocol: mRNA was harvested using the RNeasy kit (QIAGEN) with on-column DNase treatment. 1 µg of total RNA was used for the construction of sequencing libraries. RNA libraries were prepared for sequencing using standard Ion Torrent protocols.

Library strategy: RNA-Seq
Library source: transcriptomic
Library selection: cDNA
Instrument model: Ion Torrent S5

Description: IL4_48h_M26_BMM; Wt_vs_IL4_allprobes_reads.txt; Wt_vs_IL4_allprobes_log2_RPM.txt
Data processing: Torrent Suite Software 5.10 was used for basecalling, and sequenced reads were trimmed for adaptor sequence and masked for low-complexity or low-quality sequence, returning a fastq file (raw data). Reads were then mapped to the GRCm38.p6 genome using the open source Hisat2-2.0.5 aligner. The Hisat2-generated BAM files were uploaded into SeqMonk (version 1.42) with minimum mapping quality set to 60. The edgeR platform, within SeqMonk, was used to generate lists of differential gene expression from the raw reads, as is required in analysis of negative binomial distributions. Tab-delimited text files of all genes and differentially expressed genes (at p<0.05, p<0.01 and p<0.001) showing raw reads or log2 RPM were output (processed files).
Ampliseq: Torrent Suite Software 5.10 was used for basecalling, and sequenced reads were trimmed for adaptor sequence and masked for low-complexity or low-quality sequence, returning a fastq file (raw data) of reads associated with each of the 16000 barcoded primer pairs. Reads were then mapped to the GRCm38.p6 genome using the open source Hisat2-2.0.5 aligner. The Hisat2-generated BAM files were uploaded into SeqMonk (version 1.42) with minimum mapping quality set to 60. The edgeR platform, within SeqMonk, was used to generate lists of differential gene expression from the raw reads, as is required in analysis of negative binomial distributions. Tab-delimited text files of all genes and differentially expressed genes (at p<0.05, p<0.01 and p<0.001) showing raw reads or log2 RPM were output (processed files).
Genome_build: Genome Reference Consortium mouse genome (GRCm38.p6)
Supplementary_files_format_and_content: tab-delimited text files include reads or log2 RPM for each sample showing all genes or differential expression between conditions.
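The log2 RPM values mentioned in the processed files can be illustrated with a small Python sketch. This is an assumption-laden example, not SeqMonk's implementation: it takes RPM as reads per million total reads in the sample and adds a pseudocount before the log so that zero counts stay finite. The function name, gene names and counts are hypothetical.

```python
from math import log2


def log2_rpm(counts, pseudocount=1.0):
    """Convert raw read counts for one sample to log2 reads-per-million.

    Assumption: the pseudocount is added to the RPM value before taking
    the logarithm; the exact formula used by a given tool may differ.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {
        gene: log2(n / total * 1_000_000 + pseudocount)
        for gene, n in counts.items()
    }


# Hypothetical raw counts for one sample.
sample_counts = {"gene_x": 500, "gene_y": 9500, "gene_z": 0}
values = log2_rpm(sample_counts)
for gene, value in values.items():
    print(gene, round(value, 2))
```

Scaling to a fixed total makes samples with different sequencing depths comparable, which is why the raw reads are kept for edgeR while log2 RPM is reported for inspection.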
Data from: Challenges and strategies in transcriptome assembly and...
datadryad.org
zip
Updated Aug 13, 2012
Cite
Nagarjun Vijay; Jelmer W. Poelstra; Axel Künstner; Jochen B. W. Wolf (2012). Challenges and strategies in transcriptome assembly and differential gene expression quantification. A comprehensive in silico assessment of RNA-seq experiments. [Dataset]. http://doi.org/10.5061/dryad.3t3n7
Explore at:
zip (available download formats)
Unique identifier
https://doi.org/10.5061/dryad.3t3n7
Dataset updated
Aug 13, 2012
Dataset provided by
Dryad
Authors
Nagarjun Vijay; Jelmer W. Poelstra; Axel Künstner; Jochen B. W. Wolf
Time period covered
Jul 18, 2012
Description
Transcriptome Shotgun Sequencing (RNA-seq) has been readily embraced by geneticists and molecular ecologists alike. As with all high-throughput technologies, it is critical to understand which analytic strategies are best suited and which parameters may bias the interpretation of the data. Here we use a comprehensive simulation approach to explore how various features of the transcriptome (complexity, degree of polymorphism π, alternative splicing), technological processing (sequencing error ε, library normalization) and bioinformatic workflow (de novo vs. mapping assembly, reference genome quality) impact transcriptome quality and inference of differential gene expression (DE). We find that transcriptome assembly and gene expression profiling (edgeR vs. baySeq software) work well even in the absence of a reference genome, and are robust across a broad range of parameters. We advise against library normalization, and in most situations advocate mapping assemblies to an annotated genome ...
Investigating the mechansisms associated with Folfiri-induced muscle wasting...
ebi.ac.uk
Updated May 17, 2016
Cite
Andrea Bonetto (2016). Investigating the mechansisms associated with Folfiri-induced muscle wasting in normal mice [Dataset]. https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-80473
Explore at:
Dataset updated
May 17, 2016
Authors
Andrea Bonetto
Description
Purpose of the study: the goal of this study was to use Next-generation sequencing to investigate the gene expression profile of skeletal muscle from mice exposed to chemotherapy (Folfiri) for up to 5 weeks. Methods: Standard methods were used for polyA mRNA-seq library construction, EZBead preparation and Next-Gen sequencing, based on the Life Technologies SOLiD5500xl system. Briefly, one microgram of total RNA per sample was used for library preparation. PolyA mRNA was first captured using the standard protocol of the Dynabeads® mRNA DIRECT™ Micro Kit (Life Technologies, #61021). Following the enrichment of polyA mRNA, the cDNA library was prepared and barcoded per sample using the standard protocol of the SOLiD Total RNA-seq Kit (Life Technologies, #4445374). Each barcoded library was quantified by Bioanalyzer High Sensitivity DNA chip (Agilent, #5067-4626) and pooled in equal molarity. EZBead preparation, bead library amplification, and bead enrichment were then conducted using the Life Technologies EZ Bead™ E80 System (Life Technologies, #4453095). Approximately four hundred forty million library-enriched beads per lane were deposited onto a SOLiD5500xl FlowChip (6 lanes/chip). Finally, sequencing by ligation was carried out using the standard single-read, 5’-3’ strand-specific sequencing procedure (75b-read) on a SOLiD5500xl Sequencer. The resulting 75 bp SOLiD reads were mapped to the Mus musculus mm9 reference genome using an in-house mapping pipeline that utilizes bfast-0.7.0a [67]. In brief, using our RNA-seq pipeline, low quality reads and reads mapped to rRNA/tRNAs were first discarded. The remaining reads were mapped to the reference genome mm9 and a splice-junction library, respectively; the genomic and splice-junction library mappings were merged at the end. The gene-based expression levels were calculated using bamutils from NGSUtils based on the RefSeq gene annotation of mm9.
Differential expression of genes across different treatments was determined with edgeR. Results: Gene expression profiling, performed by means of RNA-Sequencing analysis, identified a limited number of genes that were significantly modulated (False Discovery Rate < 0.05) following Folfiri administration. Interestingly, the pathway analysis showed marked down-regulation of the regulators of mitochondrial metabolism Ucp1, Cidea1 and Acot2, as well as the marker of muscle cell proliferation/pluripotency Fhl3. Further, we found increased expression of regulators of lipid metabolism and transport (such as Fabp1, Apoa1, Apob, Apoa2, Alb, Prkcz and Scd2) as well as acute phase response proteins (such as Alb, Fga and Fgb); members of PPAR signaling and markers of energy metabolism (such as Dnah5) were also reported. Conclusions: Our study represents the first gene expression profiling of muscle from mice exposed to chemotherapy. The findings from our study identified several signaling pathways that were modulated following chemotherapy administration. These results will contribute to identifying new modulators of muscle wasting associated with Folfiri administration, which might also represent putative new targets of interventions. Next-generation sequencing of RNA extracted from quadriceps muscle of CD2F1 male mice exposed to either Folfiri (a combination of 5-FU, Leucovorin and CPT-11) or vehicle alone (n=4) for up to 5 weeks.
