https://ega-archive.org/dacs/EGAC00001001974
Bulk RNA sequencing of 18 primary breast cancers from the Wu et al. (2021) study.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data repository for the scMappR manuscript:
Abstract from bioRxiv (https://www.biorxiv.org/content/10.1101/2020.08.24.265298v1.full).
RNA sequencing (RNA-seq) is widely used to identify differentially expressed genes (DEGs) and reveal biological mechanisms underlying complex biological processes. RNA-seq is often performed on heterogeneous samples and the resulting DEGs do not necessarily indicate the cell types where the differential expression occurred. While single-cell RNA-seq (scRNA-seq) methods solve this problem, technical and cost constraints currently limit its widespread use. Here we present single cell Mapper (scMappR), a method that assigns cell-type specificity scores to DEGs obtained from bulk RNA-seq by integrating cell-type expression data generated by scRNA-seq and existing deconvolution methods. After benchmarking scMappR using RNA-seq data obtained from sorted blood cells, we asked if scMappR could reveal known cell-type specific changes that occur during kidney regeneration. We found that scMappR appropriately assigned DEGs to cell-types involved in kidney regeneration, including a relatively small proportion of immune cells. While scMappR can work with any user supplied scRNA-seq data, we curated scRNA-seq expression matrices for ∼100 human and mouse tissues to facilitate its use with bulk RNA-seq data alone. Overall, scMappR is a user-friendly R package that complements traditional differential expression analysis available at CRAN.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The dataset consists of 31 bulk samples obtained from human embryonic stem cells undergoing undirected differentiation. The samples were collected at 16 time points on days 0 and 7-21 of differentiation. The file contains the raw read counts.
More details can be found in:
SPIRAL: Significant Process InfeRence ALgorithm for single cell RNA-sequencing and spatial transcriptomics
Hadas Biran, Tamar Hashimshony, Tamar Lahav, Or Efrat, Yael Mandel-Gutfreund, and Zohar Yakhini
MIT License: https://opensource.org/licenses/MIT
This folder contains the dataset and R code for analyzing bulk RNA sequencing of UVB-irradiated and control skin from males and females.
 - cts_Female.csv: gene counts of female no UVB and UVB samples
 - Female.txt: sample index (Control or UV) of Female
 - cts_male.csv: gene counts of male no UVB and UVB samples
 - Male.txt: sample index (Control or UV) of Male
 - DESeq2_Male.R: R code for analyzing male samples
 - DESeq2_Female.R: R code for analyzing female samples
 - UVSex.R: R code for comparing the gene regulation changes in Males over females.
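As a rough illustration of how a counts table and a sample index of this kind fit together, the following Python sketch computes counts per million and a per-gene log2 fold change of UV over Control. It is not the DESeq2 analysis performed by the R scripts listed above. The CSV layout (genes in rows, one column per sample), the sample names S1 to S4, the gene names, and all numbers are assumptions made up for the example.

```python
import csv
import io
import math

# Hypothetical miniature stand-ins for a counts file and a sample index.
COUNTS_CSV = """gene,S1,S2,S3,S4
GeneA,100,120,400,380
GeneB,50,60,55,45
GeneC,850,820,545,575
"""
CONDITIONS = {"S1": "Control", "S2": "Control", "S3": "UV", "S4": "UV"}


def read_counts(text):
    """Parse a genes-by-samples CSV into {gene: {sample: count}}."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["gene"]: {s: float(v) for s, v in row.items() if s != "gene"}
        for row in reader
    }


def counts_per_million(counts):
    """Scale each sample's counts so that they sum to one million."""
    samples = list(next(iter(counts.values())).keys())
    totals = {s: sum(row[s] for row in counts.values()) for s in samples}
    return {
        gene: {s: v / totals[s] * 1e6 for s, v in row.items()}
        for gene, row in counts.items()
    }


def log2_fold_change(cpm, conditions, case="UV", control="Control", pseudo=1.0):
    """Per-gene log2(mean case CPM / mean control CPM) with a pseudocount."""
    result = {}
    for gene, row in cpm.items():
        case_vals = [v for s, v in row.items() if conditions[s] == case]
        ctrl_vals = [v for s, v in row.items() if conditions[s] == control]
        case_mean = sum(case_vals) / len(case_vals)
        ctrl_mean = sum(ctrl_vals) / len(ctrl_vals)
        result[gene] = math.log2((case_mean + pseudo) / (ctrl_mean + pseudo))
    return result


lfc = log2_fold_change(counts_per_million(read_counts(COUNTS_CSV)), CONDITIONS)
for gene, value in sorted(lfc.items()):
    print(f"{gene}\t{value:+.3f}")
```

A positive value means higher normalized expression in the UV samples. Unlike DESeq2, this sketch estimates no dispersion and performs no statistical test, so it only shows the direction and size of a change.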
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
This dataset consists of bulk RNA sequencing data of MACS-separated bone marrow cells (CD34+ stem cells, GPA+ erythroblasts, CD71+ reticulocytes, ring sideroblasts and siderocytes) obtained from multiple healthy bone marrow donors and MDS-RS patients. A second minibulk dataset is included, where CD34+ and GPA+ cells were treated with cycloheximide or left untreated. The objective of this data collection was to assess several parameters on how the bone marrow of MDS-RS patients differs from that of healthy donors.
This dataset includes raw sequencing data in .fastq format, processed count matrices and associated pseudonymized metadata.
Processing: Bulk: CD34+ HSPC samples, mixed GPA+ erythroblast samples and CD71+ PB reticulocyte samples (RetPB) were isolated through MACS. RS and siderocytes were obtained through MACS+FACS. Cells were lysed in RLT (Qiagen) + 40 mM dithiothreitol (Sigma-Aldrich) and RNA extraction was performed with RNeasy Micro Kit (Qiagen) with RNase-free DNase treatment according to the manufacturer’s protocol. RNA integrity numbers (RIN) were estimated using Agilent RNA 6000 Pico Kits (Agilent Technologies, CA, USA). A minimum RIN value of 6.5 was considered adequate.
Minibulk: Minibulk RNAseq was performed for assessment of cycloheximide treatment effects in CD34+ and GPA+ cell populations. The library preparation procedure was performed using the Xpress Genomics bulk RNA-seq kit v1, automated on a SP960 liquid handler (MGI Tech). In short, the library preparation procedure denatures RNA samples in the presence of an oligo-dT primer, followed by reverse transcription of RNA with a template-switching procedure and pre-amplification of full-length cDNA for 10 PCR cycles. cDNA was subsequently tagmented using Tn5 (TDE1 Tagment DNA Enzyme; Illumina), and reactions were quenched after 10 min at 55 °C by addition of 0.2% SDS (Sigma-Aldrich). Tagmented DNA was indexed using custom dual-unique Nextera index primers in a 12-cycle PCR reaction. Indexed libraries were cleaned up using SPRI beads in 22% PEG8000 buffer and eluted in 12 µL H2O.
The dataset consists of 2 folders:
 - Bulk_Main
 - Minibulk_Cycloheximide
The folder Bulk_Main contains 67 GNU zipped fastq files, 1 tsv file, and 1 txt file. The folder Minibulk_Cycloheximide contains 2 GNU zipped fastq files, and 4 txt files.
The documentation file File_list_BulkMinibulk.txt contains a full list of the files in the dataset.
The total size of the dataset is approximately 360 GB.
https://ega-archive.org/dacs/EGAC00001002508
RNA-seq libraries were prepared using the KAPA Stranded RNA-Seq Kit with RiboErase (Kapa Biosystems, Wilmington, MA) and sequenced to a target depth of 200 million reads on the Illumina HiSeq platform (Illumina, San Diego, CA).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
DEG and GO analyses of bulk RNA sequencing from aging mouse cortex and spinal cord.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Bulk RNA-seq data (Smart-seq2; raw feature counts) of naive murine CD4+ T cells co-cultured with murine HSPCs (THSPC), with murine DCs (TDC), or with murine LSKs as the control condition, in the presence or absence of antigen (ova, ctrl).
https://ega-archive.org/dacs/EGAC00001000581
Bulk RNA-seq
https://ega-archive.org/dacs/EGAC00001001270
We profiled 43 normal human adult brain and 11 normal human fetal brain specimens by bulk RNA-seq. Raw FASTQ files are provided.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
List of RNA-Seq datasets used in this study.
This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464).
Resources in this dataset:
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format.
File Name: PBMC7_AllCells.zip
Resource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows:
 - matrix of gene counts* (matrix.mtx.gx)
 - gene names (features.tsv.gz)
 - cell IDs (barcodes.tsv.gz)
*The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, we specified to calculate non-integer count estimations, so most gene counts are actually non-integer values in this matrix but should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata.
File Name: PBMC7_AllCells_meta.csv
Resource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:
 - nCount_RNA = the number of transcripts detected in a cell
 - nFeature_RNA = the number of genes detected in a cell
 - Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells
 - prcntMito = percent mitochondrial reads in a cell
 - Scrublet = doublet probability score assigned to a cell
 - seurat_clusters = cluster ID assigned to a cell
 - PaperIDs = sample ID for a cell
 - celltypes = cell type ID assigned to a cell
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates.
File Name: PBMC7_AllCells_PCAcoord.csv
Resource Description: .csv file containing first 100 PCA coordinates for cells.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates.
File Name: PBMC7_AllCells_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for all cells.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates.
File Name: PBMC7_AllCells_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for all cells.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates.
File Name: PBMC7_CD4only_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates.
File Name: PBMC7_CD4only_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates.
File Name: PBMC7_GDonly_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates.
File Name: PBMC7_GDonly_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information.
File Name: UnfilteredGeneInfo.txt
Resource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat.
File Name: PBMC7.tar
Resource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().
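The subsetting described for the CD4 T cell resources (select cells by cluster, then re-assign published coordinates by cell barcode) can be sketched outside of R as well. The Python example below is only an illustration. The column names Loupe and seurat_clusters and the cluster numbers come from the descriptions above; the barcodes, the celltypes placeholders, and the layout and column names of the coordinate file (barcode, tSNE_1, tSNE_2) are assumptions.

```python
import csv
import io

# Clusters listed in the resource description as CD4 T cells.
CD4_CLUSTERS = {"0", "3", "4", "28"}

# Hypothetical excerpts standing in for the metadata and coordinate files.
META_CSV = """Loupe,seurat_clusters,celltypes
AAAC-1,0,typeA
AAAG-1,6,typeB
AACT-1,28,typeA
AAGT-1,12,typeC
"""
COORD_CSV = """barcode,tSNE_1,tSNE_2
AAAC-1,1.5,-2.0
AACT-1,0.25,3.75
"""


def cells_in_clusters(meta_text, clusters):
    """Return barcodes (Loupe column) of cells whose cluster ID is in `clusters`."""
    reader = csv.DictReader(io.StringIO(meta_text))
    return [row["Loupe"] for row in reader if row["seurat_clusters"] in clusters]


def attach_coordinates(barcodes, coord_text):
    """Map each selected barcode to its (x, y) coordinates, skipping missing ones."""
    reader = csv.DictReader(io.StringIO(coord_text))
    coords = {row["barcode"]: (float(row["tSNE_1"]), float(row["tSNE_2"]))
              for row in reader}
    return {bc: coords[bc] for bc in barcodes if bc in coords}


cd4_barcodes = cells_in_clusters(META_CSV, CD4_CLUSTERS)
cd4_coords = attach_coordinates(cd4_barcodes, COORD_CSV)
for barcode, (x, y) in cd4_coords.items():
    print(barcode, x, y)
```

Joining on the barcode rather than on row order keeps the coordinates attached to the right cells even if the two files are sorted differently.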
Bulk RNA sequencing from Placenta; Site B; Maternal region
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
We developed a single-cell transcriptomics pipeline for high-throughput pharmacotranscriptomic screening. We explored the transcriptional landscape of three HGSOC models (JHOS2, a representative cell line; PDC2 and PDC3, two patient-derived samples) after treating their cells for 24 hours with 45 drugs representing 13 distinct classes of mechanism of action. Our work establishes a new precision oncology framework for the study of molecular mechanisms activated by a broad array of drug responses in cancer.
.
├── 3D UMAPs/ → Interactive 3D UMAPs of cells treated with the 45 drugs used for multiplexed scRNA-seq. Related to Figure 4. Coordinates: x = UMAP 1; y = UMAP 2; z = UMAP 3. Legend: green = PDC1; blue = PDC2; red = JHOS2.
│   ├── DMSO_3D_UMAP_Dini.et.al.html → 3D UMAP of untreated cells.
│   └── drug_3D_UMAP_Dini.et.al.html → 3D UMAP of cells treated with (drug).
├── QC_plots/ → Diagnostic plots. Related to Figures 2–4.
│   ├── model_QC_violin_plot_2023.pdf → Violin plots of the QC metrics used to filter the data.
│   ├── model_col_HTO or model_row_HTO before and after filt → Heatmaps of the row or column HTO expression in each cell.
│   └── model_counts_histogram_2023.pdf → Histogram of the distribution of the total counts per cell after filtering for high-quality cells.
├── scRNAseq/ → scRNA-seq data. Related to Figures 2–4.
│   ├── AllData_subsampled_DGE_edgeR.csv.gz → Differential gene expression analyses results between treated and untreated cells via pseudobulk of aggregate subsamples, for each of the three models. Related to Figure 3.
│   └── All_vs_all_RNAclusters_DEG_signif.txt → Differential gene expression analysis results (p.adj < 0.05) of FindAllMarkers for the Leiden/RNA clusters.
├── PDCs.transcript.counts.tsv → Bulk RNA-seq count data for PDCs 1–3 processed by Kallisto. Related to Figure S6.
└── PDCs.transcript.TPM.tsv → Bulk RNA-seq TPM data for PDCs 1–3 processed by Kallisto. Related to Figure S6.
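For readers unfamiliar with how a count table and a TPM table relate, the sketch below applies the standard transcripts-per-million definition: each count is divided by the transcript's effective length, and the resulting rates are rescaled to sum to one million. It illustrates the general formula only and is not Kallisto's implementation; the transcript names, counts, and effective lengths are invented.

```python
def tpm(counts, eff_lengths):
    """Convert estimated counts to TPM using effective transcript lengths.

    TPM_i = (count_i / length_i) / sum_j(count_j / length_j) * 1e6
    """
    rates = {t: counts[t] / eff_lengths[t] for t in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("all transcripts have zero counts")
    return {t: rate / total * 1e6 for t, rate in rates.items()}


# Hypothetical transcripts with made-up counts and effective lengths.
example_counts = {"tx1": 100.0, "tx2": 300.0, "tx3": 0.0}
example_lengths = {"tx1": 1000.0, "tx2": 1500.0, "tx3": 500.0}

example_tpm = tpm(example_counts, example_lengths)
for name, value in example_tpm.items():
    print(f"{name}\t{value:.1f}")
```

Because TPM values always sum to one million within a sample, they describe relative abundance and are not a substitute for raw counts in count-based differential expression tools.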
Bulk RNA sequencing from Placenta; Site B; Maternal region
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The description of bulk RNA-seq datasets.
MIT License: https://opensource.org/licenses/MIT
unikill066/bulk-rna-seq-data-mount dataset hosted on Hugging Face and contributed by the HF Datasets community
Bulk RNA sequencing from Placenta; Site A; Maternal region
In the Rodent Research Reference Mission (RRRM-2), forty female C57BL/6NTac mice were flown on the International Space Station. To assess differences in outcomes due to age, twenty 12-week-old and twenty 29-week-old mice were flown. To directly assess spaceflight effects, half of the young and old mice (10 old, 10 young) were sacrificed on-orbit after 55-58 days (ISS Terminal, ISS-T), while the other half (10 old, 10 young) were returned live to Earth after 32 days and allowed to recover for 24 days (Live Animal Return, LAR) before sacrifice. ISS-T and LAR mice were the same age at sacrifice. Both the ISS-T and LAR animals had independent ground controls (10 mice per group housed in flight hardware in matched environmental conditions), basal controls (10 mice per group sacrificed 2 days before launch), and vivarium controls (10 mice per group housed within standard vivarium habitats). Thus RRRM-2 included a total of 160 mice. This study includes bulk RNA sequencing and spatially resolved transcriptional profiling data from cerebellums from 4 (bulk RNAseq) or 5 (spatial transcriptomics) old ISS-T flight animals, 3 old ISS-T ground control (GC) animals, 5 young ISS-T flight animals, 3 young ISS-T GC animals, 3 old LAR flight animals, 3 old LAR GC animals, 2 (bulk RNAseq) or 3 (spatial transcriptomics) young LAR flight animals, and 3 young LAR GC animals. Cerebellums from the right hemisphere were embedded and cryosectioned. Cryosections were either processed for bulk RNA sequencing or placed on gene expression arrays, stained and imaged. Imaging was followed by tissue permeabilization to release mRNA molecules from cells for capture onto the array surface. Subsequently, spatial transcriptomics libraries were prepared and sequenced.
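The group arithmetic behind the stated total can be checked with a short Python sketch. It simply enumerates the design as described (two age groups, two cohorts, four arms, 10 mice per group); the variable names and group labels are illustrative choices, not identifiers from the mission documentation.

```python
from itertools import product

AGE_GROUPS = ("young", "old")
COHORTS = ("ISS-T", "LAR")
ARMS = ("flight", "ground control", "basal control", "vivarium control")
MICE_PER_GROUP = 10

# Every combination of age, cohort, and arm is one group of 10 mice.
groups = list(product(AGE_GROUPS, COHORTS, ARMS))
total_mice = len(groups) * MICE_PER_GROUP
flown_mice = sum(MICE_PER_GROUP for _, _, arm in groups if arm == "flight")

# LAR animals: days in flight plus days of recovery before sacrifice.
LAR_DAYS_IN_FLIGHT = 32
LAR_RECOVERY_DAYS = 24
ISS_T_DAYS_RANGE = (55, 58)
lar_days_to_sacrifice = LAR_DAYS_IN_FLIGHT + LAR_RECOVERY_DAYS

print(total_mice, flown_mice, lar_days_to_sacrifice)
```

The sum of flight and recovery days for LAR animals falls inside the 55-58 day window of the ISS-T animals, which is consistent with the statement that both cohorts were the same age at sacrifice.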
https://ega-archive.org/dacs/EGAC50000000055
The current dataset represents bulk RNA-Seq gene expression profiling (with an exome library preparation kit) of muscle-invasive bladder cancer tissue samples obtained before and after platinum-based chemotherapy. 89 samples are pre-treatment transurethral resection of the bladder tumor (TUR-BT) tissue at diagnosis (baseline), 86 are post-treatment cystectomy tissue (resected tumor bulk), comprising 76 pairs of samples from the same patients. FASTQ files contain gene expression data.
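A paired pre/post design like this is usually analyzed by matching samples on a patient identifier. The Python sketch below shows one way to find matched pairs and derives the number of unpaired samples implied by the counts above. The patient IDs, the "pre"/"post" labels, and the record layout are hypothetical; the dataset's actual metadata format is not described here.

```python
from collections import defaultdict


def paired_patients(samples):
    """Return patients that have both a pre- and a post-treatment sample."""
    timepoints = defaultdict(set)
    for patient_id, timepoint in samples:
        timepoints[patient_id].add(timepoint)
    return sorted(p for p, tps in timepoints.items() if {"pre", "post"} <= tps)


# Hypothetical sample records: (patient ID, time point).
toy_samples = [
    ("P1", "pre"), ("P1", "post"),
    ("P2", "pre"),
    ("P3", "post"),
    ("P4", "pre"), ("P4", "post"),
]
toy_pairs = paired_patients(toy_samples)

# Counts stated in the description.
N_PRE, N_POST, N_PAIRS = 89, 86, 76
unpaired_pre = N_PRE - N_PAIRS    # baseline samples without a matched cystectomy
unpaired_post = N_POST - N_PAIRS  # cystectomy samples without a matched baseline

print(toy_pairs, unpaired_pre, unpaired_post)
```

Only the matched pairs support a within-patient comparison; the remaining samples can still contribute to unpaired, group-level analyses.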