36 datasets found

Extended data tables to Haering and Habermann, F1000Res, RNfuzzyApp: an R...
data.niaid.nih.gov
search.dataone.org
zip
Updated Jul 9, 2021
Cite
Bianca Habermann; Margaux Haering (2021). Extended data tables to Haering and Habermann, F1000Res, RNfuzzyApp: an R shiny RNA-seq data analysis app for visualisation, differential expression analysis, time-series clustering and enrichment analysis [Dataset]. http://doi.org/10.5061/dryad.8pk0p2nnd
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.8pk0p2nnd
Dataset updated
Jul 9, 2021
Dataset provided by
Institut de Biologie du Développement Marseille
Authors
Bianca Habermann; Margaux Haering
License
https://spdx.org/licenses/CC0-1.0.html
Description
Background

RNA-seq is a widely adopted affordable method for large scale gene expression profiling. However, user-friendly and versatile tools for wet-lab biologists to analyse RNA-seq data beyond standard analyses such as differential expression, are rare. Especially, the analysis of time-series data is difficult for wet-lab biologists lacking advanced computational training. Furthermore, most meta-analysis tools are tailored for model organisms and not easily adaptable to other species.

Results

With RNfuzzyApp, we provide a user-friendly, web-based R-shiny app for differential expression analysis, as well as time-series analysis of RNA-seq data. RNfuzzyApp offers several methods for normalization and differential expression analysis of RNA-seq data, providing easy-to-use toolboxes, interactive plots and downloadable results. For time-series analysis, RNfuzzyApp presents the first web-based, automated pipeline for soft clustering with the Mfuzz R package, including methods to aid in cluster number selection, Mfuzz loop computations, cluster overlap analysis, as well as cluster enrichments.

Conclusion

RNfuzzyApp is an intuitive, easy to use and interactive R shiny app for RNA-seq differential expression and time-series analysis, offering a rich selection of interactive plots, providing a quick overview of raw data and generating rapid analysis results. Furthermore, its orthology assignment, enrichment analysis, as well as ID conversion functions are accessible to non-model organisms.

Methods

Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt: mean values calculated from raw reads of replicates, downloaded from Gene Expression Omnibus (dataset GSE143430 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE143430).

Haering_etal_extendedDatatable_1a_Tabulamurissenis_3vs12m_DEA.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1b_Tabulamurissenis_3vs27m_DEA.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1c_Tabulamurissenis_12vs27m_DEA.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1d_Tabulamurissenis_3vs12m_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1e_Tabulamurissenis_3vs27m_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1f_Tabulamurissenis_12vs27m_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2a_Tabulamurissenis_cluster1_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2b_Tabulamurissenis_cluster2_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2c_Tabulamurissenis_cluster3_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2d_Tabulamurissenis_cluster4_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2e_Tabulamurissenis_cluster5_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3a_DmLeg_cluster1_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3b_DmLeg_cluster2_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3c_DmLeg_cluster3_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3d_DmLeg_cluster4_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3e_DmLeg_cluster5_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3f_DmLeg_cluster6_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3g_DmLeg_cluster7_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3h_DmLeg_cluster8_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3i_DmLeg_cluster9_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3j_DmLeg_cluster10_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3k_DmLeg_cluster11_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3l_DmLeg_cluster12_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
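
The "mean values calculated from raw reads of replicates" step mentioned for Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt can be illustrated with a minimal Python sketch. It is not the authors' code: the function name, the gene and sample labels, and the grouping of samples into conditions are all hypothetical, and the sketch only assumes that replicates of one condition are averaged gene by gene.

```python
from collections import defaultdict
from statistics import mean


def replicate_means(counts, sample_to_condition):
    """Average raw read counts over the replicates of each condition.

    counts: {gene: {sample: raw_count}}
    sample_to_condition: {sample: condition label}
    Returns {gene: {condition: mean_count}}.
    """
    result = {}
    for gene, per_sample in counts.items():
        grouped = defaultdict(list)
        for sample, value in per_sample.items():
            grouped[sample_to_condition[sample]].append(value)
        result[gene] = {cond: mean(vals) for cond, vals in grouped.items()}
    return result


# Hypothetical toy input: two time points with two replicates each.
toy_counts = {
    "geneA": {"t0_rep1": 10, "t0_rep2": 14, "t1_rep1": 30, "t1_rep2": 34},
    "geneB": {"t0_rep1": 0, "t0_rep2": 2, "t1_rep1": 5, "t1_rep2": 7},
}
toy_design = {"t0_rep1": "t0", "t0_rep2": "t0", "t1_rep1": "t1", "t1_rep2": "t1"}
means = replicate_means(toy_counts, toy_design)
```

A table of this shape (one row per gene, one mean per time point) is the kind of input a time-series clustering step would typically consume.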
Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset
data.niaid.nih.gov
Updated Nov 20, 2023
Cite
Hsu, Jonathan; Stoop, Allart (2023). Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10011621
Dataset updated
Nov 20, 2023
Authors
Hsu, Jonathan; Stoop, Allart
Description
Table of Contents

1. Main Description
2. File Descriptions
3. Linked Files
4. Installation and Instructions

1. Main Description

This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.". The code included in the file titled marengo_code_for_paper_jan_2023.R was used to generate the figures from the single-cell RNA sequencing data. The following libraries are required for script execution:

Seurat, scRepertoire, ggplot2, stringr, dplyr, ggridges, ggrepel, ComplexHeatmap

File Descriptions

The code can be downloaded and opened in RStudio. The "marengo_code_for_paper_jan_2023.R" file contains all the code needed to reproduce the figures in the paper. The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113). The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from the DGE analysis, also used to create the heatmap with DGE and volcano plots. The "genes_for_heatmap_fig5F.xlsx" file contains the genes included in the heatmap in figure 5F.

Linked Files

This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset contains raw FASTQ files as well as the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Provided below are descriptions of the linked datasets:

Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment. Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data. Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment. Description: This submission contains the raw sequencing or .fastq.gz files, which are tab delimited text files. Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)

Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity. Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This file is required to run the code. Submission type: Restricted Access. In order to gain access to the repository, you must contact the author.

Installation and Instructions

The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

Ensure you have R version 4.1.2 or higher for compatibility.

Although it is not essential, you can use RStudio (version 2022.12.0+353) for accessing and executing the code.

Download the "Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).

Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.

Set your working directory to where the following files are located:

marengo_code_for_paper_jan_2023.R
Install_Packages.R
Marengo_newID_March242023.rds
genes_for_heatmap_fig5F.xlsx
all_res_deg_for_heat_updated_march2023.txt

You can use the following code to set the working directory in R:

setwd(directory)

Open the file titled "Install_Packages.R" and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies in order to set up an environment where the code in "marengo_code_for_paper_jan_2023.R" can be executed.

Once the "Install_Packages.R" script has been successfully executed, restart RStudio or your IDE of choice.

Open the file "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice.

Execute the commands in "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice to generate the plots.
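
Before running the R scripts, it may help to confirm that the working directory actually holds the five files listed above. The following is a minimal Python sketch of such a pre-check; it is not part of the repository, and the function name `missing_files` and the temporary demo directory are hypothetical. Only the file names come from the instructions.

```python
import tempfile
from pathlib import Path

# File names taken from the installation instructions above.
REQUIRED_FILES = [
    "marengo_code_for_paper_jan_2023.R",
    "Install_Packages.R",
    "Marengo_newID_March242023.rds",
    "genes_for_heatmap_fig5F.xlsx",
    "all_res_deg_for_heat_updated_march2023.txt",
]


def missing_files(directory, required=REQUIRED_FILES):
    """Return the required file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in required if not (base / name).is_file()]


# Demo on a throwaway directory that contains only one of the files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Install_Packages.R").touch()
    demo_missing = missing_files(tmp)
```

An empty list from `missing_files` would indicate that the directory is ready to be used as the R working directory.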
Strand-specific RNA-seq of nine mouse tissues
ebi.ac.uk
Updated Dec 20, 2012
Cite
Jason Merkin; Christopher Burge (2012). Strand-specific RNA-seq of nine mouse tissues [Dataset]. https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-2801/
Dataset updated
Dec 20, 2012
Authors
Jason Merkin; Christopher Burge
Description
This experiment contains mouse organism part samples and strand-specific RNA-seq data from experiment E-GEOD-41637 (https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-41637/), which aimed at assessing tissue-specific transcriptome variation across mammals, with chicken used as an outgroup in evolutionary analyses. Each organism part (with the exception of heart) was sourced from animals from three different strains: C57BL/6, DBA/2J and CD1. (There is no data for heart from the C57BL/6 strain.) This data set was originally submitted to NCBI Gene Expression Omnibus under accession number GSE41637 (http://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GSE41637) and later imported to ArrayExpress as E-GEOD-41637.
Strand-specific RNA-seq of nine rhesus macaque tissues
ebi.ac.uk
Updated Dec 20, 2012
Cite
Jason Merkin; Christopher Burge (2012). Strand-specific RNA-seq of nine rhesus macaque tissues [Dataset]. https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-2799
Dataset updated
Dec 20, 2012
Authors
Jason Merkin; Christopher Burge
Description
This experiment contains rhesus macaque organism part samples and strand-specific RNA-seq data from experiment E-GEOD-41637 (https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-41637/), which aimed at assessing tissue-specific transcriptome variation across mammals, with chicken used as an outgroup in evolutionary analyses. Each organism part was sourced from three different animals as biological replicates. This data set was originally submitted to NCBI Gene Expression Omnibus under accession number GSE41637 (http://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GSE41637) and later imported to ArrayExpress as E-GEOD-41637.
Raw mouse mammary RNA-Seq data (fastq)
zenodo.org
kaggle.com
application/gzip, bin +1
Updated Nov 6, 2020
Cite
Helena Rasche (2020). Raw mouse mammary RNA-Seq data (fastq) [Dataset]. http://doi.org/10.5281/zenodo.4249516
Available download formats: bin, application/gzip, txt
Unique identifier
https://doi.org/10.5281/zenodo.4249516
Dataset updated
Nov 6, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Helena Rasche
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
12 fastq files with 1000 reads each, 4 index files for chr 1 for mm10, targets files with sample information.
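
The stated size of "1000 reads each" can be checked by counting FASTQ records. The sketch below is an illustration only: the function names are hypothetical, and it assumes the common four-lines-per-read FASTQ layout (header, sequence, separator, qualities) with no line-wrapped sequences.

```python
import gzip
import io


def count_fastq_reads(handle):
    """Count reads in a FASTQ text stream, assuming four lines per read."""
    n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4; file may be truncated")
    return n_lines // 4


def count_reads_in_file(path):
    """Open a plain or gzip-compressed FASTQ file and count its reads."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_fastq_reads(handle)


# Toy in-memory example with two made-up reads.
toy_fastq = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nTTGA\n+\nIIII\n"
)
toy_read_count = count_fastq_reads(toy_fastq)
```

Under these assumptions, calling `count_reads_in_file` on each downloaded file would be expected to return 1000 if the description is accurate.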

References

http://www.ncbi.nlm.nih.gov/pubmed/25730472

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE60450

http://hgdownload.soe.ucsc.edu/goldenPath/mm10/chromosomes/

http://www.ncbi.nlm.nih.gov/pubmed/20921232

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE18508
barechey/PredictIO.data:
zenodo.org
data.niaid.nih.gov
zip
Updated Sep 8, 2022
Cite
Yacine Bareche (2022). barechey/PredictIO.data: [Dataset]. http://doi.org/10.5281/zenodo.7044234
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.7044234
Dataset updated
Sep 8, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Yacine Bareche
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Data for our paper titled "Leveraging Big Data of Immune Checkpoint Blockade Response Identifies Novel Potential Targets".

Bareche et al., Annals of Oncology (2022); https://doi.org/10.1016/j.annonc.2022.08.084

----------------------------------------------------------------------------------------------------------------------------------------------

Background: The development of immune checkpoint blockade (ICB) has changed the way we treat various cancers. While ICB produces durable survival benefits in a number of malignancies, a large proportion of treated patients do not derive clinical benefit. Recent clinical profiling studies have shed light on molecular features and mechanisms that modulate response to ICB. Nevertheless, none of these identified molecular features were investigated in large enough cohorts to be of clinical value.

Materials and methods: A literature review was performed to identify relevant studies including clinical datasets of patients treated with ICB (anti-PD1/L1, anti-CTLA4 or the combo) and available sequencing data. Tumor mutational burden (TMB) and 37 previously reported gene expression (GE) signatures were computed with respect to the original publication. Biomarker association with ICB response (IR) and survival (PFS/OS) was investigated separately within each study and combined together for meta-analysis.

Results: We performed a comparative meta-analysis of genomic and transcriptomic biomarkers of immune-checkpoint blockade (ICB) responses in over 3,600 patients across 12 tumor types and implemented an open-source web-application (predictIO.ca) for exploration. Tumor mutation burden (TMB) and 21/37 gene signatures were predictive of ICB responses across tumor types. We next developed a de novo gene expression signature (PredictIO) from our pan-cancer analysis and demonstrated its superior predictive value over other biomarkers. To identify novel targets, we computed the T-cell dysfunction score for each gene within PredictIO and their ability to predict dual PD-1/CTLA-4 blockade in mice. Two genes, F2RL1 (encoding protease-activated receptor-2) and RBFOX2 (encoding RNA-binding motif protein 9), were concurrently associated with worse ICB clinical outcomes, T cell dysfunction in ICB-naive patients and resistance to dual PD-1/CTLA-4 blockade in preclinical models.

Conclusions: Our study highlights the potential of large-scale meta-analyses in identifying novel biomarkers and potential therapeutic targets for cancer immunotherapy.

----------------------------------------------------------------------------------------------------------------------------------------------

Data description

mouseModel:

Chen: Expression data of the TNBC mouse model study from Chen et al. (PMID:32907939)

Meskini: Expression data of the Melanoma mouse model study from Meskini et al. (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE172320)

Zemek: Expression data of the AB1 & Renca mouse model study from Zemek et al. (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE117358)

Discovery_cohort:
Expression and SNV data of the discovery cohort

Validation_cohort:
Expression and SNV data of the validation cohort
CITE-seq=scRNA-seq+Proteins: Challenge NeurIPS2021
kaggle.com
zip
Updated Jan 22, 2023
Cite
Alexander Chervov (2023). CITE-seq=scRNA-seq+Proteins: Challenge NeurIPS2021 [Dataset]. https://www.kaggle.com/datasets/alexandervc/citeseqscrnaseqproteins-challenge-neurips2021
Available download formats: zip (646191284 bytes)
Dataset updated
Jan 22, 2023
Authors
Alexander Chervov
Description
Context

Dataset from NeurIPS2021 challenge similar to Kaggle 2022 competition: https://www.kaggle.com/competitions/open-problems-multimodal "Open Problems - Multimodal Single-Cell Integration Predict how DNA, RNA & protein measurements co-vary in single cells"

CITE-seq - joint single cell RNA sequencing + single cell measurements of CD** proteins. (https://en.wikipedia.org/wiki/CITE-Seq) (For companion dataset on scRNA-seq + scATAC-seq, see: https://www.kaggle.com/datasets/alexandervc/scrnaseq-scatacseq-challenge-at-neurips-2021 )

Single-cell RNA sequencing data form a matrix: rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics
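
The cells-by-genes layout described above can be sketched in a few lines of plain Python. This is a toy illustration with made-up cell names, gene names and counts, not data from the dataset; real scRNA-seq matrices are much larger and usually stored in sparse formats.

```python
# Hypothetical labels and counts, for illustration only.
cells = ["cell_1", "cell_2", "cell_3"]
genes = ["GeneA", "GeneB"]

# Rows are cells, columns are genes.
matrix = [
    [5, 0],  # cell_1
    [2, 7],  # cell_2
    [0, 1],  # cell_3
]


def expression(matrix, cells, genes, cell, gene):
    """Look up the expression value of `gene` in `cell`."""
    return matrix[cells.index(cell)][genes.index(gene)]


def transpose(matrix):
    """Switch to the 'vice versa' orientation: rows are genes, columns are cells."""
    return [list(column) for column in zip(*matrix)]


value = expression(matrix, cells, genes, "cell_2", "GeneB")
genes_by_cells = transpose(matrix)
```

Which orientation a given file uses matters in practice, so it is worth checking the matrix dimensions against the known numbers of cells and genes before analysis.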

See tutorials: https://scanpy.readthedocs.io/en/stable/tutorials.html ("Scanpy" - main Python package to work with scRNA-seq data). Or https://satijalab.org/seurat/ "Seurat" - "R" package

Particular data

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122

Expression profiling by high throughput sequencing; genome binding/occupancy profiling by high throughput sequencing.

Summary: Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors. Half the samples were measured using the 10X Multiome Gene Expression and Chromatin Accessibility kit and half were measured using the 10X 3' Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support the Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site. In the competition, participants were tasked with challenges including modality prediction, matching profiles from different modalities, and learning a joint embedding from multiple modalities.

Overall design Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors.

Contributor(s) Burkhardt DB, Lücken MD, Lance C, Cannoodt R, Pisco AO, Krishnaswamy S, Theis FJ, Bloom JM Citation https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/158f3069a435b314a80bdcb024f8e422-Abstract-round2.html

Related datasets:

Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833
scRNA-seq + scATAC-seq Challenge at NeurIPS 2021
kaggle.com
zip
Updated Sep 16, 2022
Cite
Alexander Chervov (2022). scRNA-seq + scATAC-seq Challenge at NeurIPS 2021 [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-scatacseq-challenge-at-neurips-2021
Available download formats: zip (2917180928 bytes)
Dataset updated
Sep 16, 2022
Authors
Alexander Chervov
Description
Context

Dataset from NeurIPS2021 challenge similar to Kaggle 2022 competition: https://www.kaggle.com/competitions/open-problems-multimodal "Open Problems - Multimodal Single-Cell Integration Predict how DNA, RNA & protein measurements co-vary in single cells"

The data combine single-cell ATAC-seq (https://en.wikipedia.org/wiki/ATAC-seq#Single-cell_ATAC-seq) and single-cell RNA-seq (https://en.wikipedia.org/wiki/Single-cell_transcriptomics#Single-cell_RNA-seq) measurements.

Single-cell RNA sequencing data form a matrix: rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See tutorials: https://scanpy.readthedocs.io/en/stable/tutorials.html ("Scanpy" - main Python package to work with scRNA-seq data). Or https://satijalab.org/seurat/ "Seurat" - "R" package

(For companion dataset on CITE-seq = scRNA-seq + Proteomics, see: https://www.kaggle.com/datasets/alexandervc/citeseqscrnaseqproteins-challenge-neurips2021)

Particular data

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122

Expression profiling by high throughput sequencing; genome binding/occupancy profiling by high throughput sequencing.

Summary: Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors. Half the samples were measured using the 10X Multiome Gene Expression and Chromatin Accessibility kit and half were measured using the 10X 3' Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support the Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site. In the competition, participants were tasked with challenges including modality prediction, matching profiles from different modalities, and learning a joint embedding from multiple modalities.

Overall design Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors.

Contributor(s) Burkhardt DB, Lücken MD, Lance C, Cannoodt R, Pisco AO, Krishnaswamy S, Theis FJ, Bloom JM Citation https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/158f3069a435b314a80bdcb024f8e422-Abstract-round2.html

Related datasets:

Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833
RNA-seq of coding RNA from two inbred strains of maize, B73 and Mo17
ebi.ac.uk
Updated Oct 19, 2014
Cite
Zhenyuan Lu; Michael Regulski; Jude Kendall; Jon Reinders; Victor Llaca; Stephane Deschamps; Andrew Smith; Dan Levy; W McCombie; Scott Tingey; Antoni Rafalski; Jim Hicks; Doreen Ware; Robert Martienssen (2014). RNA-seq of coding RNA from two inbred strains of maize, B73 and Mo17 [Dataset]. https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-3028/
Dataset updated
Oct 19, 2014
Authors
Zhenyuan Lu; Michael Regulski; Jude Kendall; Jon Reinders; Victor Llaca; Stephane Deschamps; Andrew Smith; Dan Levy; W McCombie; Scott Tingey; Antoni Rafalski; Jim Hicks; Doreen Ware; Robert Martienssen
Description
This is a total RNA-seq data set of two inbred lines of maize, B73 and Mo17, extracted from experiment E-GEOD-39232 (https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-39232/). E-GEOD-39232 is a larger study which also studied the expression of small RNAs and genome-wide cytosine methylation pattern in the two cultivars using high-throughput sequencing methods. For total RNA-seq, three biological replicates were used per cultivar. E-GEOD-39232 was originally submitted to NCBI Gene Expression Omnibus under accession number GSE39232 (http://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GSE39232) and later imported to ArrayExpress as E-GEOD-39232.
Investigation of peroxisome proliferator-activated receptor delta...
catalog.data.gov
Updated Aug 9, 2024
Cite
U.S. EPA Office of Research and Development (ORD) (2024). Investigation of peroxisome proliferator-activated receptor delta (ppard)-dependent visual startle response hyperactivity in larval zebrafish exposed to structurally similar Per- and Polyfluoroalkyl Substances (PFAS) [Dataset]. https://catalog.data.gov/dataset/investigation-of-peroxisome-proliferator-activated-receptor-delta-ppard-dependent-visual-s
Dataset updated
Aug 9, 2024
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
PFHxS, PFOS, and Heptachlor RNA sequencing data. Portions of this dataset are inaccessible because: The files are too large. They can be accessed through the following means: The data can be accessed through the hyperlinks. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE190490 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE190009. Format: Raw RNA-sequencing output files (fastq files) for PFOS exposure located https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE190490. Metadata is included with the GEO submission. Raw RNA-sequencing output files (fastq files) for PFHxS exposure located https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE190009. Metadata is included with the GEO submission. This dataset is associated with the following publication: Gutsfeld, S., L. Wehmas, I. Omoyeni, N. Schweiger, D. Leuthold, P. Michaelis, X.M. Howey, S. Gaballah, N. Herold, C. Vogs, C. Wood, L. Becker-Bertotto, G. Wu, N. Kluver, W. Busch, S. Scholz, J. Schor, and T. Tal. Investigation of Peroxisome Proliferator-Activated Receptor Genes as Requirements for Visual Startle Response Hyperactivity in Larval Zebrafish Exposed to Structurally Similar Per- and Polyfluoroalkyl Substances (PFAS). ENVIRONMENTAL HEALTH PERSPECTIVES. National Institute of Environmental Health Sciences (NIEHS), Research Triangle Park, NC, USA, 132(7): 77007, (2024).
Characterizing Novel Olfactory Receptors Expressed in the Murine Renal...
zenodo.org
data.niaid.nih.gov
Updated Jan 21, 2020
Victoria L. Halperin Kuhns; Jason Sanchez; Dylan C. Sarver; Zoya Khalil; Premraj Rajkumar; Kieren A. Marr; Jennifer L. Pluznick (2020). Characterizing Novel Olfactory Receptors Expressed in the Murine Renal Cortex: Supplemental Table S2 [Dataset]. http://doi.org/10.5281/zenodo.2630934
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.2630934
Dataset updated
Jan 21, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Victoria L. Halperin Kuhns; Jason Sanchez; Dylan C. Sarver; Zoya Khalil; Premraj Rajkumar; Kieren A. Marr; Jennifer L. Pluznick
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Olfactory receptors selected for study. Olfactory receptors (ORs) selected for study based on mapped reads in at least 7 out of 8 murine renal cortex samples. Murine samples are listed as A - M. Samples A - G were fed high fat diet, while samples I - M were fed control diet. (mm10) FPKM counts for ORs selected for study based on the GRCm38/mm10 genome build using previously published OR coordinates. ORs are listed using the "Olfr" gene names, as well as the "CUFFOR" names as determined by Ibarra-Soria X et al. (mm9) FPKM counts for ORs selected for study based on the NCBI37/mm9 genome build using established coordinates. ORs listed in green were identified and cloned from kidney RNA previously.

Data accessible at NCBI GEO database, accession number GSE117249
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE117249
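
The selection rule described above (mapped reads in at least 7 of 8 samples) can be sketched as a simple filter. This is only an illustration under stated assumptions: the gene names and FPKM values are invented, and "mapped reads" is approximated here as a non-zero FPKM value, which may differ from the criterion the authors actually applied.

```python
# Hypothetical FPKM values per olfactory receptor across 8 samples.
# Gene names and numbers are illustrative, not taken from Table S2.
fpkm = {
    "Olfr_example_1": [0.8, 1.2, 0.5, 0.9, 1.1, 0.7, 0.6, 0.0],
    "Olfr_example_2": [0.0, 0.0, 0.3, 0.0, 0.2, 0.0, 0.0, 0.0],
    "Olfr_example_3": [2.1, 1.9, 2.4, 2.0, 1.8, 2.2, 2.5, 2.3],
}


def detected_in_enough_samples(values, min_samples=7):
    """Return True if the gene is non-zero in at least min_samples samples.

    Treating "non-zero FPKM" as "has mapped reads" is an assumption.
    """
    return sum(1 for v in values if v > 0) >= min_samples


# Keep only receptors that pass the 7-of-8 rule.
selected = sorted(g for g, v in fpkm.items() if detected_in_enough_samples(v))
print(selected)
```

With real data, the dictionary would be replaced by the FPKM columns of the supplemental table.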
The Parathyroid Hormone-Regulated Transcriptome in Osteocytes: Parallel...
ebi.ac.uk
Updated Nov 13, 2014
Mark Meyer; Hillary St. John; J Pike (2014). The Parathyroid Hormone-Regulated Transcriptome in Osteocytes: Parallel Actions with 1,25-Dihydroxyvitamin D3 to Oppose Gene Expression Changes During Differentiation and to Promote Mature Cell Function [RNA-Seq] [Dataset]. https://www.ebi.ac.uk/biostudies/studies/E-GEOD-62979
Explore at:
Dataset updated
Nov 13, 2014
Authors
Mark Meyer; Hillary St. John; J Pike
Description
Although localized to the mineralized matrix of bone, osteocytes are able to respond to systemic factors such as the calciotropic hormones 1,25(OH)2D3 and PTH. In the present studies, we examined the transcriptomic response to PTH in an osteocyte cell model and found that this hormone regulated an extensive panel of genes. Surprisingly, PTH uniquely modulated two cohorts of genes, one that was expressed and associated with the osteoblast to osteocyte transition and the other a cohort that was expressed only in the mature osteocyte. Interestingly, PTH's effects were largely to oppose the expression of differentiation-related genes in the former cohort, while potentiating the expression of osteocyte-specific genes in the latter cohort. A comparison of the transcriptional effects of PTH with those obtained previously with 1,25(OH)2D3 revealed a subset of genes that was strongly overlapping. While 1,25(OH)2D3 potentiated the expression of osteocyte-specific genes similar to that seen with PTH, the overlap between the two hormones was more limited. Additional experiments identified the PKA-activated phospho-CREB (pCREB) cistrome, revealing that while many of the differentiation-related PTH regulated genes were apparent targets of a PKA-mediated signaling pathway, a reduction in pCREB binding at sites associated with osteocyte-specific PTH targets appeared to involve alternative PTH activation pathways. That pCREB binding activities positioned near important hormone-regulated gene cohorts were localized to control regions of genes was reinforced by the presence of epigenetic enhancer signatures exemplified by unique modifications at histones H3 and H4. These studies suggest that both PTH and 1,25(OH)2D3 may play important and perhaps cooperative roles in limiting osteocyte differentiation from its precursors while simultaneously exerting distinct roles in regulating mature osteocyte function.
Our results provide new insight into transcription factor-associated mechanisms through which PTH and 1,25(OH)2D3 regulate a plethora of genes important to the osteoblast/osteocyte lineage. Fully differentiated IDG-SW3 cells were treated in biological triplicate with 100nM PTH for 24 hours prior to mRNA isolation and sequencing. Vehicle treated samples were previously published in GSE54783: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1323967 http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1323968 http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1323969
scRNA-seq data for A549 MCF7 K562 under drugs
kaggle.com
zip
Updated May 24, 2021
Alexander Chervov (2021). scRNA-seq data for A549 MCF7 K562 under drugs [Dataset]. https://www.kaggle.com/alexandervc/scrnaseq-exposed-to-multiple-compounds
Explore at:
Available download formats: zip (2234283473 bytes)
Dataset updated
May 24, 2021
Authors
Alexander Chervov
Description
Remark: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

Data and Context

Data - results of single cell RNA sequencing, i.e. rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics
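
Because the matrix may be stored as cells by genes or genes by cells, it can help to check the orientation before any analysis. The following is a minimal sketch using a toy matrix; the cell names, gene names, counts, and helper functions are all illustrative assumptions and are not part of the dataset.

```python
# Toy cells-by-genes count matrix; all names and values are illustrative.
cells = ["cell_1", "cell_2", "cell_3"]
genes = ["gene_a", "gene_b"]
counts = [
    [5, 0],  # cell_1
    [2, 3],  # cell_2
    [0, 7],  # cell_3
]


def transpose(matrix):
    """Swap rows and columns (cells x genes <-> genes x cells)."""
    return [list(col) for col in zip(*matrix)]


def looks_like_cells_by_genes(matrix, n_cells, n_genes):
    """Check the matrix shape against the known numbers of cells and genes."""
    return len(matrix) == n_cells and all(len(row) == n_genes for row in matrix)


genes_by_cells = transpose(counts)

# Row sums of the cells-by-genes matrix give the total counts per cell;
# row sums of the transposed matrix give the total counts per gene.
counts_per_cell = [sum(row) for row in counts]
counts_per_gene = [sum(row) for row in genes_by_cells]
print(counts_per_cell, counts_per_gene)
```

For real files a dedicated package would normally handle loading; the shape check is the part worth carrying over.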

Particular data:

Data - scRNA expression for several cell lines treated with drugs at different doses and durations. These are subsets of the datasets at https://www.kaggle.com/alexandervc/singlecell-rnaseq-exposed-to-multiple-compounds; we extracted some parts, such as zero dose or specific cell lines, because the main dataset there, sciPlex3, is very large (about 700K cells) and is hard even to load: it crashes with 16 GB of memory.

The data are from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE139944
Status: Public on Dec 05, 2019
Title: Massively multiplex chemical transcriptomics at single cell resolution
Organisms: Homo sapiens; Mus musculus
Experiment type: Expression profiling by high throughput sequencing
Summary: Single-cell RNA-seq libraries were generated using two and three level single-cell combinatorial indexing RNA sequencing (sci-RNA-seq) of untreated or small molecule inhibitor exposed HEK293T, NIH3T3, A549, MCF7 and K562 cells. Different cells and different treatment were hashed and pooled prior to sci-RNA-seq using a nuclear barcoding strategy. This nuclear barcoding strategy relies on fixation of barcode containing well-specific oligos that are specific to a given cell type, replicate or treatment condition.

The corresponding paper is here: https://pubmed.ncbi.nlm.nih.gov/31806696/ Science. 2020 Jan 3;367(6473):45-51 "Massively multiplex chemical transcriptomics at single-cell resolution" Sanjay R Srivatsan, ... , Cole Trapnell

The authors split the data into 4 subdatasets - see sciPlex1, sciPlex2, sciPlex3, sciPlex4 in the filenames. The main dataset is sciPlex3, which contains about 600K cells.

The data split into small parts, each of which can be easily loaded into memory, can be found at https://www.kaggle.com/alexandervc/scrnaseq-exposed-to-multiple-compounds

Related datasets:

Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

A collection of some bioinformatics related resources on kaggle: https://www.kaggle.com/general/203136

Inspiration

Single cell RNA sequencing is an important technology in modern biology, see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages 273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12
Screening an alternative flame retardant using biological and transcriptomic...
catalog.data.gov
datasets.ai
Updated Nov 12, 2020
U.S. EPA Office of Research and Development (ORD) (2020). Screening an alternative flame retardant using biological and transcriptomic endpoints in fish embryos [Dataset]. https://catalog.data.gov/dataset/screening-an-alternative-flame-retardant-using-biological-and-transcriptomic-endpoints-in-
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
GEO accession information for omics RNA-seq data. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE116393. Format: GEO accession information for omics RNA-seq data. This dataset is associated with the following publication: Huang, W., D. Bencic, R. Flick, D. Nacci, B. Clark, L. Burkhard, T. Lahren, and A. Biales. Characterization of the Fundulus heteroclitus embryo transcriptional response and development of a gene expression-based fingerprint of exposure for the alternative flame retardant, TBPH (bis (2-ethylhexyl)-tetrabromophthalate). ENVIRONMENTAL POLLUTION. Elsevier Science Ltd, New York, NY, USA, 247: 696-705, (2019).
Targeting CDK12 disrupts estrogen-receptor chromatin recruitment and ER-MED1...
figshare.com
xlsx
Updated Nov 16, 2025
Daniela Ottaviani; Mihaela Ola; Damir Varešlija; Leonie Young (2025). Targeting CDK12 disrupts estrogen-receptor chromatin recruitment and ER-MED1 transcription in advanced ER+ breast cancer (RNA-seq: siCDK12 FPKM normalised counts for all samples) [Dataset]. http://doi.org/10.6084/m9.figshare.30618569.v3
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.30618569.v3
Dataset updated
Nov 16, 2025
Dataset provided by
Figshare (http://figshare.com/)
Authors
Daniela Ottaviani; Mihaela Ola; Damir Varešlija; Leonie Young
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We carried out RNA-seq profiling of LY2 cells transfected with siRNA targeting CDK12 (siCDK12) and non-targeting control (siCtrl) for 48 hours. Raw RNA-seq data files associated with this project are made available through GEO at: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE309900
Supplemental Table S6
figshare.com
xlsx
Updated May 29, 2025
Snezana Kojic (2025). Supplemental Table S6 [Dataset]. http://doi.org/10.6084/m9.figshare.28942037.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.28942037.v1
Dataset updated
May 29, 2025
Dataset provided by
Figshare (http://figshare.com/)
Authors
Snezana Kojic
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
RNA-seq was performed using RNA isolated from non-injured skeletal muscle and skeletal muscle at 5 days post-injury (dpi) of wt and ankrd1a mutant fish (four fish per group). Differential expression was calculated using a multi-factorial statistical analysis based on a negative binomial model that used a generalized linear model approach computed by edgeR (RRID:SCR_012802) from raw counts in each comparison (16 samples in 4 different groups). Gene ontology analysis was performed using clusterProfiler (RRID:SCR_016884) v4.12.0 and org.Dr.eg.db as a reference database. The complete RNA-seq data have been deposited in NCBI's Gene Expression Omnibus (RRID:SCR_005012) and are accessible through GEO Series accession number GSE277480 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE277480).
scRNA-seq B cells Nature Immunology 2018 GSE115795
kaggle.com
zip
Updated May 8, 2022
Alexander Chervov (2022). scRNA-seq B cells Nature Immunology 2018 GSE115795 [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-b-cells-nature-immunology-2018-gse115795
Explore at:
Available download formats: zip (13784841 bytes)
Dataset updated
May 8, 2022
Authors
Alexander Chervov
Description
Remark: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

Data and Context

Data - results of single cell RNA sequencing, i.e. rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics See tutorials: https://scanpy.readthedocs.io/en/stable/tutorials.html ("Scanpy" - main Python package to work with scRNA-seq data), or https://satijalab.org/seurat/ ("Seurat" - R package).

Particular data: https://pubmed.ncbi.nlm.nih.gov/30104629/ Nat Immunol. 2018 Sep;19(9):1013-1024. doi: 10.1038/s41590-018-0181-4. Epub 2018 Aug 13. "Human germinal center transcriptional programs are de-synchronized in B cell lymphoma" Pierre Milpied, Iñaki Cervera-Marzal, Marie-Laure Mollichella, Bruno Tesson, Gabriel Brisou, Alexandra Traverse-Glehen, Gilles Salles, Lionel Spinelli, Bertrand Nadel

Data in two variants: 1) Downloaded from GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM3190075 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM3190076 2) Downloaded from PanglaoDB: https://panglaodb.se/view_data.php?sra=SRA721637&srs=SRS3416994

Related datasets:

Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

Inspiration

Single cell RNA sequencing is an important technology in modern biology, see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages 273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12
scRNA-seq "Tabula muris" - mouse, 85 000+ cells
kaggle.com
zip
Updated Feb 3, 2022
Alexander Chervov (2022). scRNA-seq "Tabula muris" - mouse, 85 000+ cells [Dataset]. https://www.kaggle.com/alexandervc/scrnaseq-tabula-muris-mouse-85-000-cells
Explore at:
Available download formats: zip (1488280068 bytes)
Dataset updated
Feb 3, 2022
Authors
Alexander Chervov
Description
Remark: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

Data and Context

Data - results of single cell RNA sequencing, i.e. rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

Particular data: "Tabula Muris" project https://tabula-muris.ds.czbiohub.org/ Tabula Muris is a compendium of single cell transcriptome data from the model organism Mus musculus, containing nearly 100,000 cells from 20 organs and tissues. The data allow for direct and controlled comparison of gene expression in cell types shared between tissues, such as immune cells from distinct anatomical locations. They also allow for a comparison of two distinct technical approaches:

microfluidic droplet-based 3’-end counting: provides a survey of thousands of cells per organ at relatively low coverage
FACS-based full length transcript analysis: provides higher sensitivity and coverage.

We hope this rich collection of annotated cells will be a useful resource for:

Defining gene expression in previously poorly-characterized cell populations.
Validating findings in future targeted single-cell studies.
Developing methods for integrating datasets (e.g. between the FACS and droplet experiments), characterizing batch effects, and quantifying the variation of gene expression in many cell types between organs and animals.

The peer reviewed article describing the analysis and findings is available in Nature: https://www.nature.com/articles/s41586-018-0590-4 Nature volume 562, pages 367–372 (2018)

GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE109774

See also tutorials:

Course at the Sanger Institute: https://scrnaseq-course.cog.sanger.ac.uk/website/tabula-muris.html

Course at CZ-hub: https://chanzuckerberg.github.io/scRNA-python-workshop/intro/about

On kaggle - copies of the notebooks and data from the course above https://www.kaggle.com/aayush9753/singlecell-rnaseq-data-from-mouse-brain

Inspiration

Single cell RNA sequencing is an important technology in modern biology, see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x
De novo Trinity transcript assembly of Atlantic cod
figshare.com
datasetcatalog.nlm.nih.gov
txt
Updated Jun 2, 2023
Xiaokang Zhang; Tomasz Furmanek; Inge Jonassen; Anders Goksøyr (2023). De novo Trinity transcript assembly of Atlantic cod [Dataset]. http://doi.org/10.6084/m9.figshare.13067324.v2
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.13067324.v2
Dataset updated
Jun 2, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Xiaokang Zhang; Tomasz Furmanek; Inge Jonassen; Anders Goksøyr
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The reference is based on a de novo Trinity [1,2] assembly because the official gene models versions did not contain full length sequences for all genes. The assembly consists of sequences from the following RNA-Seq data:

Gadus morhua Transcriptome or Gene expression
NCBI project: PRJNA277848 https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA277848
Developmental stages: 10-dph, 20-dph, 30-dph, 45-dph, 60-dph, 90-dph
SRR2045416 brain, SRR2045417 gills, SRR2045418 heart, SRR2045419 muscle, SRR2045420 liver, SRR2045421 kidney, SRR2045422 bones, SRR2045423 intestine, SRR2045425 embryo, SRR2045415 ovary

Three cod liver samples:
GEO accession: GSE106968 [3] https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE106968
Samples: dcod_12_S3, dcod_1_S1, dcod_25_S25

Reads were assembled using Trinity through the Agalma pipeline version 0.5.0 [4]. The transcript assemblies from different stages and tissue samples were mapped to both cod genomes (gadMor 1 and 2) [5,6]. Transcripts were mapped to both genomes as each genome is missing some genes.

1 Grabherr, M.G. et al. (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29, 644–652
2 Haas, B.J. et al. (2013) De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512
3 Yadetie, F. et al. (2018) RNA-Seq analysis of transcriptome responses in Atlantic cod (Gadus morhua) precision-cut liver slices exposed to benzo[a]pyrene and 17α-ethynylestradiol. Aquat. Toxicol. 201, 174–186
4 Dunn, C.W. et al. (2013) Agalma: an automated phylogenomics workflow. BMC Bioinformatics 14, 330
5 Star, B. et al. (2011) The genome sequence of Atlantic cod reveals a unique immune system. Nature 477, 207–210
6 Tørresen, O.K. et al. (2017) An improved genome assembly uncovers prolific tandem repeats in Atlantic cod. BMC Genomics 18, 95
Data from: Harp: Platform Independent Deconvolution Tool
zenodo.org
bin, txt
Updated Feb 26, 2025
Zahra Nozari; Paul Hüttl; Jakob Simeth; Marian Schön; James A. Hutchinson; Rainer Spang (2025). Harp: Platform Independent Deconvolution Tool [Dataset]. http://doi.org/10.5281/zenodo.14929934
Explore at:
Available download formats: txt, bin
Unique identifier
https://doi.org/10.5281/zenodo.14929934
Dataset updated
Feb 26, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Zahra Nozari; Paul Hüttl; Jakob Simeth; Marian Schön; James A. Hutchinson; Rainer Spang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Feb 26, 2025
Description
Harp is a tool that estimates reference profiles and cell compositions for deconvolution of bulk transcriptomic data.

To evaluate the performance of Harp against other deconvolution tools, we employed real bulk expression data (RNA-seq and microarray), along with their corresponding cell compositions from flow cytometry experiments, as well as cell expression profiles measured through sorted RNA-seq and microarray technology.

These datasets contain combined processed RNA-seq, flow cytometry and microarray expression data that were used in the Harplication package, which applies the Harp algorithm alongside other deconvolution tools.

The original datasets are derived from the following studies:

Bulk RNA-seq expression with paired flow cytometry from Zimmermann et al., 2016. The datasets were received via SDY67.

Sorted RNA-seq expression data from Monaco et al., 2019, available on NCBI under GEO accession number GSE107011.

Microarray gene expression data from Newman et al., 2015, available on NCBI under GEO accession number GSE65133.

Flow cytometry data from Newman et al., 2015, was received from the analysis of Vallania et al., 2018, also available on GSE65133.

Microarray based reference from Newman et al., 2015, available on CIBERSORTx.

This project has received funding from

The European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 860003.

Bundesministerium für Bildung und Forschung (BMBF, German Federal Ministry of Education and Research) [031L0173].

Bianca Habermann; Margaux Haering (2021). Extended data tables to Haering and Habermann, F1000Res, RNfuzzyApp: an R shiny RNA-seq data analysis app for visualisation, differential expression analysis, time-series clustering and enrichment analysis [Dataset]. http://doi.org/10.5061/dryad.8pk0p2nnd

Extended data tables to Haering and Habermann, F1000Res, RNfuzzyApp: an R shiny RNA-seq data analysis app for visualisation, differential expression analysis, time-series clustering and enrichment analysis

Explore at:

Available download formats: zip

Unique identifier

https://doi.org/10.5061/dryad.8pk0p2nnd

Dataset updated

Jul 9, 2021

Dataset provided by

Institut de Biologie du Développement Marseille

Authors

Bianca Habermann; Margaux Haering

License

https://spdx.org/licenses/CC0-1.0.html

Description

Methods

Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt: mean values calculated from raw reads of replicates, downloaded from Gene Expression Omnibus (dataset GSE143430 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE143430).
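
Averaging raw reads over replicates, as described for the file above, might look like the following minimal sketch. The gene names, counts, and the "<condition>_rep<N>" sample naming convention are assumptions made for illustration; they are not taken from GSE143430, and the authors' actual procedure may differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw read counts; sample names follow an assumed
# "<condition>_rep<N>" convention that is not taken from GSE143430.
raw_counts = {
    "gene_a": {"t0_rep1": 10, "t0_rep2": 14, "t24_rep1": 30, "t24_rep2": 34},
    "gene_b": {"t0_rep1": 0, "t0_rep2": 2, "t24_rep1": 5, "t24_rep2": 7},
}


def replicate_means(sample_counts):
    """Group samples by condition and average the counts of their replicates."""
    grouped = defaultdict(list)
    for sample, count in sample_counts.items():
        # Everything before the final "_rep" is treated as the condition name.
        condition = sample.rsplit("_rep", 1)[0]
        grouped[condition].append(count)
    return {condition: mean(values) for condition, values in grouped.items()}


# One row of per-condition means for every gene.
mean_table = {gene: replicate_means(samples) for gene, samples in raw_counts.items()}
print(mean_table)
```

The resulting table has one column per condition (for example, per time point), which is the kind of layout a time-series clustering step would take as input.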

Haering_etal_extendedDatatable_1a_Tabulamurissenis_3vs12m_DEA.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1b_Tabulamurissenis_3vs27m_DEA.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1c_Tabulamurissenis_12vs27m_DEA.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1d_Tabulamurissenis_3vs12m_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1e_Tabulamurissenis_3vs27m_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1f_Tabulamurissenis_12vs27m_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2a_Tabulamurissenis_cluster1_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2b_Tabulamurissenis_cluster2_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2c_Tabulamurissenis_cluster3_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2d_Tabulamurissenis_cluster4_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2e_Tabulamurissenis_cluster5_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3a_DmLeg_cluster1_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3b_DmLeg_cluster2_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3c_DmLeg_cluster3_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3d_DmLeg_cluster4_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3e_DmLeg_cluster5_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3f_DmLeg_cluster6_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3g_DmLeg_cluster7_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3h_DmLeg_cluster8_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3i_DmLeg_cluster9_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3j_DmLeg_cluster10_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3k_DmLeg_cluster11_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3l_DmLeg_cluster12_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
