29 datasets found

Test Data for Galaxy tutorial "Batch Correction and Integration" - Seurat...
ordo.open.ac.uk
bin
Updated Apr 28, 2025
Cite
Marisa Loach (2025). Test Data for Galaxy tutorial "Batch Correction and Integration" - Seurat version [Dataset]. http://doi.org/10.5281/zenodo.14713816
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.14713816
Dataset updated
Apr 28, 2025
Dataset provided by
The Open University
Authors
Marisa Loach
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data is used for the Seurat version of the batch correction and integration tutorial on the Galaxy Training Network. The input data was provided by Seurat in the 'Integrative Analysis in Seurat v5' tutorial. The input dataset provided here has been filtered to include only cells for which nFeature_RNA > 1000. The other datasets were produced on Galaxy. The original dataset was published as: Ding, J., Adiconis, X., Simmons, S.K. et al. Systematic comparison of single-cell and single-nucleus RNA-sequencing methods. Nat Biotechnol 38, 737–746 (2020). https://doi.org/10.1038/s41587-020-0465-8.
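The filter mentioned above (keeping only cells with nFeature_RNA > 1000) can be illustrated with a minimal Python sketch. In Seurat, nFeature_RNA is the number of genes detected in a cell. The cell names, counts, threshold of 2 and helper names below are hypothetical and chosen only to keep the example small; this is not the code used to produce the dataset.

```python
def n_features(cell_counts):
    """Number of genes with at least one count in this cell."""
    return sum(1 for c in cell_counts if c > 0)

def filter_cells(counts_by_cell, min_features):
    """Keep cells whose detected-gene count is strictly greater than min_features."""
    return {cell: counts for cell, counts in counts_by_cell.items()
            if n_features(counts) > min_features}

# Toy counts (hypothetical): one entry per cell, one value per gene.
counts_by_cell = {
    "cell_a": [0, 3, 1, 0, 2],   # 3 genes detected
    "cell_b": [1, 0, 0, 0, 0],   # 1 gene detected
    "cell_c": [4, 1, 2, 7, 1],   # 5 genes detected
}

# The dataset description uses a threshold of 1000; 2 is used here for the toy data.
kept = filter_cells(counts_by_cell, min_features=2)
print(sorted(kept))
```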
Test Data for Galaxy tutorial "Clustering 3k PBMCs with Seurat" -...
ordo.open.ac.uk
bin
Updated Nov 14, 2024
Cite
Marisa Loach (2024). Test Data for Galaxy tutorial "Clustering 3k PBMCs with Seurat" - SCTransform workflow [Dataset]. http://doi.org/10.5281/zenodo.14013637
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.14013637
Dataset updated
Nov 14, 2024
Dataset provided by
The Open University
Authors
Marisa Loach
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Test Data for Galaxy tutorial "Clustering 3k PBMCs with Seurat" - SCTransform workflow
scRNA-seq + scATAC-seq Challenge at NeurIPS 2021
kaggle.com
zip
Updated Sep 16, 2022
Cite
Alexander Chervov (2022). scRNA-seq + scATAC-seq Challenge at NeurIPS 2021 [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-scatacseq-challenge-at-neurips-2021
Available download formats: zip (2917180928 bytes)
Dataset updated
Sep 16, 2022
Authors
Alexander Chervov
Description
Context

Dataset from the NeurIPS 2021 challenge, which is similar to the Kaggle 2022 competition "Open Problems - Multimodal Single-Cell Integration: Predict how DNA, RNA & protein measurements co-vary in single cells": https://www.kaggle.com/competitions/open-problems-multimodal

The data comprise single-cell ATAC-seq measurements (https://en.wikipedia.org/wiki/ATAC-seq#Single-cell_ATAC-seq) and single-cell RNA-seq measurements (https://en.wikipedia.org/wiki/Single-cell_transcriptomics#Single-cell_RNA-seq).

The single-cell RNA sequencing data form a matrix in which rows correspond to cells and columns to genes (or vice versa). Each matrix value shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See the tutorials for Scanpy, the main Python package for working with scRNA-seq data (https://scanpy.readthedocs.io/en/stable/tutorials.html), or for Seurat, an R package (https://satijalab.org/seurat/).

(For companion dataset on CITE-seq = scRNA-seq + Proteomics, see: https://www.kaggle.com/datasets/alexandervc/citeseqscrnaseqproteins-challenge-neurips2021)

Particular data

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122

Expression profiling by high throughput sequencing; Genome binding/occupancy profiling by high throughput sequencing

Summary: Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors. Half the samples were measured using the 10X Multiome Gene Expression and Chromatin Accessibility kit and half were measured using the 10X 3' Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support the Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites and others at a single site. In the competition, participants were tasked with challenges including modality prediction, matching profiles from different modalities, and learning a joint embedding from multiple modalities.

Overall design: Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors.

Contributor(s): Burkhardt DB, Lücken MD, Lance C, Cannoodt R, Pisco AO, Krishnaswamy S, Theis FJ, Bloom JM
Citation: https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/158f3069a435b314a80bdcb024f8e422-Abstract-round2.html

Related datasets:

Other single-cell RNA-seq datasets can be found on Kaggle: see https://www.kaggle.com/alexandervc/datasets or search Kaggle for "scRNA-seq".

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

A Google Scholar search for "challenges in single cell rna sequencing" (https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart) returns many interesting and highly cited articles, for example:

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833
scMetabolism - pbmc_demo.rda
figshare.com
bin
Updated Jan 31, 2021
Cite
Yingcheng Wu (2021). scMetabolism - pbmc_demo.rda [Dataset]. http://doi.org/10.6084/m9.figshare.13670038.v1
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.13670038.v1
Dataset updated
Jan 31, 2021
Dataset provided by
Figshare (http://figshare.com/)
Authors
Yingcheng Wu
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The demo datasets for scMetabolism. The demo data is the Peripheral Blood Mononuclear Cells (PBMC) dataset from the 10X Genomics open access datasets (~2,700 single cells, also used by the Seurat tutorial).
Data from: Harnessing single cell RNA sequencing to identify dendritic cell...
zenodo.org
tar
Updated Dec 31, 2022
Cite
Ammar Sabir Cheema; Kaibo Duan; Marc Dalod; Thien-Phong Vu Manh (2022). Harnessing single cell RNA sequencing to identify dendritic cell types, characterize their biological states and infer their activation trajectory [Dataset]. http://doi.org/10.5281/zenodo.5385611
Available download formats: tar
Unique identifier
https://doi.org/10.5281/zenodo.5385611
Dataset updated
Dec 31, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Ammar Sabir Cheema; Kaibo Duan; Marc Dalod; Thien-Phong Vu Manh
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Summary: Dendritic cells (DCs) orchestrate innate and adaptive immunity, by translating the sensing of distinct danger signals into the induction of different effector lymphocyte responses, to induce different defense mechanisms suited to face distinct types of threats. Hence, DCs are very plastic, which results from two key characteristics. First, DCs encompass distinct cell types specialized in different functions. Second, each DC type can undergo different activation states, fine-tuning its functions depending on its tissue microenvironment and the pathophysiological context, by adapting the output signals it delivers to the input signals it receives. Hence, to better understand DC biology and harness it in the clinic, we must determine which combinations of DC types and activation states mediate which functions, and how.
To decipher the nature, functions and regulation of DC types and their physiological activation states, one of the methods that can be harnessed most successfully is ex vivo single cell RNA sequencing (scRNAseq). However, for new users of this approach, determining which analytics strategy and computational tools to choose can be quite challenging, considering the rapid evolution and broad burgeoning of the field. In addition, awareness must be raised on the need for specific, robust and tractable strategies to annotate cells for cell type identity and activation states. It is also important to emphasize the necessity of examining whether similar cell activation trajectories are inferred by using different, complementary methods. In this chapter, we take these issues into account for providing a pipeline for scRNAseq analysis and illustrating it with a tutorial reanalyzing a public dataset of mononuclear phagocytes isolated from the lungs of naïve or tumor-bearing mice. We describe this pipeline step-by-step, including data quality controls, dimensionality reduction, cell clustering, cell cluster annotation, inference of the cell activation trajectories and investigation of the underpinning molecular regulation. It is accompanied with a more complete tutorial on Github. We anticipate that this method will be helpful for both wet lab and bioinformatics researchers interested in harnessing scRNAseq data for deciphering the biology of DCs or other cell types, and that it will contribute to establishing high standards in the field.

Data:

MDAlab_cDC1_maturation.tar : Docker image used for the analysis
scRNA-seq +CRISPR=Perturb-seq.Norman.SelectedPart
kaggle.com
zip
Updated Jul 20, 2022
Cite
Alexander Chervov (2022). scRNA-seq +CRISPR=Perturb-seq.Norman.SelectedPart [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-crisprperturbseqnormanselectedpart
Available download formats: zip (158260526 bytes)
Dataset updated
Jul 20, 2022
Authors
Alexander Chervov
Description
Remark 0: For cell cycle analysis, see "Computational challenges of cell cycle analysis using single cell transcriptomics" by Alexander Chervov and Andrei Zinovyev, https://arxiv.org/abs/2208.05229

Remark 1:

The full dataset is at https://www.kaggle.com/datasets/alexandervc/scrnaseq-crisprperturbseq-normanweissman but it is huge and loading it can exhaust memory, so cropped pieces are provided here as a starting point.

Remark 2:

dataset used in: "GEARS: Predicting transcriptional outcomes of novel multi-gene perturbations" Yusuf Roohani, Kexin Huang, Jure Leskovec https://www.biorxiv.org/content/10.1101/2022.07.12.499735v1 https://twitter.com/yusufroohani/status/1547965695744360448 Accepted in ICLR

Data and Context

The data are the results of single-cell RNA sequencing: rows correspond to cells and columns to genes (or vice versa). Each matrix value shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See the tutorials for Scanpy, the main Python package for working with scRNA-seq data (https://scanpy.readthedocs.io/en/stable/tutorials.html), or for Seurat, an R package (https://satijalab.org/seurat/).

Particular data - Perturb-seq: Single-cell, pooled CRISPR screening experiment comparing the transcriptional effects of overexpressing genes alone or in combination

Paper: Norman TM, Horlbeck MA, Replogle JM, Ge AY et al. Exploring genetic interaction manifolds constructed from rich single-cell phenotypes. Science 2019 Aug 23;365(6455):786-793. PMID: 31395745 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6746554/

Data: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE133344

Related datasets:

Other single-cell RNA-seq datasets can be found on Kaggle: see https://www.kaggle.com/alexandervc/datasets or search Kaggle for "scRNA-seq".

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

A Google Scholar search for "challenges in single cell rna sequencing" (https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart) returns many interesting and highly cited articles, for example:

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article, 07 January 2019. Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg. Nature Reviews Genetics volume 20, pages 273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12
Immcantation 10x Tutorial Data
data.niaid.nih.gov
Updated Oct 20, 2023
Cite
Gabernet, Gisela; Meng, Hailong; Jensen, Cole (2023). Immcantation 10x Tutorial Data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8179845
Dataset updated
Oct 20, 2023
Authors
Gabernet, Gisela; Meng, Hailong; Jensen, Cole
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Necessary datasets to run the Immcantation 10x Tutorial. Below is the description of the files in the data set.

BCR_data_sample1.tsv: data corresponding to the first sample (sample 1) of the two samples analyzed in the 10x tutorial. This is the sample used to show the Change-O steps.

filtered_contig_annotations.csv: filtered contig annotations file for sample 1, output of cellranger vdj.

filtered_contig.fasta: sequence fasta file for sample 1, output of cellranger vdj.

BCR_data.tsv: AIRR rearrangement file containing the data for both samples 1 and 2 used in the 10x tutorial.

BCR.data_08112023.rds: R dataframe object containing the single-cell BCR sequencing data for both samples 1 and 2 used in the 10x tutorial.

GEX.data_08112023.rds: Seurat object containing the single-cell gene expression data used in the 10x tutorial.
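
As a rough illustration of how an AIRR rearrangement TSV such as BCR_data.tsv might be inspected, the sketch below counts productive rearrangements per V gene using only the Python standard library. The column names (sequence_id, cell_id, v_call, j_call, productive) follow the AIRR Rearrangement schema, where booleans are written as T/F, but the rows are made up and the helper name is hypothetical. Whether the tutorial file contains exactly these columns is an assumption.

```python
import csv
import io
from collections import Counter

# Made-up rows; only the column names follow the AIRR Rearrangement schema.
sample_tsv = (
    "sequence_id\tcell_id\tv_call\tj_call\tproductive\n"
    "s1\tAAAC-1\tIGHV1-2*01\tIGHJ4*02\tT\n"
    "s2\tAAAC-1\tIGKV3-20*01\tIGKJ1*01\tT\n"
    "s3\tAAAG-1\tIGHV3-23*01\tIGHJ6*02\tF\n"
    "s4\tAAAT-1\tIGHV1-2*01\tIGHJ4*02\tT\n"
)

def productive_v_gene_counts(handle):
    """Count productive rearrangements per V gene (allele suffix dropped)."""
    counts = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["productive"] == "T":
            gene = row["v_call"].split("*")[0]  # "IGHV1-2*01" -> "IGHV1-2"
            counts[gene] += 1
    return counts

counts = productive_v_gene_counts(io.StringIO(sample_tsv))
print(counts.most_common())
```

To run the same function on a downloaded file, one could pass an open file handle instead, for example `open("BCR_data.tsv", newline="")`, provided the file has the columns assumed here.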
Data from: Evolution of pallium, hippocampus, and cortical cell types...
edmond.mpg.de
zip
Updated Dec 18, 2025
Cite
Tracy Yamawaki; Maria Antonietta Tosches; Georgi Tushev; Robert K. Naumann; Ariel Jacobi; Gilles Laurent (2025). Evolution of pallium, hippocampus, and cortical cell types revealed by single-cell transcriptomics in reptiles [Dataset]. http://doi.org/10.17617/3.OODZLY
Available download formats: zip (5140410567), zip (2169610), zip (5113858203), zip (1265321033)
Unique identifier
https://doi.org/10.17617/3.OODZLY
Dataset updated
Dec 18, 2025
Dataset provided by
Edmond
Authors
Tracy Yamawaki; Maria Antonietta Tosches; Georgi Tushev; Robert K. Naumann; Ariel Jacobi; Gilles Laurent
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Single-cell RNA-seq datasets from turtle and lizard samples, provided as Seurat objects in both .Robj and .h5seurat formats. Each dataset includes the raw gene–cell count matrix, cell and gene metadata, normalized and batch-corrected data, PCA and t-SNE embeddings, and cluster annotations. For more details on Seurat usage and data structures, refer to the Seurat tutorials. The original objects were created using Seurat version 1.4 (October 2016) and have since been updated for compatibility with Seurat version 2.3.4 (.Robj files) and Seurat version 5.3.0 (.h5seurat files).

ANNOTATION FILES

Genome annotation
RefFlat files with the extended annotations (extension based on MACE results).
annChrPicBel_19Apr2016.refFlat (turtle)
lizard_annotation.refFlat (lizard)

Functional annotation (EGGNOG)
chrysemys_eggnog_pruned.txt (turtle)
pogona_eggnog_pruned.txt (lizard)
mus_eggnog_pruned.txt (mouse)
homo_eggnog_pruned.txt (human)
These files were generated from the functional annotations of turtle, lizard, mouse and human genes produced by EggNOG Mapper. The original EggNOG Mapper annotations were pruned to remove ambiguous terms (e.g. one-to-many assignments). Matching functional annotations were used to identify one-to-one orthologs across species.

Transcription factors list
170712_TFs_list_ensembl.txt
List of human transcription factors, annotated in ENSEMBL under the GO terms GO:0003700 (transcription factor activity), GO:0003702 (RNA polymerase II transcription factor activity), GO:0003709 (RNA polymerase III transcription factor activity), GO:0016563 (transcriptional activator activity) and GO:0016564 (transcriptional repressor activity).
CITE-seq=scRNA-seq+Proteins: Challenge NeurIPS2021
kaggle.com
zip
Updated Jan 22, 2023
Cite
Alexander Chervov (2023). CITE-seq=scRNA-seq+Proteins: Challenge NeurIPS2021 [Dataset]. https://www.kaggle.com/datasets/alexandervc/citeseqscrnaseqproteins-challenge-neurips2021
Available download formats: zip (646191284 bytes)
Dataset updated
Jan 22, 2023
Authors
Alexander Chervov
Description
Context

Dataset from the NeurIPS 2021 challenge, which is similar to the Kaggle 2022 competition "Open Problems - Multimodal Single-Cell Integration: Predict how DNA, RNA & protein measurements co-vary in single cells": https://www.kaggle.com/competitions/open-problems-multimodal

CITE-seq combines single-cell RNA sequencing with single-cell measurements of CD** proteins (https://en.wikipedia.org/wiki/CITE-Seq). For the companion dataset on scRNA-seq + scATAC-seq, see: https://www.kaggle.com/datasets/alexandervc/scrnaseq-scatacseq-challenge-at-neurips-2021

The single-cell RNA sequencing data form a matrix in which rows correspond to cells and columns to genes (or vice versa). Each matrix value shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See the tutorials for Scanpy, the main Python package for working with scRNA-seq data (https://scanpy.readthedocs.io/en/stable/tutorials.html), or for Seurat, an R package (https://satijalab.org/seurat/).

Particular data

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122

Expression profiling by high throughput sequencing; Genome binding/occupancy profiling by high throughput sequencing

Summary: Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors. Half the samples were measured using the 10X Multiome Gene Expression and Chromatin Accessibility kit and half were measured using the 10X 3' Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support the Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites and others at a single site. In the competition, participants were tasked with challenges including modality prediction, matching profiles from different modalities, and learning a joint embedding from multiple modalities.

Overall design: Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors.

Contributor(s): Burkhardt DB, Lücken MD, Lance C, Cannoodt R, Pisco AO, Krishnaswamy S, Theis FJ, Bloom JM
Citation: https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/158f3069a435b314a80bdcb024f8e422-Abstract-round2.html

Related datasets:

Other single-cell RNA-seq datasets can be found on Kaggle: see https://www.kaggle.com/alexandervc/datasets or search Kaggle for "scRNA-seq".

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

A Google Scholar search for "challenges in single cell rna sequencing" (https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart) returns many interesting and highly cited articles, for example:

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833
PBMC 3k test datasets for besca
zenodo.org
bin
Updated Jan 18, 2021
Cite
Klas Hatje (2021). PBMC 3k test datasets for besca [Dataset]. http://doi.org/10.5281/zenodo.3752813
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.3752813
Dataset updated
Jan 18, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Klas Hatje
License
https://www.gnu.org/licenses/agpl.txt
Description
This is a single cell transcriptomics dataset containing roughly 3,000 PBMCs. The original data was downloaded from the Seurat 3k PBMC tutorial: https://satijalab.org/seurat/v3.0/pbmc3k_tutorial.html. We reprocessed the dataset using the Besca package (https://github.com/bedapub/besca).
Processed AnnData objects for GeneTrajectory inference (Gene Trajectory...
figshare.com
hdf
Updated Apr 4, 2024
Cite
Rihao Qu; Francesco Strino (2024). Processed AnnData objects for GeneTrajectory inference (Gene Trajectory Inference for Single-cell Data by Optimal Transport Metrics) [Dataset]. http://doi.org/10.6084/m9.figshare.25539547.v1
Available download formats: hdf
Unique identifier
https://doi.org/10.6084/m9.figshare.25539547.v1
Dataset updated
Apr 4, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
Rihao Qu; Francesco Strino
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These are processed AnnData objects (converted from Seurat objects) for GeneTrajectory tutorials (https://github.com/KlugerLab/GeneTrajectory-python/):

Human myeloid dataset analysis

Myeloid cells were extracted from a publicly available 10x scRNA-seq dataset (https://support.10xgenomics.com/single-cell-gene-expression/datasets/3.0.0/pbmc_10k_v3). QC was performed using the same workflow as in (https://github.com/satijalab/Integration2019/blob/master/preprocessing_scripts/pbmc_10k_v3.R). After standard normalization, highly-variable gene selection and scaling using the Seurat R package, we applied PCA and retained the top 30 principal components. Four sub-clusters of myeloid cells were identified based on Louvain clustering with a resolution of 0.3. Wilcoxon rank-sum test was employed to find cluster-specific gene markers for cell type annotation.

For gene trajectory inference, we first applied Diffusion Map on the cell PC embedding (using a local-adaptive kernel, each bandwidth is determined by the distance to its k-nearest neighbor, k = 10) to generate a spectral embedding of cells. We constructed a cell-cell kNN (k = 10) graph based on their coordinates of the top 5 non-trivial Diffusion Map eigenvectors. Among the top 2,000 variable genes, genes expressed by 0.5%–75% of cells were retained for pairwise gene-gene Wasserstein distance computation. The original cell graph was coarse-grained into a graph of size 1,000. We then built a gene-gene graph where the affinity between genes is transformed from the Wasserstein distance using a Gaussian kernel (local-adaptive, k = 5). Diffusion Map was employed to visualize the embedding of the gene graph. For trajectory identification, we used a series of time steps (11,21,8) to extract three gene trajectories.

Mouse embryo skin data analysis

We separated out dermal cell populations from the newly collected mouse embryo skin samples. Cells from the wildtype and the Wls mutant were pooled for analyses. After standard normalization, highly-variable gene selection and scaling using Seurat, we applied PCA and retained the top 30 principal components. Three dermal cell types were stratified based on the expression of canonical dermal markers, including Sox2, Dkk1, and Dkk2. For gene trajectory inference, we first applied Diffusion Map on the cell PC embedding (using a local-adaptive kernel bandwidth, k = 10) to generate a spectral embedding of cells. We constructed a cell-cell kNN (k = 10) graph based on their coordinates of the top 10 non-trivial Diffusion Map eigenvectors. Among the top 2,000 variable genes, genes expressed by 1%–50% of cells were retained for pairwise gene-gene Wasserstein distance computation. The original cell graph was coarse-grained into a graph of size 1,000. We then built a gene-gene graph where the affinity between genes is transformed from the Wasserstein distance using a Gaussian kernel (local-adaptive, k = 5). Diffusion Map was employed to visualize the embedding of the gene graph. For trajectory identification, we used a series of time steps (9,16,5) to sequentially extract three gene trajectories. To compare the differences between the wildtype and the Wls mutant, we stratified Wnt-active UD cells into seven stages according to their expression profiles of the genes binned along the DC gene trajectory.
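
The step that turns gene-gene distances into affinities with a local-adaptive Gaussian kernel can be sketched as follows. The description does not give the exact kernel formula, so the form used here, exp(-d_ij^2 / (sigma_i * sigma_j)) with sigma_i set to the distance from item i to its k-th nearest neighbour, is an assumption (one common way to build such a kernel) and not necessarily what GeneTrajectory implements. The distance matrix and function names are hypothetical.

```python
import math

def kth_neighbor_distance(dist_row, self_index, k):
    """Distance from one item to its k-th nearest neighbour, excluding itself."""
    others = sorted(d for j, d in enumerate(dist_row) if j != self_index)
    return others[k - 1]

def adaptive_gaussian_affinity(dist, k):
    """Affinity exp(-d_ij^2 / (sigma_i * sigma_j)) with per-item bandwidths."""
    n = len(dist)
    sigma = [kth_neighbor_distance(dist[i], i, k) for i in range(n)]
    return [[math.exp(-dist[i][j] ** 2 / (sigma[i] * sigma[j]))
             for j in range(n)] for i in range(n)]

# Hypothetical symmetric distance matrix for four genes.
dist = [
    [0.0, 1.0, 2.0, 4.0],
    [1.0, 0.0, 1.0, 3.0],
    [2.0, 1.0, 0.0, 2.0],
    [4.0, 3.0, 2.0, 0.0],
]

# The description uses k = 5 for genes; k = 2 suits this four-item toy matrix.
affinity = adaptive_gaussian_affinity(dist, k=2)
```

Because each bandwidth adapts to local density, items in sparse regions of the graph still receive non-negligible affinities to their nearest neighbours.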
Introduction to single cell RNAseq analysis: supplementary material
explore.openaire.eu
Updated Apr 14, 2023
Cite
Jose Alejandro Romero Herrera; Samuele Soraggi (2023). Introduction to single cell RNAseq analysis: supplementary material [Dataset]. http://doi.org/10.5281/zenodo.7920686
Unique identifier
https://doi.org/10.5281/zenodo.7920686
Dataset updated
Apr 14, 2023
Authors
Jose Alejandro Romero Herrera; Samuele Soraggi
Description
This archive contains supplementary material used in the workshop "Introduction to single cell RNAseq analysis" taught by the Danish National Sandbox for Health Data Science. The course repo can be found on GitHub.
Data.zip contains six 10x runs on spermatogonia development: 3 from healthy individuals and 3 from azoospermic individuals. The data has already been preprocessed using cellranger and can be loaded using Seurat (R) or scanpy (Python).
Slides.zip contains slides explaining the theory of single cell RNAseq data analysis.
Notebooks.zip contains R Markdown files for following the course using R in RStudio. This is an updated version of the notebooks.
Example single-cell RNA-Seq dataset for 15K peripheral blood T-cells
zenodo.org
bin
Updated Aug 23, 2024
Cite
Dylan Kotliar (2024). Example single-cell RNA-Seq dataset for 15K peripheral blood T-cells [Dataset]. http://doi.org/10.5281/zenodo.13368041
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.13368041
Dataset updated
Aug 23, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Dylan Kotliar
Description
This dataset is derived from https://zenodo.org/records/6120249 by first subsetting cells clustered as T-cells and then downsampling to 15,000 cells. We also converted it into Seurat format in the .rds file for a tutorial on how to run starCAT from data in R. We did not generate this dataset and don't claim any rights to it.
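The two steps described above, subsetting by cluster label and then downsampling to a fixed number of cells, can be sketched in plain Python. The labels, cell IDs, seed and function name are hypothetical; the actual procedure and random seed used to build the dataset are not stated in the description.

```python
import random

def subset_and_downsample(cell_labels, keep_label, n_cells, seed=0):
    """Return up to n_cells cell IDs carrying keep_label, sampled without replacement."""
    selected = sorted(cell for cell, label in cell_labels.items() if label == keep_label)
    if len(selected) <= n_cells:
        return selected  # nothing to downsample
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return sorted(rng.sample(selected, n_cells))

# Hypothetical cluster labels for ten cells.
cell_labels = {f"cell_{i}": ("T-cell" if i % 2 == 0 else "B-cell") for i in range(10)}

# The dataset keeps 15,000 T-cells; 3 is used here for the toy data.
t_cells = subset_and_downsample(cell_labels, "T-cell", n_cells=3)
print(t_cells)
```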
scRNA-seq B cells Nature Immunology 2018 GSE115795
kaggle.com
zip
Updated May 8, 2022
Cite
Alexander Chervov (2022). scRNA-seq B cells Nature Immunology 2018 GSE115795 [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-b-cells-nature-immunology-2018-gse115795
Available download formats: zip (13784841 bytes)
Dataset updated
May 8, 2022
Authors
Alexander Chervov
Description
Remark: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

Data and Context

The data are the results of single-cell RNA sequencing: rows correspond to cells and columns to genes (or vice versa). Each matrix value shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See the tutorials for Scanpy, the main Python package for working with scRNA-seq data (https://scanpy.readthedocs.io/en/stable/tutorials.html), or for Seurat, an R package (https://satijalab.org/seurat/).

Particular data: Pierre Milpied, Iñaki Cervera-Marzal, Marie-Laure Mollichella, Bruno Tesson, Gabriel Brisou, Alexandra Traverse-Glehen, Gilles Salles, Lionel Spinelli, Bertrand Nadel. Human germinal center transcriptional programs are de-synchronized in B cell lymphoma. Nat Immunol. 2018 Sep;19(9):1013-1024. doi: 10.1038/s41590-018-0181-4. Epub 2018 Aug 13. https://pubmed.ncbi.nlm.nih.gov/30104629/

Data in two variants:
1) Downloaded from GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM3190075 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM3190076
2) Downloaded from PanglaoDB: https://panglaodb.se/view_data.php?sra=SRA721637&srs=SRS3416994

Related datasets:

Other single-cell RNA-seq datasets can be found on Kaggle: see https://www.kaggle.com/alexandervc/datasets or search Kaggle for "scRNA-seq".

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

A Google Scholar search for "challenges in single cell rna sequencing" (https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart) returns many interesting and highly cited articles, for example:

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12 Journal home page for Molecular Cell
scRNA-seq data for Riba...Molina paper "DeepCycle"
kaggle.com
zip
Updated Feb 6, 2022
Cite
Alexander Chervov (2022). scRNA-seq data for Riba...Molina paper "DeepCycle" [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-data-for-ribamolina-paper-deepcycle/code
Available download formats: zip (1757947795 bytes)
Dataset updated
Feb 6, 2022
Authors
Alexander Chervov
Description
Remark: for cell cycle analysis, see the paper "Computational challenges of cell cycle analysis using single cell transcriptomics" by Alexander Chervov and Andrei Zinovyev, https://arxiv.org/abs/2208.05229

Data and Context

Data: results of single-cell RNA sequencing, stored as a matrix in which rows correspond to cells and columns to genes (in the csv file it is the other way round). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See the tutorials at https://scanpy.readthedocs.io/en/stable/tutorials.html (Scanpy is the main Python package for working with scRNA-seq data) or https://satijalab.org/seurat/ (Seurat, an R package).
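Because the csv file stores genes in rows and cells in columns, it usually needs to be transposed before analysis. The sketch below shows one way to do that using only the Python standard library. The CSV text, the header layout (gene name in the first column, cell identifiers in the header row), and the function name are assumptions for illustration; the actual file in this dataset may be laid out differently.

```python
import csv
import io

# Hypothetical CSV with genes in rows and cells in columns.
CSV_TEXT = """gene,cell_1,cell_2
GeneA,5,0
GeneB,0,3
GeneC,2,7
"""


def read_cells_by_genes(text):
    """Parse a genes x cells CSV and return (cells, genes, matrix) with cells in rows."""
    rows = list(csv.reader(io.StringIO(text)))
    cells = rows[0][1:]                     # header row holds the cell identifiers
    genes = [row[0] for row in rows[1:]]    # first column holds the gene names
    genes_by_cells = [[float(v) for v in row[1:]] for row in rows[1:]]
    # zip(*m) transposes the matrix: each output row is one cell across all genes.
    cells_by_genes = [list(col) for col in zip(*genes_by_cells)]
    return cells, genes, cells_by_genes


cells, genes, matrix = read_cells_by_genes(CSV_TEXT)
for cell, row in zip(cells, matrix):
    print(cell, row)
```

For files of realistic size, a dedicated reader (for example in Scanpy or pandas) is likely a better choice than this pure-Python version.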

Particular data:

Paper: Cell cycle gene regulation dynamics revealed by RNA velocity and deep-learning https://www.nature.com/articles/s41467-022-30545-8 https://www.biorxiv.org/content/10.1101/2021.03.17.435887v1.full Andrea Riba, Attila Oravecz, Matej Durik, Sara Jiménez, Violaine Alunni, Marie Cerciat, Matthieu Jung, Céline Keime, William M. Keyes, Nacho Molina doi: https://doi.org/10.1101/2021.03.17.435887

Datasets of processed single-cell RNA-seq for mESC and human fibroblasts, related to GEO dataset GSE167609 and the preprint https://doi.org/10.1101/2021.03.17.435887

Data: ZENODO: https://zenodo.org/record/4719436#.YlartShBy38

Related datasets:

Other single-cell RNA-seq datasets can be found on Kaggle: see https://www.kaggle.com/alexandervc/datasets or search Kaggle for "scRNA-seq".

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see, e.g., "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

See also the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

A Google Scholar search for "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart returns many interesting and highly cited articles:

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages 273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12
scRNA-seq developing murine cerebellum Nature2019
kaggle.com
zip
Updated May 19, 2022
Cite
Alexander Chervov (2022). scRNA-seq developing murine cerebellum Nature2019 [Dataset]. https://www.kaggle.com/alexandervc/scrnaseq-developing-murine-cerebellum-nature2019
Available download formats: zip (443360698 bytes)
Dataset updated
May 19, 2022
Authors
Alexander Chervov
Description
Remark: for cell cycle analysis, see the paper "Computational challenges of cell cycle analysis using single cell transcriptomics" by Alexander Chervov and Andrei Zinovyev, https://arxiv.org/abs/2208.05229

Data and Context

Data: results of single-cell RNA sequencing, stored as a matrix in which rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See the tutorials at https://scanpy.readthedocs.io/en/stable/tutorials.html (Scanpy is the main Python package for working with scRNA-seq data) or https://satijalab.org/seurat/ (Seurat, an R package).

Particular data: Paper: Vladoiu MC, El-Hamamy I, Donovan LK, Farooq H et al. Childhood cerebellar tumours mirror conserved fetal transcriptional programs. Nature 2019 Aug;572(7767):67-73. PMID: 31043743 https://pubmed.ncbi.nlm.nih.gov/31043743/

Data: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE118068

Related datasets:

Other single-cell RNA-seq datasets can be found on Kaggle: see https://www.kaggle.com/alexandervc/datasets or search Kaggle for "scRNA-seq".

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see, e.g., "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

See also the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

A Google Scholar search for "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart returns many interesting and highly cited articles:

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages 273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12
Single cell RNAseq data U2OS cell line
kaggle.com
zip
Updated Jun 13, 2022
Cite
Alexander Chervov (2022). Single cell RNAseq data U2OS cell line [Dataset]. https://www.kaggle.com/alexandervc/single-cell-rnaseq-data-related-to-cell-cycle
Available download formats: zip (396557698 bytes)
Dataset updated
Jun 13, 2022
Authors
Alexander Chervov
Description
Context

See the paper https://arxiv.org/abs/2208.05229 for the cell cycle analysis discussed here: "Computational challenges of cell cycle analysis using single cell transcriptomics", Alexander Chervov, Andrei Zinovyev. The dataset is of very high quality, with FUCCI cell cycle labels. It is one of the rare examples where TWO subpopulations with "normal" and "fast" proliferation can be clearly seen within seemingly homogeneous cells from a single cell line, U2OS. For scRNA-seq, this phenomenon was first discovered in the paper above, with scripts given here.

Data and Context

Data: results of single-cell RNA sequencing, stored as a matrix in which rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See the tutorials at https://scanpy.readthedocs.io/en/stable/tutorials.html (Scanpy is the main Python package for working with scRNA-seq data) or https://satijalab.org/seurat/ (Seurat, an R package).

Particular data: Paper: https://www.nature.com/articles/s41586-021-03232-9 Spatiotemporal dissection of the cell cycle with single-cell proteogenomics, Emma Lundberg et al.

Related datasets:

Other single-cell RNA-seq datasets can be found on Kaggle: see https://www.kaggle.com/alexandervc/datasets or search Kaggle for "scRNA-seq".

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see, e.g., "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

See also the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

A Google Scholar search for "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart returns many interesting and highly cited articles:

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages 273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12
scRNA-seq developing human immune system
kaggle.com
zip
Updated May 13, 2022
Cite
Alexander Chervov (2022). scRNA-seq developing human immune system [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-developing-human-immune-system/code
Available download formats: zip (3766618814 bytes)
Dataset updated
May 13, 2022
Authors
Alexander Chervov
Description
Remark: for cell cycle analysis, see the paper "Computational challenges of cell cycle analysis using single cell transcriptomics" by Alexander Chervov and Andrei Zinovyev, https://arxiv.org/abs/2208.05229

Data and Context

Data: results of single-cell RNA sequencing, stored as a matrix in which rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See the tutorials at https://scanpy.readthedocs.io/en/stable/tutorials.html (Scanpy is the main Python package for working with scRNA-seq data) or https://satijalab.org/seurat/ (Seurat, an R package).

Particular data: (Remark, here is part 1, for part 2 see: https://www.kaggle.com/datasets/alexandervc/scrnaseq-developing-human-immune-system-part-2 )

Paper: Suo C, Dann E, Goh I, et al. Mapping the developing human immune system across organs. Science (New York, N.Y.). 2022 May:eabo0510. DOI: 10.1126/science.abo0510. PMID: 35549310. https://europepmc.org/article/MED/35549310

Data: a part (NOT the biggest datasets) of https://developmental.cellatlas.io/fetal-immune, which is itself part of https://developmental.cellatlas.io/. The datasets are rather large: from 10 to 100 thousand cells (!)

Related datasets:

Other single-cell RNA-seq datasets can be found on Kaggle: see https://www.kaggle.com/alexandervc/datasets or search Kaggle for "scRNA-seq".

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see, e.g., "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

See also the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

A Google Scholar search for "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart returns many interesting and highly cited articles:

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages 273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12
scRNA-seq MCF10-2A p53 on/off, CENP-A overexpress
kaggle.com
zip
Updated Jul 25, 2022
Cite
Alexander Chervov (2022). scRNA-seq MCF10-2A p53 on/off, CENP-A overexpress [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-mcf102a-p53-onoff-cenpa-overexpress
Available download formats: zip (185726221 bytes)
Dataset updated
Jul 25, 2022
Authors
Alexander Chervov
Description
Remark: see the paper "Computational challenges of cell cycle analysis using single cell transcriptomics" by Alexander Chervov and Andrei Zinovyev, https://arxiv.org/abs/2208.05229, for the results on cell cycle analysis discussed there.

Data and Context

Data: results of single-cell RNA sequencing, stored as a matrix in which rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See the tutorials at https://scanpy.readthedocs.io/en/stable/tutorials.html (Scanpy is the main Python package for working with scRNA-seq data) or https://satijalab.org/seurat/ (Seurat, an R package).

Particular data: Paper: CENP-A overexpression promotes distinct fates in human cells, depending on p53 status Daniel Jeffery, Alberto Gatto, Katrina Podsypanina, Charlène Renaud-Pageot, Rebeca Ponce Landete, Lorraine Bonneville, Marie Dumont, Daniele Fachinetti & Geneviève Almouzni https://www.nature.com/articles/s42003-021-01941-5

Data: https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-9861/

Related datasets:

Other single-cell RNA-seq datasets can be found on Kaggle: see https://www.kaggle.com/alexandervc/datasets or search Kaggle for "scRNA-seq".

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see, e.g., "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

See also the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

A Google Scholar search for "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart returns many interesting and highly cited articles:

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages 273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12
scRNA-seq +CRISPR=Perturb-seq, Norman...Weissman
kaggle.com
zip
Updated Jul 20, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alexander Chervov (2022). scRNA-seq +CRISPR=Perturb-seq, Norman...Weissman [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-crisprperturbseq-normanweissman
Explore at:
zip(2682694572 bytes)Available download formats
Dataset updated
Jul 20, 2022
Authors
Alexander Chervov
Description
Remark:

This dataset is used in: "GEARS: Predicting transcriptional outcomes of novel multi-gene perturbations" Yusuf Roohani, Kexin Huang, Jure Leskovec https://www.biorxiv.org/content/10.1101/2022.07.12.499735v1 https://twitter.com/yusufroohani/status/1547965695744360448 Accepted in ICLR

Data and Context

Data: results of single-cell RNA sequencing, stored as a matrix in which rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See the tutorials at https://scanpy.readthedocs.io/en/stable/tutorials.html (Scanpy is the main Python package for working with scRNA-seq data) or https://satijalab.org/seurat/ (Seurat, an R package).

Particular data - Perturb-seq: Single-cell, pooled CRISPR screening experiment comparing the transcriptional effects of overexpressing genes alone or in combination
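To make the idea of comparing single and combined perturbations concrete, here is a minimal sketch that groups cells by perturbation label and compares the mean expression of one gene against control cells. The labels ("control", "GeneX", "GeneX+GeneY"), the values, and the function name are all hypothetical; this is not the analysis used in the paper, only an illustration of the kind of comparison such data allows.

```python
from collections import defaultdict

# Hypothetical per-cell records: (perturbation label, expression of one readout gene).
cells = [
    ("control", 1.0),
    ("control", 3.0),
    ("GeneX", 4.0),
    ("GeneX", 6.0),
    ("GeneX+GeneY", 9.0),
]


def mean_by_perturbation(cells):
    """Average the readout value over all cells sharing a perturbation label."""
    groups = defaultdict(list)
    for label, value in cells:
        groups[label].append(value)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}


means = mean_by_perturbation(cells)
# Shift of each perturbation's mean relative to the control mean.
shift = {
    label: mean - means["control"]
    for label, mean in means.items()
    if label != "control"
}
print(shift)
```

A real Perturb-seq analysis would work on thousands of genes per cell and account for normalization and cell-to-cell variability; the sketch only shows the grouping step.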

Paper: Norman TM, Horlbeck MA, Replogle JM, Ge AY et al. Exploring genetic interaction manifolds constructed from rich single-cell phenotypes. Science 2019 Aug 23;365(6455):786-793. PMID: 31395745 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6746554/

Data: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE133344

Related datasets:

Other single-cell RNA-seq datasets can be found on Kaggle: see https://www.kaggle.com/alexandervc/datasets or search Kaggle for "scRNA-seq".

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see, e.g., "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

See also the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

A Google Scholar search for "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart returns many interesting and highly cited articles:

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages 273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12

