14 datasets found
  1. Test Data for Galaxy tutorial "Batch Correction and Integration" - Seurat...

    • ordo.open.ac.uk
    • zenodo.org
    bin
    Updated Apr 28, 2025
    Cite
    Marisa Loach (2025). Test Data for Galaxy tutorial "Batch Correction and Integration" - Seurat version [Dataset]. http://doi.org/10.5281/zenodo.14713816
    Dataset provided by
    The Open University
    Authors
    Marisa Loach
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data is used for the Seurat version of the batch correction and integration tutorial on the Galaxy Training Network. The input data was provided by Seurat in the 'Integrative Analysis in Seurat v5' tutorial. The input dataset provided here has been filtered to include only cells for which nFeature_RNA > 1000. The other datasets were produced on Galaxy. The original dataset was published as: Ding, J., Adiconis, X., Simmons, S.K. et al. Systematic comparison of single-cell and single-nucleus RNA-sequencing methods. Nat Biotechnol 38, 737–746 (2020). https://doi.org/10.1038/s41587-020-0465-8.
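The filtering step mentioned above can be illustrated with a minimal sketch. In Seurat, nFeature_RNA is the number of genes detected in a cell. The toy matrix, the function names, and the small threshold below are assumptions made for illustration; this is not the code used to prepare the dataset.

```python
# Toy count matrix: each cell maps to its counts across five genes (hypothetical values).
counts = {
    "cell_a": [0, 3, 1, 0, 2],
    "cell_b": [1, 1, 1, 1, 1],
    "cell_c": [0, 0, 0, 7, 0],
}


def n_features(row):
    """Number of genes with at least one count in this cell."""
    return sum(1 for value in row if value > 0)


def filter_cells(matrix, min_features):
    """Keep cells whose detected-gene count is strictly greater than min_features."""
    return {cell: row for cell, row in matrix.items() if n_features(row) > min_features}


# The dataset description uses a threshold of 1000; 2 is used here only
# because the toy matrix has five genes.
kept = filter_cells(counts, min_features=2)
print(sorted(kept))
```

With real data the same comparison would be applied to the per-cell metadata column rather than recomputed from a dictionary.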

  2. Test Data for Galaxy Tutorial "Clustering 3k PBMCs with Seurat"

    • ordo.open.ac.uk
    bin
    Updated Nov 14, 2024
    Cite
    Marisa Loach (2024). Test Data for Galaxy Tutorial "Clustering 3k PBMCs with Seurat" [Dataset]. http://doi.org/10.5281/zenodo.14013475
    Dataset provided by
    The Open University
    Authors
    Marisa Loach
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Test Data for Galaxy Tutorial "Clustering 3k PBMCs with Seurat"

  3. scverse tutorial data: Getting started with AnnData

    • figshare.com
    hdf
    Updated Apr 7, 2023
    Cite
    Jan Lause (2023). scverse tutorial data: Getting started with AnnData [Dataset]. http://doi.org/10.6084/m9.figshare.22577536.v2
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Jan Lause
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data is derived from the 3k PBMC data used in the scanpy and Seurat tutorials. It comes in the AnnData h5ad format.

    Processed 3k PBMCs from a healthy donor from 10x Genomics, available at https://scanpy.readthedocs.io/en/stable/generated/scanpy.datasets.pbmc3k_processed.html. The original 10x data is available at http://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz from this website: https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/pbmc3k

    The changes made to the original scanpy.datasets.pbmc3k_processed() data are described in this github issue: https://github.com/scverse/scverse-tutorials/issues/51

    See jupyter notebook for details.

  4. Example data for Immcantation training legacy tutorials

    • zenodo.org
    zip
    Updated May 12, 2024
    Cite
    S Marquez (2024). Example data for Immcantation training legacy tutorials [Dataset]. http://doi.org/10.5281/zenodo.11181600
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    S Marquez
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    `bcr_phylo_tutorial.zip` is used in the Reconstruction and analysis of B-cell lineage trees from single cell data using Immcantation tutorial.

    `immcantation-BCR-Seurat-tutorial.zip` is used in the Integration of BCR and GEX data tutorial.

  5. Immcantation 10x Tutorial Data

    • data.niaid.nih.gov
    • zenodo.org
    Updated Oct 20, 2023
    Cite
    Meng, Hailong (2023). Immcantation 10x Tutorial Data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8179845
    Dataset provided by
    Meng, Hailong
    Gabernet, Gisela
    Jensen, Cole
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Necessary datasets to run the Immcantation 10x Tutorial. Below is the description of the files in the data set.

    BCR_data_sample1.tsv: data corresponding to the first sample (sample 1) of the two samples analyzed in the 10x tutorial. This is the sample used to show the Change-O steps.

    filtered_contig_annotations.csv: filtered contig annotations file for sample 1, output of cellranger vdj.

    filtered_contig.fasta: sequence fasta file for sample 1, output of cellranger vdj.

    BCR_data.tsv: AIRR rearrangement file containing the data for both samples 1 and 2 used in the 10x tutorial.

    BCR.data_08112023.rds: R dataframe object containing the single-cell BCR sequencing data for both samples 1 and 2 used in the 10x tutorial.

    GEX.data_08112023.rds: Seurat object containing the single-cell gene expression data used in the 10x tutorial.
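As a rough illustration of working with an AIRR rearrangement file such as BCR_data.tsv, the sketch below counts productive rearrangements per V gene. The column names (sequence_id, v_call, j_call, productive, junction_aa) follow the AIRR Rearrangement schema, but the three rows are invented for illustration and are not taken from the dataset. The function name is hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical three-row excerpt in AIRR rearrangement TSV layout;
# the values are invented and do not come from BCR_data.tsv.
toy_tsv = (
    "sequence_id\tv_call\tj_call\tproductive\tjunction_aa\n"
    "seq1\tIGHV1-69*01\tIGHJ4*02\tT\tCARDYW\n"
    "seq2\tIGHV3-23*01\tIGHJ6*02\tT\tCAKGGW\n"
    "seq3\tIGHV1-69*01\tIGHJ4*02\tF\t\n"
)


def productive_v_gene_counts(handle):
    """Count productive rearrangements per V gene (allele suffix removed)."""
    counts = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["productive"] == "T":  # AIRR TSV encodes booleans as T/F
            gene = row["v_call"].split("*")[0]
            counts[gene] += 1
    return counts


v_counts = productive_v_gene_counts(io.StringIO(toy_tsv))
print(dict(v_counts))
```

Real files can carry several comma-separated calls in v_call, which this sketch does not handle. The tutorial itself works with the Immcantation R packages rather than a hand-written parser.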

  6. scMetabolism - pbmc_demo.rda

    • figshare.com
    bin
    Updated Jan 31, 2021
    Cite
    Yingcheng Wu (2021). scMetabolism - pbmc_demo.rda [Dataset]. http://doi.org/10.6084/m9.figshare.13670038.v1
    Dataset provided by
    figshare
    Authors
    Yingcheng Wu
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The demo datasets for scMetabolism.

    The demo data is the Peripheral Blood Mononuclear Cell (PBMC) dataset from the 10X Genomics open-access datasets (~2,700 single cells, also used in the Seurat tutorial).

  7. PBMC data for SCelVis

    • figshare.com
    hdf
    Updated Jun 3, 2023
    Cite
    Benedikt Obermayer (2023). PBMC data for SCelVis [Dataset]. http://doi.org/10.6084/m9.figshare.10002125.v1
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Benedikt Obermayer
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    IFN-beta treated and control PBMCs from 8 donors

    Two groups of PBMCs from Kang et al. 2017 (https://www.nature.com/articles/nbt.4042), analyzed using the Seurat sample alignment strategy as explained in the tutorial at https://satijalab.org/seurat/v2.4/immune_alignment.html and then converted to h5ad format.

    * ~14000 cells
    * IFN-treated and unstimulated PBMCs from 8 donors
    * donor identities determined using demuxlet (see GSE96583)

  8. PBMC 3k test datasets for besca

    • zenodo.org
    bin
    Updated Jan 18, 2021
    Cite
    Klas Hatje; Alice Julien-Laferrière (2021). PBMC 3k test datasets for besca [Dataset]. http://doi.org/10.5281/zenodo.4441679
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Klas Hatje; Alice Julien-Laferrière
    License

    https://www.gnu.org/licenses/agpl.txt

    Description

    This is a single cell transcriptomics dataset containing roughly 3,000 PBMCs. The original data was downloaded from the Seurat 3k PBMC tutorial: https://satijalab.org/seurat/v3.0/pbmc3k_tutorial.html. We reprocessed the dataset using the Besca package (https://github.com/bedapub/besca).

  9. scRNA-seq MCF10-2A p53 on/off, CENP-A overexpress

    • kaggle.com
    Updated Jul 25, 2022
    Cite
    Alexander Chervov (2022). scRNA-seq MCF10-2A p53 on/off, CENP-A overexpress [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-mcf102a-p53-onoff-cenpa-overexpress/code
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    Alexander Chervov
    Description

    Remark: for results on cell cycle analysis of this dataset, see the paper "Computational challenges of cell cycle analysis using single cell transcriptomics" by Alexander Chervov and Andrei Zinovyev: https://arxiv.org/abs/2208.05229

    Data and Context

    The data are the results of single-cell RNA sequencing: rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. See https://en.wikipedia.org/wiki/Single-cell_transcriptomics
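The "or vice versa" matters in practice: AnnData stores matrices as cells by genes, while Seurat stores them as genes by cells. The sketch below uses a toy matrix with hypothetical names and values to show that the two orientations hold the same entries with the indices swapped.

```python
# Toy expression matrix in cells-by-genes orientation (hypothetical values).
cells = ["cell_1", "cell_2"]
genes = ["GeneA", "GeneB", "GeneC"]
cells_by_genes = [
    [5, 0, 2],  # cell_1
    [0, 1, 4],  # cell_2
]


def transpose(matrix):
    """Swap rows and columns of a rectangular list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


genes_by_cells = transpose(cells_by_genes)

# The same entry is addressed with the indices swapped.
i, j = cells.index("cell_1"), genes.index("GeneC")
assert cells_by_genes[i][j] == genes_by_cells[j][i]
print(len(genes_by_cells), "genes x", len(genes_by_cells[0]), "cells")
```

Checking the matrix shape against the known number of cells and genes is a quick way to tell which orientation a downloaded file uses.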

    See the tutorials at https://scanpy.readthedocs.io/en/stable/tutorials.html (Scanpy, the main Python package for working with scRNA-seq data) or https://satijalab.org/seurat/ (Seurat, an R package).

    Particular data: paper: "CENP-A overexpression promotes distinct fates in human cells, depending on p53 status" by Daniel Jeffery, Alberto Gatto, Katrina Podsypanina, Charlène Renaud-Pageot, Rebeca Ponce Landete, Lorraine Bonneville, Marie Dumont, Daniele Fachinetti & Geneviève Almouzni, https://www.nature.com/articles/s42003-021-01941-5

    Data: https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-9861/

    Related datasets:

    Other single-cell RNA-seq datasets can be found on Kaggle at https://www.kaggle.com/alexandervc/datasets, or by searching Kaggle for "scRNA-seq".

    Inspiration

    Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

    Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

    A Google Scholar search for "challenges in single cell rna sequencing" (https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart) gives many interesting and highly cited articles:

    (Cited 968) Computational and analytical challenges in single-cell transcriptomics. Oliver Stegle, Sarah A. Teichmann, John C. Marioni. Nat. Rev. Genet., 16 (3) (2015), pp. 133-145. https://www.nature.com/articles/nrg3833

    Challenges in unsupervised clustering of single-cell RNA-seq data. Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg. Nature Reviews Genetics volume 20, pages 273–282 (2019). Review article, 07 January 2019. https://www.nature.com/articles/s41576-018-0088-9

    Challenges and emerging directions in single-cell analysis. Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh. Genome Biology volume 18, Article number: 84 (2017). Published 08 May 2017. https://link.springer.com/article/10.1186/s13059-017-1218-y

    Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges. Molecular Cell, Volume 75, Issue 1, 11 July 2019, Pages 7-12. https://www.sciencedirect.com/science/article/pii/S1097276519303569

  10. scRNA-seq B,T cells and lymphomas NatCellBiol2020

    • kaggle.com
    Updated May 10, 2022
    Cite
    Alexander Chervov (2022). scRNA-seq B,T cells and lymphomas NatCellBiol2020 [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-bt-cells-and-lymphomas-natcellbiol2020
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    Alexander Chervov
    Description

    Remark: for cell cycle analysis, see the paper "Computational challenges of cell cycle analysis using single cell transcriptomics" by Alexander Chervov and Andrei Zinovyev: https://arxiv.org/abs/2208.05229

    Data and Context

    The data are the results of single-cell RNA sequencing: rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. See https://en.wikipedia.org/wiki/Single-cell_transcriptomics

    See the tutorials at https://scanpy.readthedocs.io/en/stable/tutorials.html (Scanpy, the main Python package for working with scRNA-seq data) or https://satijalab.org/seurat/ (Seurat, an R package).

    Particular data: Paper: Roider, T., Seufert, J., Uvarovskii, A. et al. Dissecting intratumour heterogeneity of nodal B-cell lymphomas at the transcriptional, genetic and drug-response levels. Nat Cell Biol 22, 896–906 (2020). doi: https://doi.org/10.1038/s41556-020-0532-x Data: https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/VRJUNV

    Subdirectory names such as "FL1", "DLBCL"... correspond to the following samples:

    SampleName | Tissue | Sex | Entity | Diagnosis | Subclassification | Batch | CellrangerVersion
    FL3 | Lymph node | M | B cell Lymphoma | Follicular Lymphoma | NA | 1 | 2.1.1
    FL1 | Lymph node | M | B cell Lymphoma | Follicular Lymphoma | NA | 1 | 2.1.1
    FL2 | Lymph node | M | B cell Lymphoma | Follicular Lymphoma | NA | 1 | 2.1.1
    DLBCL1 | Lymph node | F | B cell Lymphoma | Diffuse large B cell lymphoma | Germinal Center subtype | 1 | 2.1.1
    tFL1 | Lymph node | F | B cell Lymphoma | Transformed Follicular Lymphoma | Germinal Center subtype | 1 | 2.1.1
    DLBCL2 | Lymph node | M | B cell Lymphoma | Diffuse large B cell lymphoma | Germinal Center subtype | 1 | 2.1.1
    rLN1 | Lymph node | M | NA | Reactive Lymphadenitis | NA | 1 | 2.1.1
    DLBCL3 | Lymph node | F | B cell Lymphoma | Diffuse large B cell lymphoma | non-Germinal Center subtype | 2 | 3.0.2
    FL4 | Lymph node | M | B cell Lymphoma | Follicular Lymphoma | NA | 2 | 3.0.2
    rLN3 | Lymph node | F | NA | Reactive Lymphadenitis | NA | 2 | 3.0.2
    rLN2 | Lymph node | M | NA | Reactive Lymphadenitis | NA | 2 | 3.0.2
    tFL2 | Lymph node | F | B cell Lymphoma | Transformed Follicular Lymphoma | Germinal Center subtype | 2 | 3.0.2


  11. Processed AnnData objects for GeneTrajectory inference (Gene Trajectory...

    • figshare.com
    hdf
    Updated Apr 4, 2024
    Cite
    Rihao Qu; Francesco Strino (2024). Processed AnnData objects for GeneTrajectory inference (Gene Trajectory Inference for Single-cell Data by Optimal Transport Metrics) [Dataset]. http://doi.org/10.6084/m9.figshare.25539547.v1
    Dataset provided by
    figshare
    Authors
    Rihao Qu; Francesco Strino
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These are processed AnnData objects (converted from Seurat objects) for the GeneTrajectory tutorials (https://github.com/KlugerLab/GeneTrajectory-python/).

    Human myeloid dataset analysis

    Myeloid cells were extracted from a publicly available 10x scRNA-seq dataset (https://support.10xgenomics.com/single-cell-gene-expression/datasets/3.0.0/pbmc_10k_v3). QC was performed using the same workflow as in https://github.com/satijalab/Integration2019/blob/master/preprocessing_scripts/pbmc_10k_v3.R. After standard normalization, highly-variable gene selection and scaling using the Seurat R package, we applied PCA and retained the top 30 principal components. Four sub-clusters of myeloid cells were identified based on Louvain clustering with a resolution of 0.3. The Wilcoxon rank-sum test was employed to find cluster-specific gene markers for cell type annotation.

    For gene trajectory inference, we first applied Diffusion Map on the cell PC embedding (using a local-adaptive kernel in which each bandwidth is determined by the distance to the k-th nearest neighbor, k = 10) to generate a spectral embedding of cells. We constructed a cell-cell kNN (k = 10) graph based on the cells' coordinates in the top 5 non-trivial Diffusion Map eigenvectors. Among the top 2,000 variable genes, genes expressed by 0.5%–75% of cells were retained for pairwise gene-gene Wasserstein distance computation. The original cell graph was coarse-grained into a graph of size 1,000. We then built a gene-gene graph where the affinity between genes is transformed from the Wasserstein distance using a Gaussian kernel (local-adaptive, k = 5). Diffusion Map was employed to visualize the embedding of the gene graph. For trajectory identification, we used a series of time steps (11, 21, 8) to extract three gene trajectories.

    Mouse embryo skin data analysis

    We separated out dermal cell populations from the newly collected mouse embryo skin samples. Cells from the wild type and the Wls mutant were pooled for analyses. After standard normalization, highly-variable gene selection and scaling using Seurat, we applied PCA and retained the top 30 principal components. Three dermal cell types were stratified based on the expression of canonical dermal markers, including Sox2, Dkk1, and Dkk2. For gene trajectory inference, we first applied Diffusion Map on the cell PC embedding (using a local-adaptive kernel bandwidth, k = 10) to generate a spectral embedding of cells. We constructed a cell-cell kNN (k = 10) graph based on the cells' coordinates in the top 10 non-trivial Diffusion Map eigenvectors. Among the top 2,000 variable genes, genes expressed by 1%–50% of cells were retained for pairwise gene-gene Wasserstein distance computation. The original cell graph was coarse-grained into a graph of size 1,000. We then built a gene-gene graph where the affinity between genes is transformed from the Wasserstein distance using a Gaussian kernel (local-adaptive, k = 5). Diffusion Map was employed to visualize the embedding of the gene graph. For trajectory identification, we used a series of time steps (9, 16, 5) to sequentially extract three gene trajectories. To compare the differences between the wild type and the Wls mutant, we stratified Wnt-active UD cells into seven stages according to their expression profiles of the genes binned along the DC gene trajectory.
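The gene pre-filtering step described above (keeping genes expressed by a bounded fraction of cells) can be sketched as follows. This is a minimal illustration, not the GeneTrajectory implementation: the function names and toy counts are hypothetical, and treating both bounds as inclusive is an assumption.

```python
def expressing_fraction(column):
    """Fraction of cells in which the gene has a non-zero count."""
    return sum(1 for value in column if value > 0) / len(column)


def filter_genes(gene_counts, min_frac, max_frac):
    """Keep genes expressed in a fraction of cells within [min_frac, max_frac]."""
    return [
        gene for gene, column in gene_counts.items()
        if min_frac <= expressing_fraction(column) <= max_frac
    ]


# Toy data: each gene maps to its counts across four cells (hypothetical values).
toy = {
    "GeneA": [1, 2, 3, 4],  # expressed in 100% of cells
    "GeneB": [0, 0, 5, 0],  # expressed in 25% of cells
    "GeneC": [0, 0, 0, 0],  # expressed in no cells
}

# The description uses 0.5%-75% (human myeloid) and 1%-50% (mouse skin).
kept_genes = filter_genes(toy, min_frac=0.005, max_frac=0.75)
print(kept_genes)
```

Dropping genes that are nearly ubiquitous or nearly absent reduces the number of pairwise Wasserstein distances that have to be computed afterwards.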

  12. scRNA-seq Trajectory inference.

    • kaggle.com
    Updated Aug 9, 2022
    Cite
    Alexander Chervov (2022). scRNA-seq Trajectory inference. [Dataset]. https://www.kaggle.com/datasets/alexandervc/trajectory-inference-single-cell-rna-seq/discussion
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    Alexander Chervov
    Description

    Remark: for a discussion of trajectory inference on this dataset, see the paper "Minimum Spanning vs. Principal Trees for Structured Approximations of Multi-Dimensional Datasets" by Alexander Chervov, Jonathan Bac and Andrei Zinovyev: https://www.mdpi.com/1099-4300/22/11/1274

    For cell cycle analysis, see "Computational challenges of cell cycle analysis using single cell transcriptomics" by Alexander Chervov and Andrei Zinovyev: https://arxiv.org/abs/2208.05229

    Data and Context

    The data are the results of single-cell RNA sequencing: rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. See https://en.wikipedia.org/wiki/Single-cell_transcriptomics

    See the tutorials at https://scanpy.readthedocs.io/en/stable/tutorials.html (Scanpy, the main Python package for working with scRNA-seq data) or https://satijalab.org/seurat/ (Seurat, an R package).

    Particular data: gene expression count matrix from single-cell RNA sequencing, 447 cells and 24748 genes, mouse liver hepatoblasts in vivo.

    Paper: Li Yang, Wei-Hua Wang, Wei-Lin Qiu, Zhen Guo, Erfei Bi, Cheng-Ran Xu. A single-cell transcriptomic analysis reveals precise pathways and regulatory mechanisms underlying hepatoblast differentiation. Hepatology. 2017 Nov;66(5):1387-1401. doi: 10.1002/hep.29353. Epub 2017 Sep 29.

    Data: GSE90047 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE90047, downloaded from https://cytotrace.stanford.edu/#shiny-tab-dataset_download


  13. Introduction to single cell RNAseq analysis: supplementary material

    • zenodo.org
    zip
    Updated May 10, 2023
    Cite
    Jose Alejandro Romero Herrera; Samuele Soraggi (2023). Introduction to single cell RNAseq analysis: supplementary material [Dataset]. http://doi.org/10.5281/zenodo.7827899
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Jose Alejandro Romero Herrera; Samuele Soraggi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This archive contains 10x matrix data (processed with cellranger) used for the Introduction to single cell RNAseq analysis workshop taught by the Danish National Sandbox for Health Data Science. The course repo can be found on GitHub.

    Data.zip contains 6 runs on spermatogonia development: 3 from healthy individuals and 3 from azoospermic individuals. The data has already been preprocessed using cellranger and can be loaded using Seurat (R) or scanpy (Python).
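For readers unfamiliar with the 10x matrix layout, the sketch below parses a miniature Matrix Market file of the kind Cell Ranger writes as matrix.mtx, where rows are genes and columns are cells. The toy file content and the function name are hypothetical. Real analyses would normally use Seurat's Read10X or scanpy's read_10x_mtx rather than a hand-written parser.

```python
import io

# Hypothetical miniature of a matrix.mtx file: 3 genes x 2 cells, 3 non-zero entries.
toy_mtx = """%%MatrixMarket matrix coordinate integer general
%
3 2 3
1 1 4
3 1 1
2 2 7
"""


def read_mtx(handle):
    """Parse a coordinate-format file into (n_rows, n_cols, entries) with 0-based indices."""
    entries = {}
    shape = None
    for line in handle:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip header and comment lines
        fields = line.split()
        if shape is None:
            shape = (int(fields[0]), int(fields[1]))  # size line: rows, cols, non-zeros
            continue
        row, col, value = int(fields[0]) - 1, int(fields[1]) - 1, int(fields[2])
        entries[(row, col)] = value
    return shape[0], shape[1], entries


n_genes, n_cells, entries = read_mtx(io.StringIO(toy_mtx))

# Total counts per cell (columns are cells in this layout).
totals = [0] * n_cells
for (gene, cell), value in entries.items():
    totals[cell] += value
print(n_genes, n_cells, totals)
```

The sparse coordinate layout stores only non-zero counts, which is why per-cell totals are accumulated from the entries instead of read from a dense table.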

  14. scRNAseq_Dataset Merge AMI d5 (CD45+Fibroblast) + AAA Kinetik +...

    • zenodo.org
    Updated Mar 28, 2023
    Cite
    Alexander Lang (2023). scRNAseq_Dataset Merge AMI d5 (CD45+Fibroblast) + AAA Kinetik + Cite-Seq_Dataset AG Gerdes [Dataset]. http://doi.org/10.5281/zenodo.7774809
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Alexander Lang
    Description

    Integration script:

    library(Seurat)
    library(tidyverse)
    library(Matrix)

    # Load the Seurat objects to integrate (paths are specific to the author's machine)
    #cite <- readRDS("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Merge AAA mit Cite AAA/Cite_seq_v0.41.rds")
    #CD45 <- readRDS("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Schrader/Fertige_Analysen/TS_d5_paper/CD45.rds")
    AAA <- readRDS("C:/Users/alex/sciebo/AAA_Zhao_v4.rds")
    cite <- readRDS("C:/Users/alex/sciebo/CITE_Seq_v0.5.rds")
    all4 <- readRDS("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Schrader/Fertige_Analysen/Schrader_All4_Rohanalyse/all4_220228.rds")

    # Collect the objects in a named list (avoids shadowing base::c)
    pancreas.list <- list(cite = cite, all4 = all4, AAA = AAA)

    # Normalize each object with SCTransform
    for (i in seq_along(pancreas.list)) {
      pancreas.list[[i]] <- SCTransform(pancreas.list[[i]], verbose = FALSE)
    }

    # Select integration features and prepare the SCT assays, following the tutorial at
    # https://satijalab.org/seurat/articles/integration_introduction.html
    #options(future.globals.maxSize= 6091289600)
    # future.globals.maxSize was too low; changed it to options(future.globals.maxSize= 1091289600)
    #memory.limit(9999999999)
    pancreas.features <- SelectIntegrationFeatures(object.list = pancreas.list, nfeatures = 3000)
    pancreas.list <- PrepSCTIntegration(object.list = pancreas.list, anchor.features = pancreas.features)

    # Identify anchors and integrate the datasets
    pancreas.anchors <- FindIntegrationAnchors(object.list = pancreas.list, normalization.method = "SCT", anchor.features = pancreas.features, verbose = FALSE)
    pancreas.integrated <- IntegrateData(anchorset = pancreas.anchors, normalization.method = "SCT", verbose = FALSE)

    setwd("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Schrader/Fertige_Analysen/TS_d5_paper")

    saveRDS(pancreas.integrated, file = "integrated_AAA_Cite_AMI.rds")

    # cd45 is not defined anywhere in this script, so this call would fail as written
    #saveRDS(cd45, file = "integrated_AAA_Cite_CD45.rds")

    seurat <- pancreas.integrated

    #seurat <- readRDS("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Schrader/Fertige_Analysen/TS_d5_paper/integrated_d5_cite.rds")

    # Dimensionality reduction and clustering on the integrated assay
    DefaultAssay(object = seurat) <- "integrated"
    seurat <- FindVariableFeatures(seurat, selection.method = "vst", nfeatures = 3000)
    seurat <- ScaleData(seurat, verbose = FALSE)
    seurat <- RunPCA(seurat, npcs = 30, verbose = FALSE)
    seurat <- FindNeighbors(seurat, dims = 1:30)
    seurat <- FindClusters(seurat, resolution = 0.5)
    seurat <- RunUMAP(seurat, reduction = "pca", dims = 1:30)
    DimPlot(seurat, reduction = "umap", split.by = "treatment") + NoLegend()

    DimPlot(seurat, label = TRUE, repel = TRUE) + NoLegend()

    # Marker detection on the ADT assay
    DefaultAssay(object = seurat) <- "ADT"
    adt_marker_integrated <- FindAllMarkers(seurat, logfc.threshold = 0.3)
    write.csv(adt_marker_integrated, file = "adt_marker_all4_integrated.csv")

    # Marker detection on the RNA assay
    DefaultAssay(object = seurat) <- "RNA"
    RNA_marker_integrated <- FindAllMarkers(seurat, logfc.threshold = 0.5)
    write.csv(RNA_marker_integrated, file = "RNA_marker_all4_integrated.csv")

    DimPlot(seurat, label = TRUE, repel = TRUE, split.by = "tissue") + NoLegend()

    # Compare the Cd40 transcript with the Ms.CD40 feature
    FeaturePlot(seurat, features = "Cd40", order = TRUE, label = TRUE)
    FeaturePlot(seurat, features = "Ms.CD40", order = TRUE, label = TRUE)

    #####
    # cleanup: drop metadata columns that are no longer needed
    drop.cols <- c(
      paste0("sen_score", 1:19),
      paste0("pANN_0.25_0.1_", c(1211, 184, 953, 466)),
      paste0("DF.classifications_0.25_0.1_", c(1211, 466, 184, 953)),
      "predicted.celltype", "integrated_snn_res.3", "RNA_snn_res.3",
      "SingleR", "SingleR_fine", "ImmGen", "ImmGen_fine", "percent.mt",
      "nCount_integrated", "nFeature_integrated", "S.Score", "G2M.Score", "Phase"
    )
    for (col in drop.cols) {
      seurat@meta.data[[col]] <- NULL
    }
    # this one is an assay, not a metadata column
    seurat@assays[["prediction.score.celltype"]] <- NULL
