100+ datasets found
  1. Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset

    • zenodo.org
    • data.niaid.nih.gov
    bin, txt
    Updated Nov 20, 2023
    Cite
    Jonathan Hsu; Allart Stoop (2023). Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset [Dataset]. http://doi.org/10.5281/zenodo.10011622
    Explore at:
    Available download formats: bin, txt
    Dataset updated
    Nov 20, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jonathan Hsu; Allart Stoop
    Description

    Table of Contents

    1. Main Description
    2. File Descriptions
    3. Linked Files
    4. Installation and Instructions

    1. Main Description

    ---------------------------

    This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.". The code included in the file titled `marengo_code_for_paper_jan_2023.R` was used to generate the figures from the single-cell RNA sequencing data.

    The following libraries are required for script execution:

    • Seurat
    • scRepertoire
    • ggplot2
    • stringr
    • dplyr
    • ggridges
    • ggrepel
    • ComplexHeatmap

    File Descriptions

    ---------------------------

    • The code can be downloaded and opened in RStudio.
    • The "marengo_code_for_paper_jan_2023.R" file contains all the code needed to reproduce the figures in the paper.
    • The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113).
    • The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from the DGE analysis, which were also used to create the DGE heatmap and the volcano plots.
    • The "genes_for_heatmap_fig5F.xlsx" contains the genes included in the heatmap in figure 5F.

    Linked Files

    ---------------------

    This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset contains raw FASTQ files as well as the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Provided below are descriptions of the linked datasets:

    Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)

    • Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
    • Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
    • Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).
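    The "matrix.mtx" file named above is a sparse count matrix in Matrix Market coordinate format. The following is a minimal sketch, in plain Python, of how such a file can be parsed and summed per cell. It assumes the usual 10x layout (rows are genes, columns are cell barcodes) and uses a tiny in-memory stand-in rather than the actual GEO files; the function names are hypothetical and this is not the authors' code, which reads the data in R.

```python
import io


def read_matrix_market(handle):
    """Parse a Matrix Market coordinate file into (n_rows, n_cols, entries)."""
    header = handle.readline()
    if not header.startswith("%%MatrixMarket matrix coordinate"):
        raise ValueError("not a Matrix Market coordinate file")
    # Skip comment lines, which start with '%'.
    line = handle.readline()
    while line.startswith("%"):
        line = handle.readline()
    n_rows, n_cols, n_entries = (int(x) for x in line.split())
    entries = {}
    for _ in range(n_entries):
        row, col, value = handle.readline().split()
        # Matrix Market indices are 1-based; store them 0-based.
        entries[(int(row) - 1, int(col) - 1)] = float(value)
    return n_rows, n_cols, entries


def counts_per_cell(n_cols, entries):
    """Sum counts over genes for every column (assumed to be one cell each)."""
    totals = [0.0] * n_cols
    for (_, col), value in entries.items():
        totals[col] += value
    return totals


# Hypothetical stand-in for a real matrix.mtx: 3 genes x 2 cells, 4 non-zero counts.
EXAMPLE_MTX = """%%MatrixMarket matrix coordinate integer general
%
3 2 4
1 1 5
2 1 1
1 2 2
3 2 7
"""

n_genes, n_cells, entries = read_matrix_market(io.StringIO(EXAMPLE_MTX))
totals = counts_per_cell(n_cells, entries)
print(n_genes, n_cells, totals)
```

    In practice the row and column labels would come from "genes.tsv" and "barcodes.tsv", read line by line in the same order as the matrix indices.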

    Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719

    • Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
    • Description: This submission contains the **raw sequencing** or `.fastq.gz` files, which are gzip-compressed text files in FASTQ format.
    • Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

    Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)

    • Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.
    • Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This file is required to run the code.
    • Submission type: Restricted Access. In order to gain access to the repository, you must contact the author.

    Installation and Instructions

    --------------------------------------

    The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

    > Ensure you have R version 4.1.2 or higher for compatibility.

    > Although it is not essential, you can use RStudio (Version 2022.12.0+353) for accessing and executing the code.

    1. Download the "Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).

    2. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.

    3. Set your working directory to where the following files are located:

    • marengo_code_for_paper_jan_2023.R
    • Install_Packages.R
    • Marengo_newID_March242023.rds
    • genes_for_heatmap_fig5F.xlsx
    • all_res_deg_for_heat_updated_march2023.txt

    You can use the following code to set the working directory in R:

    > setwd(directory)

    4. Open the file titled "Install_Packages.R" and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies, setting up an environment in which the code in "marengo_code_for_paper_jan_2023.R" can be executed.

    5. Once the "Install_Packages.R" script has been successfully executed, restart RStudio or your IDE of choice.

    6. Open the file "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice.

    7. Execute the commands in "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice to generate the plots.

  2. Repository for the single cell RNA sequencing data analysis for the human...

    • explore.openaire.eu
    Updated Aug 26, 2023
    Cite
    Jonathan; Andrew; Pierre; Allart; Adrian (2023). Repository for the single cell RNA sequencing data analysis for the human manuscript. [Dataset]. http://doi.org/10.5281/zenodo.8286134
    Explore at:
    Dataset updated
    Aug 26, 2023
    Authors
    Jonathan; Andrew; Pierre; Allart; Adrian
    Description

    This is the GitHub repository for the single cell RNA sequencing data analysis for the human manuscript.

    The following essential libraries are required for script execution:

    • Seurat
    • scRepertoire
    • ggplot2
    • dplyr
    • ggridges
    • ggrepel
    • ComplexHeatmap

    Linked File

    --------------------------------------

    This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset contains raw FASTQ files as well as the aligned files that were deposited in GEO. Provided below are descriptions of the linked datasets:

    1. Gene Expression Omnibus (GEO) ID: GSE229626

    • Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR)
    • Description: This submission contains the matrix.mtx, barcodes.tsv, and genes.tsv files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
    • Submission type: Private. In order to gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

    2. Sequence read archive (SRA) repository

    • Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR)
    • Description: This submission contains the "raw sequencing" or .fastq.gz files, which are gzip-compressed text files in FASTQ format.
    • Submission type: Private. In order to gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

    Please note that since the GSE submission is private, the raw data deposited at SRA may not be accessible until the embargo on GSE229626 has been lifted.

    Installation and Instructions

    --------------------------------------

    The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

    > Ensure you have R version 4.1.2 or higher for compatibility.

    > Although it is not essential, you can use RStudio (Version 2022.12.0+353) for accessing and executing the code.

    The following code can be used to set the working directory in R:

    > setwd(directory)

    Steps:

    1. Download the "Human_code_April2023.R" and "Install_Packages.R" R scripts, and the processed data from GSE229626.

    2. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.

    3. Set your working directory to where the following files are located:

    • Human_code_April2023.R
    • Install_Packages.R

    4. Open the file titled Install_Packages.R and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies.

    5. Open the Human_code_April2023.R R script and execute commands as necessary.

  3. Dataset for the R package "DamageDetective: detecting damaged cells in...

    • zenodo.org
    bin
    Updated Apr 2, 2025
    Cite
    Alicen Henning (2025). Dataset for the R package "DamageDetective: detecting damaged cells in single-cell RNA sequencing" [Dataset]. http://doi.org/10.5281/zenodo.15117856
    Explore at:
    Available download formats: bin
    Dataset updated
    Apr 2, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alicen Henning
    License

    https://www.gnu.org/licenses/gpl-3.0-standalone.html

    Time period covered
    2025
    Description

    Single cell RNA sequencing (scRNA-seq) data originating from the SeuratData package under the name "pbmc3k". This dataset is utilized for vignette demonstrations of the DamageDetective R package.

  4. Data Repository: Single-cell mapper (scMappR): using scRNA-seq to infer...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Feb 12, 2021
    Cite
    Dustin Sokolowski; Mariela Faykoo-Martinez; Lauren Erdman; Huayun Hou; Cadia Chan; Helen Zhu; Melissa M. Holmes; Anna Goldenberg; Michael D Wilson (2021). Data Repository: Single-cell mapper (scMappR): using scRNA-seq to infer cell-type specificities of differentially expressed genes [Dataset]. http://doi.org/10.1101/2020.08.24.265298
    Explore at:
    Available download formats: bin
    Dataset updated
    Feb 12, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Dustin Sokolowski; Mariela Faykoo-Martinez; Lauren Erdman; Huayun Hou; Cadia Chan; Helen Zhu; Melissa M. Holmes; Anna Goldenberg; Michael D Wilson
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data repository for the scMappR manuscript:

    Abstract from bioRxiv (https://www.biorxiv.org/content/10.1101/2020.08.24.265298v1.full).

    RNA sequencing (RNA-seq) is widely used to identify differentially expressed genes (DEGs) and reveal biological mechanisms underlying complex biological processes. RNA-seq is often performed on heterogeneous samples and the resulting DEGs do not necessarily indicate the cell types where the differential expression occurred. While single-cell RNA-seq (scRNA-seq) methods solve this problem, technical and cost constraints currently limit its widespread use. Here we present single cell Mapper (scMappR), a method that assigns cell-type specificity scores to DEGs obtained from bulk RNA-seq by integrating cell-type expression data generated by scRNA-seq and existing deconvolution methods. After benchmarking scMappR using RNA-seq data obtained from sorted blood cells, we asked if scMappR could reveal known cell-type specific changes that occur during kidney regeneration. We found that scMappR appropriately assigned DEGs to cell-types involved in kidney regeneration, including a relatively small proportion of immune cells. While scMappR can work with any user supplied scRNA-seq data, we curated scRNA-seq expression matrices for ∼100 human and mouse tissues to facilitate its use with bulk RNA-seq data alone. Overall, scMappR is a user-friendly R package that complements traditional differential expression analysis available at CRAN.

  5. Data_Sheet_2_NormExpression: An R Package to Normalize Gene Expression Data...

    • frontiersin.figshare.com
    zip
    Updated Jun 1, 2023
    Cite
    Zhenfeng Wu; Weixiang Liu; Xiufeng Jin; Haishuo Ji; Hua Wang; Gustavo Glusman; Max Robinson; Lin Liu; Jishou Ruan; Shan Gao (2023). Data_Sheet_2_NormExpression: An R Package to Normalize Gene Expression Data Using Evaluated Methods.zip [Dataset]. http://doi.org/10.3389/fgene.2019.00400.s002
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers
    Authors
    Zhenfeng Wu; Weixiang Liu; Xiufeng Jin; Haishuo Ji; Hua Wang; Gustavo Glusman; Max Robinson; Lin Liu; Jishou Ruan; Shan Gao
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data normalization is a crucial step in the gene expression analysis as it ensures the validity of its downstream analyses. Although many metrics have been designed to evaluate the existing normalization methods, different metrics or different datasets by the same metric yield inconsistent results, particularly for the single-cell RNA sequencing (scRNA-seq) data. The worst situations could be that one method evaluated as the best by one metric is evaluated as the poorest by another metric, or one method evaluated as the best using one dataset is evaluated as the poorest using another dataset. Here raises an open question: principles need to be established to guide the evaluation of normalization methods. In this study, we propose a principle that one normalization method evaluated as the best by one metric should also be evaluated as the best by another metric (the consistency of metrics) and one method evaluated as the best using scRNA-seq data should also be evaluated as the best using bulk RNA-seq data or microarray data (the consistency of datasets). Then, we designed a new metric named Area Under normalized CV threshold Curve (AUCVC) and applied it with another metric mSCC to evaluate 14 commonly used normalization methods using both scRNA-seq data and bulk RNA-seq data, satisfying the consistency of metrics and the consistency of datasets. Our findings paved the way to guide future studies in the normalization of gene expression data with its evaluation. The raw gene expression data, normalization methods, and evaluation metrics used in this study have been included in an R package named NormExpression. NormExpression provides a framework and a fast and simple way for researchers to select the best method for the normalization of their gene expression data based on the evaluation of different methods (particularly some data-driven methods or their own methods) in the principle of the consistency of metrics and the consistency of datasets.

  6. ProjecTILs murine reference atlas of tumor-infiltrating T cells, version 1

    • figshare.com
    application/gzip
    Updated Jun 29, 2023
    Cite
    Massimo Andreatta; Santiago Carmona (2023). ProjecTILs murine reference atlas of tumor-infiltrating T cells, version 1 [Dataset]. http://doi.org/10.6084/m9.figshare.12478571.v2
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Jun 29, 2023
    Dataset provided by
    figshare
    Authors
    Massimo Andreatta; Santiago Carmona
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We have developed ProjecTILs, a computational approach to project new data sets into a reference map of T cells, enabling their direct comparison in a stable, annotated system of coordinates. Because new cells are embedded in the same space of the reference, ProjecTILs enables the classification of query cells into annotated, discrete states, but also over a continuous space of intermediate states. By comparing multiple samples over the same map, and across alternative embeddings, the method allows exploring the effect of cellular perturbations (e.g. as the result of therapy or genetic engineering) and identifying genetic programs significantly altered in the query compared to a control set or to the reference map. We illustrate the projection of several data sets from recent publications over two cross-study murine T cell reference atlases: the first describing tumor-infiltrating T lymphocytes (TILs), the second characterizing acute and chronic viral infection.

    To construct the reference TIL atlas, we obtained single-cell gene expression matrices from the following GEO entries: GSE124691, GSE116390, GSE121478, GSE86028; and entry E-MTAB-7919 from Array-Express. Data from GSE124691 contained samples from tumor and from tumor-draining lymph nodes, and were therefore treated as two separate datasets. For the TIL projection examples (OVA Tet+, miR-155 KO and Regnase-KO), we obtained the gene expression counts from entries GSE122713, GSE121478 and GSE137015, respectively.

    Prior to dataset integration, single-cell data from individual studies were filtered using TILPRED-1.0 (https://github.com/carmonalab/TILPRED), which removes cells not enriched in T cell markers (e.g. Cd2, Cd3d, Cd3e, Cd3g, Cd4, Cd8a, Cd8b1) and cells enriched in non T cell genes (e.g. Spi1, Fcer1g, Csf1r, Cd19). Dataset integration was performed using STACAS (https://github.com/carmonalab/STACAS), a batch-correction algorithm based on Seurat 3. For the TIL reference map, we specified 600 variable genes per dataset, excluding cell cycling genes, mitochondrial, ribosomal and non-coding genes, as well as genes expressed in less than 0.1% or more than 90% of the cells of a given dataset. For integration, a total of 800 variable genes were derived as the intersection of the 600 variable genes of individual datasets, prioritizing genes found in multiple datasets and, in case of draws, those derived from the largest datasets. We determined pairwise dataset anchors using STACAS with default parameters, and filtered anchors using an anchor score threshold of 0.8. Integration was performed using the IntegrateData function in Seurat3, providing the anchor set determined by STACAS, and a custom integration tree to initiate alignment from the largest and most heterogeneous datasets.

    Next, we performed unsupervised clustering of the integrated cell embeddings using the Shared Nearest Neighbor (SNN) clustering method implemented in Seurat 3 with parameters {resolution=0.6, reduction="umap", k.param=20}. We then manually annotated individual clusters (merging clusters when necessary) based on several criteria: i) average expression of key marker genes in individual clusters; ii) gradients of gene expression over the UMAP representation of the reference map; iii) gene-set enrichment analysis to determine over- and under-expressed genes per cluster using MAST. In order to have access to predictive methods for UMAP, we recomputed PCA and UMAP embeddings independently of Seurat3 using respectively the prcomp function from basic R package "stats", and the "umap" R package (https://github.com/tkonopka/umap).
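    The gene-selection rule described above (drop genes detected in too few or too many cells, then pool per-dataset variable genes, favouring genes shared across datasets and breaking ties by dataset size) can be illustrated with a small sketch. This is one possible reading of the description, written in plain Python, and not the STACAS implementation; the function names, toy gene lists, and dataset sizes are all hypothetical.

```python
def filter_by_detection(fraction_expressing, low=0.001, high=0.90):
    """Keep genes detected in at least `low` and at most `high` of the cells."""
    return {gene for gene, frac in fraction_expressing.items() if low <= frac <= high}


def select_integration_genes(variable_genes, dataset_sizes, n_genes):
    """Rank genes by how many datasets list them; break ties by largest dataset."""
    support = {}
    for name, genes in variable_genes.items():
        for gene in genes:
            count, largest = support.get(gene, (0, 0))
            support[gene] = (count + 1, max(largest, dataset_sizes[name]))
    # Sort by dataset count, then by size of the largest contributing dataset,
    # then alphabetically so the result is deterministic.
    ranked = sorted(support, key=lambda g: (-support[g][0], -support[g][1], g))
    return ranked[:n_genes]


# Hypothetical detection fractions for three genes in one dataset.
kept = filter_by_detection({"Cd3e": 0.45, "Rare1": 0.0005, "Actb": 0.99})

# Hypothetical per-dataset variable gene lists and cell counts.
variable_genes = {
    "A": ["Cd8a", "Pdcd1", "Tox"],
    "B": ["Cd8a", "Gzmb"],
    "C": ["Pdcd1", "Cd8a", "Il7r"],
}
dataset_sizes = {"A": 5000, "B": 1200, "C": 800}
selected = select_integration_genes(variable_genes, dataset_sizes, n_genes=4)
print(sorted(kept), selected)
```

    With the thresholds from the text, `low=0.001` corresponds to 0.1% of cells and `high=0.90` to 90%; in the described workflow `n_genes` would be 800 and each list would hold 600 genes.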

  7. pbmc single cell RNA-seq matrix

    • zenodo.org
    csv
    Updated May 4, 2021
    Cite
    Samuel Buchet; Francesco Carbone; Morgan Magnin; Mickaël Ménager; Olivier Roux (2021). pbmc single cell RNA-seq matrix [Dataset]. http://doi.org/10.5281/zenodo.4730807
    Explore at:
    Available download formats: csv
    Dataset updated
    May 4, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Samuel Buchet; Francesco Carbone; Morgan Magnin; Mickaël Ménager; Olivier Roux
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Single cell RNA-sequencing dataset of peripheral blood mononuclear cells (pbmc: T, B, NK and monocytes) extracted from two healthy donors.

    Cells labeled as C26 come from a 30-year-old female and cells labeled as C27 come from a 53-year-old male. Cells were isolated from blood using Ficoll. Samples were sequenced using standard 3' v3 chemistry protocols by 10x Genomics. Cellranger v4.0.0 was used for the processing, and reads were aligned to the Ensembl GRCh38 human genome (GRCg38_r98-ensembl_Sept2019). QC metrics were calculated on the count matrix generated by cellranger (filtered_feature_bc_matrix). Cells with fewer than 3 genes per cell, fewer than 500 reads per cell and more than 20% of mitochondrial genes were discarded.
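    The QC thresholds quoted above can be expressed as a simple per-cell filter. The sketch below is plain Python for illustration only (the dataset itself was processed with Seurat in R) and assumes a cell is discarded as soon as it fails any one of the three criteria; the `CellQC` record and the example barcodes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    barcode: str
    n_genes: int      # number of genes detected in the cell
    n_reads: int      # number of reads assigned to the cell
    pct_mito: float   # percentage of counts from mitochondrial genes


def passes_qc(cell, min_genes=3, min_reads=500, max_pct_mito=20.0):
    """Return True when the cell meets all three thresholds."""
    return (
        cell.n_genes >= min_genes
        and cell.n_reads >= min_reads
        and cell.pct_mito <= max_pct_mito
    )


# Hypothetical cells: one good, one per failing criterion.
cells = [
    CellQC("AAAC", n_genes=1200, n_reads=4500, pct_mito=4.2),
    CellQC("AAAG", n_genes=2, n_reads=900, pct_mito=1.0),
    CellQC("AACT", n_genes=800, n_reads=350, pct_mito=3.5),
    CellQC("AAGG", n_genes=950, n_reads=2100, pct_mito=27.8),
]
kept_barcodes = [cell.barcode for cell in cells if passes_qc(cell)]
print(kept_barcodes)
```

    Keeping the thresholds as keyword arguments makes it easy to see how many cells each criterion removes when one value is changed at a time.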

    The processing steps were performed with the R package Seurat (https://satijalab.org/seurat/), including sample integration, data normalisation and scaling, dimensional reduction, and clustering. The SCTransform method was adopted for the normalisation and scaling steps. The clustered cells were manually annotated using known cell type markers.

    Files content:

    - raw_dataset.csv: raw gene counts

    - normalized_dataset.csv: normalized gene counts (single cell matrix)

    - cell_types.csv: cell types identified from annotated cell clusters

    - cell_types_macro.csv: cell macro types

    - UMAP_coordinates.csv: 2d cell coordinates computed with UMAP algorithm in Seurat

  8. cellCounts

    • opal.latrobe.edu.au
    • researchdata.edu.au
    bin
    Updated Dec 19, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Yang Liao; Dinesh Raghu; Bhupinder Pal; Lisa Mielke; Wei Shi (2022). cellCounts [Dataset]. http://doi.org/10.26181/21588276.v3
    Explore at:
    Available download formats: bin
    Dataset updated
    Dec 19, 2022
    Dataset provided by
    La Trobe
    Authors
    Yang Liao; Dinesh Raghu; Bhupinder Pal; Lisa Mielke; Wei Shi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This page includes the data and code necessary to reproduce the results of the following paper: Yang Liao, Dinesh Raghu, Bhupinder Pal, Lisa Mielke and Wei Shi. cellCounts: fast and accurate quantification of 10x Chromium single-cell RNA sequencing data. Under review. A Linux computer running an operating system of CentOS 7 (or later) or Ubuntu 20.04 (or later) is recommended for running this analysis. The computer should have >2 TB of disk space and >64 GB of RAM. The following software packages need to be installed before running the analysis. Software executables generated after installation should be included in the $PATH environment variable.

    • R (v4.0.0 or newer) https://www.r-project.org/
    • Rsubread (v2.12.2 or newer) http://bioconductor.org/packages/3.16/bioc/html/Rsubread.html
    • CellRanger (v6.0.1) https://support.10xgenomics.com/single-cell-gene-expression/software/overview/welcome
    • STARsolo (v2.7.10a) https://github.com/alexdobin/STAR
    • sra-tools (v2.10.0 or newer) https://github.com/ncbi/sra-tools
    • Seurat (v3.0.0 or newer) https://satijalab.org/seurat/
    • edgeR (v3.30.0 or newer) https://bioconductor.org/packages/edgeR/
    • limma (v3.44.0 or newer) https://bioconductor.org/packages/limma/
    • mltools (v0.3.5 or newer) https://cran.r-project.org/web/packages/mltools/index.html

    Reference packages generated by 10x Genomics are also required for this analysis and they can be downloaded from the following link (2020-A version for individual human and mouse reference packages should be selected): https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest

    After all these are done, you can simply run the shell script ‘test-all-new.bash’ to perform all the analyses carried out in the paper. This script will automatically download the mixture scRNA-seq data from the SRA database, and it will output a text file called ‘test-all.log’ that contains all the screen outputs and speed/accuracy results of CellRanger, STARsolo and cellCounts.
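    Because the analysis expects the installed executables to be reachable through the $PATH environment variable, a quick pre-flight check can save a failed run. The following is a minimal sketch using Python's standard library; the executable names in `REQUIRED` are assumptions about what each package installs (for example `STAR` for STARsolo and `fasterq-dump` for sra-tools) and may differ from what the shell script actually calls.

```python
import shutil


def find_executables(names):
    """Map each executable name to its resolved path on $PATH, or None."""
    return {name: shutil.which(name) for name in names}


# Assumed executable names; adjust to match the local installation.
REQUIRED = ["R", "cellranger", "STAR", "fasterq-dump"]

locations = find_executables(REQUIRED)
missing = [name for name, path in locations.items() if path is None]

if missing:
    print("Not found on $PATH:", ", ".join(missing))
else:
    print("All required executables were found.")
```

    This only confirms that a program with the given name can be found; it does not check the version requirements listed above.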

  9. scRNA-seq data of lung cancer

    • scidb.cn
    Updated Jul 21, 2022
    Cite
    Weimin Li (2022). scRNA-seq data of lung cancer [Dataset]. http://doi.org/10.57760/sciencedb.02028
    Explore at:
    Dataset updated
    Jul 21, 2022
    Dataset provided by
    Science Data Bank
    Authors
    Weimin Li
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    We collected 40 tumor and adjacent normal tissue samples from 19 pathologically diagnosed NSCLC patients (10 LUAD and 9 LUSC) during surgical resections, rapidly digested the tissues to obtain single-cell suspensions, and constructed the cDNA libraries of these samples within 24 hours using the 10x Genomics protocol. These libraries were sequenced on the Illumina NovaSeq 6000 platform. Finally, the raw gene expression matrices were generated using CellRanger (version 3.0.1). Information was processed in R (version 3.6.0) using the Seurat R package (version 2.3.4).

  10. WORKSHOP: Single cell RNAseq analysis in R

    • explore.openaire.eu
    Updated Sep 26, 2023
    Cite
    Sarah Williams; Adele Barugahare; Paul Harrison; Laura Perlaza Jimenez; Nicholas Matigan; Valentine Murigneux; Magdalena Antczak; Uwe Winter (2023). WORKSHOP: Single cell RNAseq analysis in R [Dataset]. http://doi.org/10.5281/zenodo.10042918
    Explore at:
    Dataset updated
    Sep 26, 2023
    Authors
    Sarah Williams; Adele Barugahare; Paul Harrison; Laura Perlaza Jimenez; Nicholas Matigan; Valentine Murigneux; Magdalena Antczak; Uwe Winter
    Description

    This record includes training materials associated with the Australian BioCommons workshop 'Single cell RNAseq analysis in R'. This workshop took place over two 3.5-hour sessions on 26 and 27 October 2023.

    Event description

    Analysis and interpretation of single cell RNAseq (scRNAseq) data requires dedicated workflows. In this hands-on workshop we will show you how to perform single cell analysis using Seurat - an R package for QC, analysis, and exploration of single-cell RNAseq data. We will discuss the 'why' behind each step and cover reading in the count data, quality control, filtering, normalisation, clustering, UMAP layout and identification of cluster markers. We will also explore various ways of visualising single cell expression data.

    This workshop is presented by the Australian BioCommons, Queensland Cyber Infrastructure Foundation (QCIF) and the Monash Genomics and Bioinformatics Platform with the assistance of a network of facilitators from the national Bioinformatics Training Cooperative.

    Lead trainers: Sarah Williams, Adele Barugahare, Paul Harrison, Laura Perlaza Jimenez
    Facilitators: Nick Matigan, Valentine Murigneux, Magdalena (Magda) Antczak
    Infrastructure provision: Uwe Winter
    Coordinator: Melissa Burke

    Training materials

    Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.

    Files and materials included in this record:

    • Event metadata (PDF): Information about the event including description, event URL, learning objectives, prerequisites, technical requirements etc.
    • Index of training materials (PDF): List and description of all materials associated with this event including the name, format, location and a brief description of each file.
    • scRNAseq_Schedule (PDF): A breakdown of the topics and timings for the workshop

    Materials shared elsewhere:

    This workshop follows the tutorial 'scRNAseq Analysis in R with Seurat' https://swbioinf.github.io/scRNAseqInR_Doco/index.html

    Slides used to introduce key topics are available via GitHub https://github.com/swbioinf/scRNAseqInR_Doco/tree/main/slides

    This material is based on the introductory Guided Clustering Tutorial from Seurat. It also draws from a similar workshop held by Monash Bioinformatics Platform Single-Cell-Workshop, with material here.

  11. Data from: Reference transcriptomics of porcine peripheral immune cells...

    • agdatacommons.nal.usda.gov
    • datasets.ai
    • +3more
    zip
    Updated May 6, 2025
    Cite
    Juber Herrera-Uribe; Jayne Wiarda; Sathesh K. Sivasankaran; Lance Daharsh; Haibo Liu; Kristen A. Byrne; Timothy P. L. Smith; Joan K. Lunney; Crystal L. Loving; Christopher K. Tuggle (2025). Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing [Dataset]. http://doi.org/10.15482/USDA.ADC/1522411
    Explore at:
    Available download formats: zip
    Dataset updated
    May 6, 2025
    Dataset provided by
    Ag Data Commons
    Authors
    Juber Herrera-Uribe; Jayne Wiarda; Sathesh K. Sivasankaran; Lance Daharsh; Haibo Liu; Kristen A. Byrne; Timothy P. L. Smith; Joan K. Lunney; Crystal L. Loving; Christopher K. Tuggle
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464).

    Resources in this dataset:

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format.
    File Name: PBMC7_AllCells.zip
    Resource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows:

      • matrix of gene counts* (matrix.mtx.gz)
      • gene names (features.tsv.gz)
      • cell IDs (barcodes.tsv.gz)

    *The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, we specified to calculate non-integer count estimations, so most gene counts are actually non-integer values in this matrix but should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata.
    File Name: PBMC7_AllCells_meta.csv
    Resource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:

      • nCount_RNA = the number of transcripts detected in a cell
      • nFeature_RNA = the number of genes detected in a cell
      • Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells
      • prcntMito = percent mitochondrial reads in a cell
      • Scrublet = doublet probability score assigned to a cell
      • seurat_clusters = cluster ID assigned to a cell
      • PaperIDs = sample ID for a cell
      • celltypes = cell type ID assigned to a cell

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates.
    File Name: PBMC7_AllCells_PCAcoord.csv
    Resource Description: .csv file containing first 100 PCA coordinates for cells.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates.
    File Name: PBMC7_AllCells_tSNEcoord.csv
    Resource Description: .csv file containing t-SNE coordinates for all cells.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates.
    File Name: PBMC7_AllCells_UMAPcoord.csv
    Resource Description: .csv file containing UMAP coordinates for all cells.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates.
    File Name: PBMC7_CD4only_tSNEcoord.csv
    Resource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates.
    File Name: PBMC7_CD4only_UMAPcoord.csv
    Resource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates.
    File Name: PBMC7_GDonly_UMAPcoord.csv
    Resource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates.
    File Name: PBMC7_GDonly_tSNEcoord.csv
    Resource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information.
    File Name: UnfilteredGeneInfo.txt
    Resource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat.
    File Name: PBMC7.tar
    Resource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().
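    The dataset documents Read10X() in R as the way to load the 10X-format folder. For readers working outside R, the following is a minimal sketch of how such a folder (a Matrix Market count matrix plus gene and cell lists) could be parsed with the Python standard library alone. The file names follow the usual 10X convention, the choice of the first column of features.tsv.gz as the gene name is an assumption, and the toy genes, cells and values are made up purely to exercise the parser. Counts are kept as floats because the description states that most values are non-integer after ambient RNA removal.

```python
import gzip
import os
import tempfile


def read_lines(path):
    """Return the non-empty lines of a (possibly gzip-compressed) text file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


def read_10x(folder):
    """Parse a 10X-style folder into gene names, cell IDs, and sparse counts."""
    # Assumption: the first tab-separated column of features.tsv.gz names the gene.
    feature_rows = read_lines(os.path.join(folder, "features.tsv.gz"))
    genes = [row.split("\t")[0] for row in feature_rows]
    cells = read_lines(os.path.join(folder, "barcodes.tsv.gz"))

    # Matrix Market layout: '%' lines are header/comments, then one size line
    # (rows, columns, entries), then one "row column value" triplet per entry.
    matrix_rows = read_lines(os.path.join(folder, "matrix.mtx.gz"))
    body = [row for row in matrix_rows if not row.startswith("%")]
    n_genes, n_cells, n_entries = (int(x) for x in body[0].split())
    if (n_genes, n_cells) != (len(genes), len(cells)):
        raise ValueError("matrix dimensions do not match features/barcodes")

    counts = {}
    for row in body[1:]:
        gene_idx, cell_idx, value = row.split()
        # Indices are 1-based; values stay floats since counts may be non-integer.
        counts[(genes[int(gene_idx) - 1], cells[int(cell_idx) - 1])] = float(value)
    if len(counts) != n_entries:
        raise ValueError("number of entries does not match the size line")
    return genes, cells, counts


# Toy folder with made-up names and values (not taken from the real dataset).
toy_files = {
    "features.tsv.gz": "GENE_A\tGENE_A\tGene Expression\n"
                       "GENE_B\tGENE_B\tGene Expression\n",
    "barcodes.tsv.gz": "CELL_1\nCELL_2\n",
    "matrix.mtx.gz": "%%MatrixMarket matrix coordinate real general\n"
                     "2 2 3\n1 1 1.5\n2 1 0.25\n2 2 3.0\n",
}
with tempfile.TemporaryDirectory() as folder:
    for name, text in toy_files.items():
        with gzip.open(os.path.join(folder, name), "wt") as handle:
            handle.write(text)
    genes, cells, counts = read_10x(folder)

# Per-cell totals, i.e. the unnormalized sum of counts in each cell.
totals = {cell: 0.0 for cell in cells}
for (gene, cell), value in counts.items():
    totals[cell] += value
print(totals)
```

    For real analyses a dedicated sparse-matrix reader is likely a better fit; the dictionary here only keeps the sketch dependency-free.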

  12. Single-cell Atlas Reveals Diagnostic Features Predicting Progressive Drug...

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +1more
    Updated Aug 6, 2021
    Vaidehi Krishnan; Florian Schmidt; Zahid Nawaz; Prasanna Nori Venkatesh; Lee Kian Leong; Chan Zhu En; Alice Man Sze Cheung; Sudipto Bari; Meera Makheja; Ahmad Lajam; Pavanish Kumar; John Ouyang; Owen Rackham; William Ying Khee Hwang; Salvatore Albani; Charles Chuah; Shyam Prabhakar; Sin Tiong Ong (2021). Single-cell Atlas Reveals Diagnostic Features Predicting Progressive Drug Resistance in Chronic Myeloid Leukemia [Dataset]. http://doi.org/10.5281/zenodo.7337398
    Explore at:
    Dataset updated
    Aug 6, 2021
    Authors
    Vaidehi Krishnan; Florian Schmidt; Zahid Nawaz; Prasanna Nori Venkatesh; Lee Kian Leong; Chan Zhu En; Alice Man Sze Cheung; Sudipto Bari; Meera Makheja; Ahmad Lajam; Pavanish Kumar; John Ouyang; Owen Rackham; William Ying Khee Hwang; Salvatore Albani; Charles Chuah; Shyam Prabhakar; Sin Tiong Ong
    Description

    This archive contains scRNA-seq and CyTOF data in the form of Seurat objects, txt and csv files, as well as R scripts for data analysis and figure generation. A summary of the content is provided below.

    R scripts

      • Script to run machine learning models predicting group-specific marker genes: CML_Find_Markers_Zenodo.R
      • Script to reproduce the majority of Main and Supplementary Figures shown in the manuscript: CML_Paper_Figures_Zenodo.R
      • Script to run inferCNV analysis: inferCNV_Zenodo.R
      • Script to plot NATMI analysis results: NATMI_CvsA_FC0.32_Updown_Column_plot_Zenodo.R
      • Script to conduct sub-clustering and filtering of NK cells: NK_Marker_Detection_Zenodo.R
      • Helper scripts for plotting and DEG calculation: ComputePairWiseDE_v2.R, Seurat_DE_Heatmap_RCA_Style.R

    RDS files

    General scRNA-seq Seurat objects:

      • scRNA-seq Seurat object after QC and cell type annotation, used for most analysis in the manuscript: DUKE_DataSet_Doublets_Removed_Relabeled.RDS
      • scRNA-seq including findings e.g. from NK analysis used in the shiny app: DUKE_final_for_Shiny_App.rds
      • Neighborhood enrichment score computed for group A across all HSPCs: Enrichment_score_global_groupA.RDS
      • UMAP coordinates used in the article: Layout_2D_nNeighbours_25_Metric_cosine_TCU_removed.RDS

    SCENIC files:

      • Regulon set used in SCENIC: 2.6_regulons_asGeneSet.Rds
      • AUC values computed for regulons: 3.4_regulonAUC.Rds
      • MetaData used in SCENIC: cellInfo.Rds
      • Group specific regulons for LSC: groupSpecificRegulonsBCRAblP.RDS
      • Patient specific regulons for LSC: patientSpecificRegulonsBCRAblP.RDS
      • Patient specificity score for LSC: PatientSpecificRegulonSpecificityScoreBCRAblP.RDS
      • Regulon specificity score for LSC: RegulonSpecificityScoreBCRAblP.RDS

    BCR-ABL1 inference:

      • HSC with inferred BCR-ABL1 label: HSCs_CML_with_BCR-Abl_label.RDS
      • UMAP for HSC with inferred BCR-ABL1 label: HSCs_CML_with_BCR-Abl_label_UMAP.RDS
      • HSPCs with BCR-ABL1 module scores: HSPC_metacluster_74K_with_modscore_27thmay.RDS

    NK sub-clustering and filtering:

      • NK object with module scores: NK_8617cells_with_modscore_1stjune.RDS
      • Feature genes for NK cells computed with DubStepR: NK_Cells_DubStepR
      • NK cells Seurat object excluding contaminating T and B cells: NK_cells_T_B_17_removed.RDS
      • NK Seurat object including neighbourhood enrichment score calculations: NK_seurat_object_with_enrichment_labels_V2.RDS

    txt and csv files

      • Proportions per cluster calculated from CyTOF: CyTOF_Proportions.txt
      • Correlation between scRNAseq and CyTOF cell type abundance: scRNAseq_Cor_Cytof.txt
      • Correlation between manual gating and FlowSOM clustering: Manual_vs_FlowSOM.txt
      • GSEA results for HSPC, HSC and LSC: FINAL_GSEA_DATA_For_GGPLOT.txt
      • GSEA results for NK: NK_For_Plotting.txt
      • TFRC and HLA expression: TFRC_and_HLA_Values.txt
      • NATMI result files: UP-regulated_mean.csv, DOWN-regulated_mean.csv
      • Gene position file used in inferCNV: inferCNV_gene_positions_hg38.txt
      • Module scores for NK subclusters per cell: NK_Supplementary_Module_Scores.csv

    Compressed folders

      • All CyTOF raw data files: CyTOF_Data_raw.zip
      • Results of the patient-based classifier: PatientwiseClassifier.zip
      • Results of the single-cell based classifier: SingleCellClassifierResults.zip

    For general new data analysis approaches, we recommend that readers use the Seurat object stored in DUKE_final_for_Shiny_App.rds, or use the shiny app (http://scdbm.ddnetbio.com/) and perform further analysis from there.

    Raw data are available at EGA upon request using Study ID: EGAS00001005509

    Revision

    The for_CML_manuscript_revision.tar.gz folder contains scripts and data for the paper revision, including 1) Detection of the BCR-ABL fusion with long read sequencing; 2) Identification of BCR-ABL junction reads with scRNAseq; 3) Detection of expressed mutations using scRNAseq.

  13. Data from: Robust clustering and interpretation of scRNA-seq data using...

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 30, 2021
    Florian Schmidt (2021). Robust clustering and interpretation of scRNA-seq data using reference component analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4021966
    Explore at:
    Dataset updated
    May 30, 2021
    Dataset provided by
    Florian Schmidt
    Bobby Ranjan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets and Code accompanying the new release of RCA, RCA2. The R-package for RCA2 is available at GitHub: https://github.com/prabhakarlab/RCAv2/

    The datasets included here are:

    Datasets required for a characterization of batch effects:

    merged_rna_seurat.rds

    de_list.rds

    mergedRCAObj.rds

    merged_rna_integrated.rds

    10X_PBMCs.RDS: RCA2 object of processed 10X PBMC data (10X PBMC example data sets)

    NBM_RDS_Files.zip: Several RDS files containing the RCA2 object of Normal Bone Marrow (NBM) data, UMAP coordinates, doublet finder results and metadata (Normal Bone Marrow use case)

    Dataset used for the Covid19 example:

    blish_covid.seu.rds

    rownames_of_glocal_projection_immune_cells.txt

    Blish_RCA_no_QC_filtering_project_to_multiple_panels.rds

    Data sets used to outline the ability of supervised clustering to detect disease states:

    809653.seurat.rds

    blish_covid.seu.rds

    Performance benchmarking results:

    Memory_consumption.txt

    rca_time_list.rds

    Scanpy input files:

    input_data.zip

    The R script provides R code to regenerate the main paper Figures 2 to 7 modulo some visual modifications performed in Inkscape.

    Provided R scripts are:

    ComputePairWiseDE_v2.R (Required code for pairwise DE computation)

    RCA_Figure_Reproduction.R

    Provided Python code for Scanpy analysis:

    RA_Scanpy.ipynb

    CITESeq_Scanpy.ipynb

  14. beachmat: A Bioconductor C++ API for accessing high-throughput biological...

    • plos.figshare.com
    pdf
    Updated May 31, 2023
    Aaron T. L. Lun; Hervé Pagès; Mike L. Smith (2023). beachmat: A Bioconductor C++ API for accessing high-throughput biological data from a variety of R matrix types [Dataset]. http://doi.org/10.1371/journal.pcbi.1006135
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS Computational Biology
    Authors
    Aaron T. L. Lun; Hervé Pagès; Mike L. Smith
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Biological experiments involving genomics or other high-throughput assays typically yield a data matrix that can be explored and analyzed using the R programming language with packages from the Bioconductor project. Improvements in the throughput of these assays have resulted in an explosion of data even from routine experiments, which poses a challenge to the existing computational infrastructure for statistical data analysis. For example, single-cell RNA sequencing (scRNA-seq) experiments frequently generate large matrices containing expression values for each gene in each cell, requiring sparse or file-backed representations for memory-efficient manipulation in R. These alternative representations are not easily compatible with high-performance C++ code used for computationally intensive tasks in existing R/Bioconductor packages. Here, we describe a C++ interface named beachmat, which enables agnostic data access from various matrix representations. This allows package developers to write efficient C++ code that is interoperable with dense, sparse and file-backed matrices, amongst others. We evaluated the performance of beachmat for accessing data from each matrix representation using both simulated and real scRNA-seq data, and defined a clear memory/speed trade-off to motivate the choice of an appropriate representation. We also demonstrate how beachmat can be incorporated into the code of other packages to drive analyses of a very large scRNA-seq data set.

  15. Multifocal, multiphenotypic tumours arising from an MTOR mutation acquired...

    • data.mendeley.com
    Updated Dec 8, 2023
    + more versions
    Chloe Pacyna (2023). Multifocal, multiphenotypic tumours arising from an MTOR mutation acquired in early embryogenesis - scRNA-seq dataset [Dataset]. http://doi.org/10.17632/6fg25sm5g8.1
    Explore at:
    Dataset updated
    Dec 8, 2023
    Authors
    Chloe Pacyna
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This R dataset contains a Seurat object with scRNA-seq counts and metadata from a patient with multifocal renal tumors. It is associated with the paper "Multifocal, multiphenotypic tumours arising from an MTOR mutation acquired in early embryogenesis".

  16. Single-cell RNA-seq exposed to multiple compounds

    • kaggle.com
    zip
    Updated Mar 11, 2021
    Alexander Chervov (2021). Single-cell RNA-seq exposed to multiple compounds [Dataset]. https://www.kaggle.com/alexandervc/singlecell-rnaseq-exposed-to-multiple-compounds
    Explore at:
    Available download formats: zip (4314885205 bytes)
    Dataset updated
    Mar 11, 2021
    Authors
    Alexander Chervov
    Description

    Remark 1: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

    Remark 2: The same data are also available at https://www.kaggle.com/datasets/alexandervc/scrnaseq-exposed-to-multiple-compounds as pieces extracted from the huge file here, which are easier to load and work with.

    Data and Context

    Data - results of single-cell RNA sequencing, i.e. rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics
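    As a small illustration of the two orientations mentioned above, the sketch below builds a toy cells-by-genes matrix and its genes-by-cells transpose, and shows that the same expression value is found in both once row and column labels are swapped. All cell names, gene names and values are made up for illustration and are not taken from the dataset.

```python
# Toy cells-by-genes matrix: each row is a cell, each column is a gene.
cells = ["CELL_1", "CELL_2"]
genes = ["GENE_A", "GENE_B", "GENE_C"]
cells_by_genes = [
    [5, 0, 2],  # expression of GENE_A, GENE_B, GENE_C in CELL_1
    [0, 3, 1],  # expression of GENE_A, GENE_B, GENE_C in CELL_2
]

# The "vice versa" orientation: each row is a gene, each column is a cell.
genes_by_cells = [list(column) for column in zip(*cells_by_genes)]


def expression(matrix, row_labels, col_labels, row, col):
    """Look up one value by its row and column label."""
    return matrix[row_labels.index(row)][col_labels.index(col)]


# The same measurement is reachable in either orientation.
value_cells_first = expression(cells_by_genes, cells, genes, "CELL_1", "GENE_C")
value_genes_first = expression(genes_by_cells, genes, cells, "GENE_C", "CELL_1")
print(value_cells_first, value_genes_first)
```

    Checking which orientation a file uses before analysis matters because tools differ in their expectations.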

    Particular data:

    Data - scRNA expressions for several cell lines affected by drugs with different doses/durations.

    The data are from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE139944

      • Status: Public on Dec 05, 2019
      • Title: Massively multiplex chemical transcriptomics at single cell resolution
      • Organisms: Homo sapiens; Mus musculus
      • Experiment type: Expression profiling by high throughput sequencing
      • Summary: Single-cell RNA-seq libraries were generated using two and three level single-cell combinatorial indexing RNA sequencing (sci-RNA-seq) of untreated or small molecule inhibitor exposed HEK293T, NIH3T3, A549, MCF7 and K562 cells. Different cells and different treatment were hashed and pooled prior to sci-RNA-seq using a nuclear barcoding strategy. This nuclear barcoding strategy relies on fixation of barcode containing well-specific oligos that are specific to a given cell type, replicate or treatment condition.

    The corresponding paper is here: https://pubmed.ncbi.nlm.nih.gov/31806696/ Science. 2020 Jan 3;367(6473):45-51 "Massively multiplex chemical transcriptomics at single-cell resolution" Sanjay R Srivatsan, ... , Cole Trapnell

    The authors split the data into 4 subdatasets - see sciPlex1, sciPlex2, sciPlex3, sciPlex4 in the filenames. The main dataset is sciPlex3, which contains about 600K cells.

    The data split into smaller parts, each of which can easily be loaded into memory, can be found at https://www.kaggle.com/alexandervc/scrnaseq-exposed-to-multiple-compounds

    Related datasets:

    Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

    A collection of some bioinformatics related resources on kaggle: https://www.kaggle.com/general/203136

    Inspiration

    Single cell RNA sequencing is an important technology in modern biology, see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

    Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

    Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

    (Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

    Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages273–282 (2019)

    Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

    Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12

  17. Processed data for "Dissociation of solid tumour tissues with cold active...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Campbell, Kieran (2020). Processed data for "Dissociation of solid tumour tissues with cold active protease for single-cell RNA-seq minimizes conserved collagenase-associated stress responses" [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_3407790
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset authored and provided by
    Campbell, Kieran
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A tar.gz archive of processed data for the publication "Dissociation of solid tumour tissues with cold active protease for single-cell RNA-seq minimizes conserved collagenase-associated stress responses" (O'Flanagan et al. 2019). It contains compressed R files (rds) holding SingleCellExperiment (https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html) objects and a metadata csv.

  18. Community Package Lasry Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 5, 2024
    + more versions
    Celik, Muhammet (2024). Community Package Lasry Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10125579
    Explore at:
    Dataset updated
    Feb 5, 2024
    Dataset provided by
    Celik, Muhammet
    Solovey, Maria
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Community package (https://github.com/SoloveyMaria/community) is an R package designed for the analysis of single-cell RNA sequencing data, specifically for inferring interactions between different cell types. The dataset provided here is compatible with the Community tool, allowing for direct utilization. The dataset associated with this research has undergone peer review and has been published in the journal Nature Cancer. The publication can be accessed via the following link: https://doi.org/10.1038/s43018-022-00480-0. For access to the raw data, please visit: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE185381. It's important to note that the data in this repository has undergone batch correction and normalization, and the corresponding metadata has been appropriately adjusted. This processed data serves as the input for the Community tool.

  19. List of tumor microenvironment scRNA-seq datasets included in TMExplorer.

    • plos.figshare.com
    xls
    Updated Jun 16, 2023
    Erik Christensen; Alaine Naidas; David Chen; Mia Husic; Parisa Shooshtari (2023). List of tumor microenvironment scRNA-seq datasets included in TMExplorer. [Dataset]. http://doi.org/10.1371/journal.pone.0272302.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 16, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Erik Christensen; Alaine Naidas; David Chen; Mia Husic; Parisa Shooshtari
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List of tumor microenvironment scRNA-seq datasets included in TMExplorer.

  20. Data from: Natural variation in a cortex/epidermis-specific transcription...

    • search.dataone.org
    • datadryad.org
    Updated Feb 27, 2025
    Pengcheng Li; Tianze Zhu; Yunyun Wang; Xiaomin Zhang; Xiaoyi Yang; Shuai Fang; Wei Li; Wenye Rui; Aiqing Yang; Yamin Duan; Yuxing Yan; Qingchun Pan; Zhongtao Jia; Houmiao Wang; Zefeng Yang; Peng Yu; Chenwu Xu (2025). Natural variation in a cortex/epidermis-specific transcription factor bZIP89 determines lateral root development and drought resilience in maize [Dataset]. http://doi.org/10.5061/dryad.nzs7h451h
    Explore at:
    Dataset updated
    Feb 27, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Pengcheng Li; Tianze Zhu; Yunyun Wang; Xiaomin Zhang; Xiaoyi Yang; Shuai Fang; Wei Li; Wenye Rui; Aiqing Yang; Yamin Duan; Yuxing Yan; Qingchun Pan; Zhongtao Jia; Houmiao Wang; Zefeng Yang; Peng Yu; Chenwu Xu
    Description

    Lateral root (LR) branching is crucial for water and nutrient acquisition in plants, thus determining overall plant performance and productivity. However, the transcriptional regulation of LR development in crops and its role in enhancing stress resilience remain largely unexplored. Leveraging integrated transcriptome-wide association studies (TWAS) and single-cell RNA sequencing (scRNA-seq), we reported a basic leucine zipper (bZIP) transcription factor, ZmbZIP89, as an important regulator of LR elongation and mapped its spatial expression pattern in cortex/epidermis cell types. ZmbZIP89 activates the expression of ZmPRX47 to regulate root reactive oxygen species (ROS) homeostasis, contributing to increased lateral root length (LRL) and enhanced drought resistance. Natural variations in the 3′UTR of ZmbZIP89 enhance its expression by increasing mRNA stability, leading to improved LRL and drought tolerance. These findings contribute to our understanding of the mol...

    Natural variation in a cortex/epidermis-specific transcription factor bZIP89 determines lateral root development and drought resilience in maize

    https://doi.org/10.5061/dryad.nzs7h451h

    Description of the data and file structure

    1-RNAseq.sh: RNAseq for 357 lines

    2-TWAS.R: TWAS for LRL

    3.1-scRNAseq.R: scRNAseq for primary root

    3.2-Deconvolution.R: Deconvolution

    3.3-AuCell.R: AuCell for LRL genes

    4-DAPseq.sh: DAPseq for bZIP89

    5-WGS.sh: Call variants for 164 inbred lines

    6-xgboost8TF100iteration.py: Machine learning models for trait prediction

Jonathan Hsu; Allart Stoop (2023). Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset [Dataset]. http://doi.org/10.5281/zenodo.10011622

Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset

Explore at:
Available download formats: bin, txt
Dataset updated
Nov 20, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Jonathan Hsu; Allart Stoop
Description

Table of Contents

  1. Main Description
  2. File Descriptions
  3. Linked Files
  4. Installation and Instructions

1. Main Description

---------------------------

This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity". The code included in the file titled `marengo_code_for_paper_jan_2023.R` was used to generate the figures from the single-cell RNA sequencing data.

The following libraries are required for script execution:

  • Seurat
  • scRepertoire
  • ggplot2
  • stringr
  • dplyr
  • ggridges
  • ggrepel
  • ComplexHeatmap

2. File Descriptions

---------------------------

  • The code can be downloaded and opened in RStudio.
  • The "marengo_code_for_paper_jan_2023.R" file contains all the code needed to reproduce the figures in the paper.
  • The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113).
  • The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from DGE analysis, also used to create the heatmap with DGE and volcano plots.
  • The "genes_for_heatmap_fig5F.xlsx" contains the genes included in the heatmap in figure 5F.

3. Linked Files

---------------------

This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset contains raw FASTQ files as well as the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Descriptions of the linked datasets are provided below:

Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)

  • Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
  • Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
  • Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719

  • Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
  • Description: This submission contains the **raw sequencing** or `.fastq.gz` files, which are gzip-compressed text files.
  • Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)

  • Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.
  • Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This is a necessary file to use the code.
  • Submission type: Restricted Access. In order to gain access to the repository, you must contact the author.

4. Installation and Instructions

--------------------------------------

The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

> Ensure you have R version 4.1.2 or higher for compatibility.

> Although it is not essential, you can use RStudio (version 2022.12.0+353) for accessing and executing the code.

1. Download the "Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).

2. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.

3. Set your working directory to where the following files are located:

  • marengo_code_for_paper_jan_2023.R
  • Install_Packages.R
  • Marengo_newID_March242023.rds
  • genes_for_heatmap_fig5F.xlsx
  • all_res_deg_for_heat_updated_march2023.txt

You can use the following code to set the working directory in R:

> setwd(directory)

4. Open the file titled "Install_Packages.R" and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies in order to set up an environment where the code in "marengo_code_for_paper_jan_2023.R" can be executed.

5. Once the "Install_Packages.R" script has been successfully executed, restart RStudio or your IDE of choice.

6. Open the "marengo_code_for_paper_jan_2023.R" file in RStudio or your IDE of choice.

7. Execute the commands in "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice to generate the plots.
