100+ datasets found

Data from: Reference transcriptomics of porcine peripheral immune cells...
s.cnmilf.com
agdatacommons.nal.usda.gov
+3more
Updated Jun 5, 2025
Cite
Agricultural Research Service (2025). Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/data-from-reference-transcriptomics-of-porcine-peripheral-immune-cells-created-through-bul-e667c
Dataset updated
Jun 5, 2025
Dataset provided by
Agricultural Research Service
Description
This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) to provide a reference annotation of porcine immune cell transcriptomics at single-cell resolution. Analysis of the single-cell data identified 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. The files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for the original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464).

Resources in this dataset:

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format.
File Name: PBMC7_AllCells.zip
Resource Description: Zipped folder containing the PBMC counts matrix, gene names, and cell IDs. Files are as follows:
- matrix of gene counts* (matrix.mtx.gz)
- gene names (features.tsv.gz)
- cell IDs (barcodes.tsv.gz)
*The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, we specified to calculate non-integer count estimations, so most gene counts are actually non-integer values in this matrix but should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata.
File Name: PBMC7_AllCells_meta.csv
Resource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:
- nCount_RNA = the number of transcripts detected in a cell
- nFeature_RNA = the number of genes detected in a cell
- Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells
- prcntMito = percent mitochondrial reads in a cell
- Scrublet = doublet probability score assigned to a cell
- seurat_clusters = cluster ID assigned to a cell
- PaperIDs = sample ID for a cell
- celltypes = cell type ID assigned to a cell

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates.
File Name: PBMC7_AllCells_PCAcoord.csv
Resource Description: .csv file containing the first 100 PCA coordinates for cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates.
File Name: PBMC7_AllCells_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for all cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates.
File Name: PBMC7_AllCells_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for all cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates.
File Name: PBMC7_CD4only_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and the t-SNE coordinates used in the publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates.
File Name: PBMC7_CD4only_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and the UMAP coordinates used in the publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates.
File Name: PBMC7_GDonly_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and the UMAP coordinates used in the publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates.
File Name: PBMC7_GDonly_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and the t-SNE coordinates used in the publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information.
File Name: UnfilteredGeneInfo.txt
Resource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. The 'Name' column corresponds to the name assigned to a feature in the dataset.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat.
File Name: PBMC7.tar
Resource Description: .h5Seurat object of all cells in the PBMC dataset. The file needs to be untarred, then read into R using the function LoadH5Seurat().
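For readers who are not working in R, the 10X-format folder described above (matrix, gene names, cell IDs) can in principle be parsed with a few lines of Python. The following is a minimal sketch using only the standard library, not the authors' workflow. The function name `read_10x_mex` is hypothetical, and the sketch assumes the three file names listed above, a Matrix Market coordinate layout, and that the first tab-separated column of each .tsv file holds the identifier. Counts are kept as floats because the description states that most values are non-integer after ambient RNA removal.

```python
import gzip
from pathlib import Path


def read_10x_mex(folder):
    """Read a 10x-style folder into (genes, barcodes, counts).

    counts maps (gene_index, cell_index) -> float, with 0-based indices.
    Assumes the files matrix.mtx.gz, features.tsv.gz and barcodes.tsv.gz.
    """
    folder = Path(folder)

    def read_first_column(name):
        # Assumption: the identifier is the first tab-separated column.
        with gzip.open(folder / name, "rt") as handle:
            return [line.rstrip("\n").split("\t")[0] for line in handle if line.strip()]

    genes = read_first_column("features.tsv.gz")
    barcodes = read_first_column("barcodes.tsv.gz")

    counts = {}
    with gzip.open(folder / "matrix.mtx.gz", "rt") as handle:
        # Skip the "%%MatrixMarket" banner and any "%" comment lines.
        data_lines = (line for line in handle if line.strip() and not line.startswith("%"))
        # The first data line holds: number of rows, columns, stored entries.
        n_rows, n_cols, _n_entries = (int(x) for x in next(data_lines).split())
        if (n_rows, n_cols) != (len(genes), len(barcodes)):
            raise ValueError("matrix dimensions do not match features/barcodes")
        for line in data_lines:
            row, col, value = line.split()
            # Matrix Market indices are 1-based; values may be non-integer.
            counts[(int(row) - 1, int(col) - 1)] = float(value)
    return genes, barcodes, counts
```

A dictionary of non-zero entries keeps the sketch dependency-free; for a dataset of this size a sparse-matrix library would be the more practical choice.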
Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset
data.niaid.nih.gov
zenodo.org
Updated Nov 20, 2023
Cite
Stoop, Allart (2023). Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10011621
Dataset updated
Nov 20, 2023
Dataset provided by
Stoop, Allart
Hsu, Jonathan
Description
Table of Contents

1. Main Description
2. File Descriptions
3. Linked Files
4. Installation and Instructions

1. Main Description

This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.". The code included in the file titled marengo_code_for_paper_jan_2023.R was used to generate the figures from the single-cell RNA sequencing data. The following libraries are required for script execution:

- Seurat
- scRepertoire
- ggplot2
- stringr
- dplyr
- ggridges
- ggrepel
- ComplexHeatmap

2. File Descriptions

The code can be downloaded and opened in RStudio.

- "marengo_code_for_paper_jan_2023.R" contains all the code needed to reproduce the figures in the paper.
- "Marengo_newID_March242023.rds" is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113).
- "all_res_deg_for_heat_updated_march2023.txt" contains the unfiltered results from the DGE analysis, also used to create the heatmap with DGE and volcano plots.
- "genes_for_heatmap_fig5F.xlsx" contains the genes included in the heatmap in figure 5F.

3. Linked Files

This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset contains raw FASTQ files as well as the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Descriptions of the linked datasets are provided below:

Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment. Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data. Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment. Description: This submission contains the raw sequencing or .fastq.gz files, which are tab delimited text files. Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)

Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity. Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This file is required to use the code. Submission type: Restricted Access. In order to gain access to the repository, you must contact the author.

4. Installation and Instructions

The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

Ensure you have R version 4.1.2 or higher for compatibility.

Although it is not essential, you can use RStudio (Version 2022.12.0+353) to access and execute the code.

Download the "Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).

Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.

Set your working directory to where the following files are located:

- marengo_code_for_paper_jan_2023.R
- Install_Packages.R
- Marengo_newID_March242023.rds
- genes_for_heatmap_fig5F.xlsx
- all_res_deg_for_heat_updated_march2023.txt

You can use the following code to set the working directory in R:

setwd(directory)

Open the file titled "Install_Packages.R" and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies in order to set up an environment where the code in "marengo_code_for_paper_jan_2023.R" can be executed.

Once the "Install_Packages.R" script has been successfully executed, restart RStudio or your IDE of choice.

Open the file "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice.

Execute the commands in "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice to generate the plots.
single-cell RNA sequencing count matrix
figshare.com
application/gzip
Updated Jun 1, 2023
Cite
Weike Pei; Fuwei Shang; Xi Wang; Thorsten Feyerabend; Thomas Höfer; Hans-Reimer Rodewald (2023). single-cell RNA sequencing count matrix [Dataset]. http://doi.org/10.6084/m9.figshare.11842245.v1
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.11842245.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Weike Pei; Fuwei Shang; Xi Wang; Thorsten Feyerabend; Thomas Höfer; Hans-Reimer Rodewald
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These are the count matrix data from all four experiments. Experiment IDs and cell population names are indicated in the file names. To read the count matrix data, please follow the instructions at: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/output/matrices
pbmc single cell RNA-seq matrix
zenodo.org
csv
Updated May 4, 2021
Cite
Samuel Buchet; Francesco Carbone; Morgan Magnin; Mickaël Ménager; Olivier Roux (2021). pbmc single cell RNA-seq matrix [Dataset]. http://doi.org/10.5281/zenodo.4730807
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.4730807
Dataset updated
May 4, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Samuel Buchet; Francesco Carbone; Morgan Magnin; Mickaël Ménager; Olivier Roux
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Single-cell RNA-sequencing dataset of peripheral blood mononuclear cells (PBMCs: T cells, B cells, NK cells, and monocytes) extracted from two healthy donors.

Cells labeled as C26 come from a 30-year-old female and cells labeled as C27 come from a 53-year-old male. Cells were isolated from blood using Ficoll. Samples were sequenced using the standard 3' v3 chemistry protocols by 10x Genomics. Cellranger v4.0.0 was used for the processing, and reads were aligned to the Ensembl GRCh38 human genome (GRCg38_r98-ensembl_Sept2019). QC metrics were calculated on the count matrix generated by cellranger (filtered_feature_bc_matrix). Cells with fewer than 3 genes per cell, fewer than 500 reads per cell, or more than 20% mitochondrial genes were discarded.
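The filtering rule above can be sketched as a small per-cell check. This is an illustrative sketch in plain Python, not the authors' Seurat code. The function names `qc_metrics` and `passes_qc` are hypothetical, the "MT-" prefix for mitochondrial genes is an assumed human gene-naming convention, the per-cell total of the count matrix stands in for "reads per cell", and the mitochondrial percentage is computed as a share of that total, which is one possible reading of "20% of mitochondrial genes".

```python
def qc_metrics(gene_counts, mito_prefix="MT-"):
    """Return (n_genes, total_counts, pct_mito) for one cell.

    gene_counts maps gene name -> count for a single cell.
    mito_prefix is an assumption about how mitochondrial genes are named.
    """
    total = sum(gene_counts.values())
    n_genes = sum(1 for count in gene_counts.values() if count > 0)
    mito = sum(count for gene, count in gene_counts.items()
               if gene.startswith(mito_prefix))
    pct_mito = 100.0 * mito / total if total else 0.0
    return n_genes, total, pct_mito


def passes_qc(gene_counts, min_genes=3, min_counts=500, max_pct_mito=20.0):
    """Keep a cell unless it falls below/above the thresholds quoted above."""
    n_genes, total, pct_mito = qc_metrics(gene_counts)
    return n_genes >= min_genes and total >= min_counts and pct_mito <= max_pct_mito


# Usage: filter a dictionary of cells (cell ID -> gene counts).
cells = {
    "cell_ok": {"MT-CO1": 50, "CD3E": 400, "CD4": 300, "IL7R": 250},
    "cell_high_mito": {"MT-CO1": 300, "CD3E": 400, "CD4": 200, "IL7R": 100},
    "cell_low_depth": {"MT-CO1": 10, "CD3E": 200, "CD4": 100, "IL7R": 90},
}
kept = [cell_id for cell_id, counts in cells.items() if passes_qc(counts)]
```

The thresholds are exposed as parameters so the same check can be reused with other cut-offs.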

The processing steps were performed with the R package Seurat (https://satijalab.org/seurat/), including sample integration, data normalisation and scaling, dimensional reduction, and clustering. The SCTransform method was adopted for the normalisation and scaling steps. The clustered cells were manually annotated using known cell type markers.

Files content:

- raw_dataset.csv: raw gene counts

- normalized_dataset.csv: normalized gene counts (single cell matrix)

- cell_types.csv: cell types identified from annotated cell clusters

- cell_types_macro.csv: cell macro types

- UMAP_coordinates.csv: 2d cell coordinates computed with UMAP algorithm in Seurat
BD Rhapsody matrix of single-cell RNA sequencing (scRNA-seq)
figshare.com
zip
Updated Feb 11, 2024
Cite
Chunpeng He (2024). BD Rhapsody matrix of single-cell RNA sequencing (scRNA-seq) [Dataset]. http://doi.org/10.6084/m9.figshare.25202630.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.25202630.v1
Dataset updated
Feb 11, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
Chunpeng He
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
BD Rhapsody matrix of single-cell RNA sequencing (scRNA-seq)
Generating an expression matrix for droplet single-cell RNA-seq (dscRNA-seq)...
zenodo.org
application/gzip, bin
Updated Aug 4, 2022
Cite
Jonathan Manning; Wendi Bacon (2022). Generating an expression matrix for droplet single-cell RNA-seq (dscRNA-seq) data [Dataset]. http://doi.org/10.5281/zenodo.3661237
Available download formats: application/gzip, bin
Unique identifier
https://doi.org/10.5281/zenodo.3661237
Dataset updated
Aug 4, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Jonathan Manning; Wendi Bacon
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This tutorial is adapted from the 'Generating an expression matrix' training session at the EBI (https://www.ebi.ac.uk/training/events/2019/single-cell-rna-seq-analysis-questions-clusters).
DataSheet1_Missing Value Imputation With Low-Rank Matrix Completion in...
frontiersin.figshare.com
docx
Updated Jun 3, 2023
Cite
Meng Huang; Xiucai Ye; Hongmin Li; Tetsuya Sakurai (2023). DataSheet1_Missing Value Imputation With Low-Rank Matrix Completion in Single-Cell RNA-Seq Data by Considering Cell Heterogeneity.docx [Dataset]. http://doi.org/10.3389/fgene.2022.952649.s001
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fgene.2022.952649.s001
Dataset updated
Jun 3, 2023
Dataset provided by
Frontiers
Authors
Meng Huang; Xiucai Ye; Hongmin Li; Tetsuya Sakurai
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Single-cell RNA-sequencing (scRNA-seq) technologies enable the measurements of gene expressions in individual cells, which is helpful for exploring cancer heterogeneity and precision medicine. However, various technical noises lead to false zero values (missing gene expression values) in scRNA-seq data, termed as dropout events. These zero values complicate the analysis of cell patterns, which affects the high-precision analysis of intra-tumor heterogeneity. Recovering missing gene expression values is still a major obstacle in the scRNA-seq data analysis. In this study, taking the cell heterogeneity into consideration, we develop a novel method, called single cell Gauss–Newton Gene expression Imputation (scGNGI), to impute the scRNA-seq expression matrices by using a low-rank matrix completion. The obtained experimental results on the simulated datasets and real scRNA-seq datasets show that scGNGI can more effectively impute the missing values for scRNA-seq gene expression and improve the down-stream analysis compared to other state-of-the-art methods. Moreover, we show that the proposed method can better preserve gene expression variability among cells. Overall, this study helps explore the complex biological system and precision medicine in scRNA-seq data.
scRNA-seq data
figshare.com
zip
Updated Nov 11, 2023
Cite
Daniel Leite (2023). scRNA-seq data [Dataset]. http://doi.org/10.6084/m9.figshare.23659953.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.23659953.v1
Dataset updated
Nov 11, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Daniel Leite
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
scRNA-seq data for "An atlas of spider development at single-cell resolution provides new insights into arthropod embryogenesis"
Data from: Large-scale integration of single-cell transcriptomic data...
data.niaid.nih.gov
dataone.org
+1more
zip
Updated Dec 14, 2021
Cite
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2021). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.t4b8gtj34
Dataset updated
Dec 14, 2021
Dataset provided by
Cornell University
Authors
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove
License
https://spdx.org/licenses/CC0-1.0.html
Description
Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. 
We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

Methods

Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (Mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute on Aging (NIA) Rodent Aging Colony and were used at 20 months of age. For new scRNAseq experiments, female mice were used in each experiment.

Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).

Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 10^6 cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced on a NextSeq 500 (Illumina; San Diego, CA) using 75 cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.

Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL/6J mice were injected with 10 µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using a Zeiss PALM MicroBeam laser capture microdissection system, and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150 cycle high output kits (Read 1=28bp, Read 2=120bp, Index 1=10bp, and Index 2=10bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loupe Browser v4.0.0 software (10x Genomics). Sequencing data were then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline to generate a feature-by-spot-barcode expression matrix (10x Genomics).

Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).

Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCounts and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset, individually. BCmvn optimization was used for PK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default PN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch correction with three tools, independently: Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.

Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.
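One way to read "dimensions accounting for 95% of the total variance" is to take the smallest number of leading dimensions whose cumulative variance reaches 95% of the variance across all computed dimensions. The sketch below illustrates that reading in plain Python; it is an assumption about the selection rule, not the authors' code. The function name `n_dims_for_variance` is hypothetical, and the input is assumed to be the per-dimension standard deviations of a reduction (for example, those reported for a PCA).

```python
def n_dims_for_variance(stdevs, threshold=0.95):
    """Smallest number of leading dimensions whose cumulative variance
    reaches `threshold` of the total variance of the supplied dimensions.

    stdevs: per-dimension standard deviations, ordered from the first
    dimension onwards. Returns 0 for an empty input.
    """
    variances = [s * s for s in stdevs]
    total = sum(variances)
    cumulative = 0.0
    for n_dims, variance in enumerate(variances, start=1):
        cumulative += variance
        if cumulative / total >= threshold:
            return n_dims
    # Only reached for empty input or floating-point shortfall.
    return len(variances)


# Usage with made-up standard deviations for four dimensions.
example_stdevs = [3.0, 2.0, 1.0, 0.5]
n_dims = n_dims_for_variance(example_stdevs)
```

Note that the "total" here is the variance of the dimensions that were computed, which is smaller than the variance of the full expression matrix when only a subset of components is retained.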

Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).

Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myocytes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using
Colorectal cancer scRNA-seq 10xG-format data matrix
data.niaid.nih.gov
datadryad.org
zip
Updated Dec 9, 2020
Cite
Christopher Lausted; Raymond Yeung; Xiaowei Yan; Neda Jabbari; Qiang Tian; Heidi Kenerson; Changting Meng; Dani Bergey; Venu Pillarisetty; Kevin Sullivan; Priyanka Baloni; Leroy Hood (2020). Colorectal cancer scRNA-seq 10xG-format data matrix [Dataset]. http://doi.org/10.5061/dryad.pvmcvdngt
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.pvmcvdngt
Dataset updated
Dec 9, 2020
Dataset provided by
Institute for Systems Biology (https://isbscience.org/)
University of Washington
Authors
Christopher Lausted; Raymond Yeung; Xiaowei Yan; Neda Jabbari; Qiang Tian; Heidi Kenerson; Changting Meng; Dani Bergey; Venu Pillarisetty; Kevin Sullivan; Priyanka Baloni; Leroy Hood
License
https://spdx.org/licenses/CC0-1.0.html
Description
Metastatic colorectal cancer (CRC) is a major cause of cancer-related death and incidence is rising in the younger population (<50 years). Current chemotherapies can achieve response rates above 50%, but immunotherapies have limited value for patients with microsatellite-stable (MSS) cancers. The present study investigates the impact of chemotherapy on the tumor immune microenvironment. We treat human liver metastases slices with 5-Fluorouracil (5FU) plus either irinotecan or oxaliplatin, then perform single-cell transcriptome analyses. Results from eight cases reveal two cellular subtypes with divergent responses to chemotherapy. Susceptible tumors are characterized by a stemness signature, an activated interferon pathway, and suppression of PD-1 ligands in response to 5FU+irinotecan. Conversely, immune checkpoint TIM-3 ligands are maintained or up-regulated by chemotherapy in CRC with an enterocyte-like signature, and combining chemotherapy with TIM-3 blockade leads to synergistic tumor killing. Together, our analyses highlight chemo-modulation of the immune microenvironment and provide a framework for combined chemo-immunotherapies.

Methods

Following surgical resection of CRLM specimens greater than 2 cm in diameter, sterile 6 mm tumor tissue cores were punch biopsied and immediately placed in BELZER-UW solution on ice. Within hours, cores were cut into 250 μm thick slices by vibratome and placed with media onto Millicell Cell Culture Inserts in a 24-well cell culture plate. Tumor slices were treated with 1 µg/ml 5-Fluorouracil in combination with 1 µg/ml oxaliplatin in the FOLFOX group or 1 µg/ml 5-Fluorouracil in combination with 2 µg/ml irinotecan in the FOLFIRI group. The control group consisted of slices treated with 0.2% DMSO in medium.

Tumor slices were dissociated using the MACS Tumor Dissociation Kit according to the Miltenyi Biotec “dissociation of soft tumors” protocol. Cells were processed on the 10x Genomics Chromium using the single-cell 3' RNA version 2 protocol. RNA-seq libraries were sequenced on the NextSeq500 instrument for 150 cycles (26 bp for Read 1 and 124 bp for Read 2). Reads were aligned to the human genome (GRCh38) and quantified using Cell Ranger version 2.0 with default settings. All gene expression data were saved to a single Cell Ranger-like Market Exchange Format (MEX) sparse data matrix.
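The MEX layout mentioned above stores the sparse matrix in Matrix Market coordinate format: a header line, optional `%` comment lines, one size line (rows, columns, non-zero entries), and then one `row column value` triple per non-zero entry with 1-based indices. The following is a minimal sketch of a reader that uses only the Python standard library. The function name `read_mex_triplets` and the toy input are illustrative and not part of this dataset; which axis holds genes and which holds cells is not stated in the description and should be checked against the accompanying label files.

```python
import io


def read_mex_triplets(handle):
    """Parse a Matrix Market coordinate file into (shape, entries).

    `entries` maps (row, col) -> value using 0-based indices.
    """
    header = handle.readline()
    if not header.startswith("%%MatrixMarket matrix coordinate"):
        raise ValueError("not a Matrix Market coordinate file")

    # Skip comment lines, which start with '%'.
    line = handle.readline()
    while line.startswith("%"):
        line = handle.readline()

    # The first non-comment line gives the matrix dimensions.
    n_rows, n_cols, n_entries = (int(x) for x in line.split())

    entries = {}
    for line in handle:
        if not line.strip():
            continue
        i, j, value = line.split()
        # Convert 1-based file indices to 0-based Python indices.
        entries[(int(i) - 1, int(j) - 1)] = float(value)

    if len(entries) != n_entries:
        raise ValueError("entry count does not match the size line")
    return (n_rows, n_cols), entries


# Toy 3 x 2 matrix with three non-zero counts (hypothetical data).
toy = io.StringIO(
    "%%MatrixMarket matrix coordinate integer general\n"
    "%\n"
    "3 2 3\n"
    "1 1 5\n"
    "3 1 2\n"
    "2 2 7\n"
)
shape, entries = read_mex_triplets(toy)
```

For real files, an established reader such as `scipy.io.mmread` is likely the better choice; the sketch is only meant to make the file layout concrete.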
Multiple Single Cell RNA Expressions ARCHS4
kaggle.com
zip
Updated Jun 26, 2021
Alexander (2021). Multiple Single Cell RNA Expressions ARCHS4 [Dataset]. https://www.kaggle.com/alexandervc/multiple-single-cell-rna-expressions-archs4
Explore at:
Available download formats: zip (23088130184 bytes)
Dataset updated
Jun 26, 2021
Authors
Alexander
Description
Context

The dataset was downloaded from https://amp.pharm.mssm.edu/archs4/download.html. The methods are described in a Nature Communications paper: https://www.nature.com/articles/s41467-018-03751-6

ARCHS4 provides user-friendly access to gene expression data from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). Most data in GEO is stored in raw formats, whereas ARCHS4 provides prepared count-matrix expression data. GEO stores data separately for each research paper, whereas ARCHS4 collects all of it in a single matrix. See the main site for further information.

The main data files are in H5 (HDF5, Hierarchical Data Format) file format: https://en.wikipedia.org/wiki/Hierarchical_Data_Format. They contain expression data as well as annotation data and further meta-information. There are also several auxiliary files, such as t-SNE 3D projections (in CSV format) and gene correlation matrices for human and mouse in feather format.

Content

The main file (for human), human_matrix.h5, contains the data matrix (238522 samples by 35238 genes) as well as various meta-information: gene names, sample information (tissue, etc.), and references to the GEO database IDs where all details can be found.

Similar data is provided for mouse, along with CSV files with t-SNE projections and correlation matrices for genes.
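The matrix dimensions quoted above give a rough sense of why the file is usually read in slices instead of being loaded whole. The arithmetic below is a sketch under one loud assumption: 4 bytes per stored value, which is chosen for illustration and is not taken from the ARCHS4 documentation.

```python
# Dimensions quoted in the description above.
n_samples = 238_522
n_genes = 35_238

# Assumption for illustration only: 4 bytes per value (e.g. 32-bit integers).
bytes_per_value = 4

n_values = n_samples * n_genes
dense_bytes = n_values * bytes_per_value
dense_gib = dense_bytes / 2**30

print(f"{n_values:,} values, about {dense_gib:.1f} GiB if held densely in memory")
```

Under that assumption the uncompressed matrix would not fit in the memory of a typical workstation, so reading one block of samples or genes at a time is the safer pattern.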

Acknowledgements

The ARCHS4 project is by:

'Alexander Lachmann', 'alexander.lachmann@mssm.edu', update: '2020-02-06'
Single-cell RNA sequencing data on primary samples from: Aberrant expression...
figshare.scilifelab.se
researchdata.se
+1more
hdf
Updated Jul 1, 2025
Carl Sandén; Henrik Lilljebjörn; Thoas Fioretos (2025). Single-cell RNA sequencing data on primary samples from: Aberrant expression of SLAMF6 constitutes a targetable immune escape mechanism in acute myeloid leukemia [Dataset]. http://doi.org/10.17044/scilifelab.28263911.v2
Explore at:
Available download formats: hdf
Unique identifier
https://doi.org/10.17044/scilifelab.28263911.v2
Dataset updated
Jul 1, 2025
Dataset provided by
Lund University
Authors
Carl Sandén; Henrik Lilljebjörn; Thoas Fioretos
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
This dataset includes single-cell RNA sequencing (scRNA-seq) data from primary acute myeloid leukemia (AML) samples. Libraries were produced using the 10x Genomics Chromium Single Cell 3ʹ Reagent Kits v3 and sequenced on an Illumina NovaSeq 6000 system. The dataset is available as raw sequencing reads (fastq; restricted access) or as an annotated matrix of scRNA count data (h5ad).
Quantifying batch effects for individual genes in single-cell data
zenodo.org
bin, zip
Updated May 14, 2025
Yang Zhou; Qiongyu Sheng; Guohua Wang; Li Xu; Shuilin Jin (2025). Quantifying batch effects for individual genes in single-cell data [Dataset]. http://doi.org/10.5281/zenodo.13358933
Explore at:
Available download formats: zip, bin
Unique identifier
https://doi.org/10.5281/zenodo.13358933
Dataset updated
May 14, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Yang Zhou; Qiongyu Sheng; Guohua Wang; Li Xu; Shuilin Jin
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Aug 22, 2024
Description
The normalized datasets used in the GTE manuscript, together with their data sources. Because several datasets are too large for the whole gene set to be uploaded, we upload only the data matrix of highly variable genes (HVGs). The code is available at https://github.com/yzhou1999/GTEs.
Single-cell Spatial Transcriptomics Data with Paired RNAseq for TISSUE...
data.niaid.nih.gov
zenodo.org
Updated Jan 8, 2024
Sun, Eric (2024). Single-cell Spatial Transcriptomics Data with Paired RNAseq for TISSUE spatial gene expression prediction [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8259941
Explore at:
Dataset updated
Jan 8, 2024
Dataset authored and provided by
Sun, Eric
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset folders from "TISSUE: uncertainty-calibrated prediction of single-cell spatial transcriptomics improves downstream analyses". If using the processed data or TISSUE algorithm, please cite: https://doi.org/10.1101/2023.04.25.538326.

The dataset directories are compressed in tar gzip format. The top level contains one folder per dataset, named after the dataset, and each folder holds the relevant data files, which include:

Spatial_count.txt --- a tab-delimited file containing spatial transcriptomics counts matrix

scRNA_count.txt --- a tab-delimited file containing RNAseq counts matrix

Locations.txt --- a tab-delimited file containing the (x,y) spatial coordinates of cells in the spatial transcriptomics data

Metadata.txt --- for some datasets, this is a comma-separated file containing the metadata table for the spatial transcriptomics data

These files are formatted and organized to be read into AnnData objects using the native loading functions in the TISSUE package (https://github.com/sunericd/TISSUE). Some folders will also have additional accessory files such as gene lists corresponding to some experiments present in our manuscript and/or adjacency matrix objects.
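The tab-delimited files listed above can also be inspected without the TISSUE package. The sketch below uses only the Python standard library and is not the TISSUE loader. It assumes, without confirmation from the description, that each file has a header row, that each later row is one cell, and that rows appear in the same order in the counts and locations files; the miniature inputs are hypothetical.

```python
import csv
import io


def read_tab_table(handle):
    """Read a tab-delimited numeric table with a header row into (header, rows)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = [[float(value) for value in row] for row in reader if row]
    return header, rows


# Hypothetical miniature stand-ins for Spatial_count.txt and Locations.txt.
spatial_txt = io.StringIO("GeneA\tGeneB\n3\t0\n1\t4\n")
locations_txt = io.StringIO("x\ty\n10.5\t20.0\n11.0\t22.5\n")

genes, counts = read_tab_table(spatial_txt)
_, coords = read_tab_table(locations_txt)

# One row of counts per cell should pair with one (x, y) coordinate.
if len(counts) != len(coords):
    raise ValueError("counts and locations disagree on the number of cells")

cells = [
    {"x": x, "y": y, "counts": dict(zip(genes, row))}
    for (x, y), row in zip(coords, counts)
]
```

With real data, the same check on row counts is a quick way to catch a mismatched pair of files before any spatial analysis.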

Also included are the two simulated spatial transcriptomics datasets that we generated using SRTsim.

The SVZ folders contain our processed MERFISH spatial transcriptomics dataset on the adult mouse subventricular zone. Refer to the SVZFullFinal folder for the full dataset with TISSUE-informed cell labels. All other folders are processed data accessed from publicly available sources. The identity of numbered folders can be found in the Data Availability statement of the benchmarking paper from which they were retrieved: https://doi.org/10.1038/s41592-022-01480-9

"svz_merfish_data.zip" includes the raw MERFISH dataset on the adult mouse subventricular zone.
Meta data for single cells
figshare.com
bin
Updated May 7, 2024
Andrew Leduc (2024). Meta data for single cells [Dataset]. http://doi.org/10.6084/m9.figshare.25282663.v1
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.25282663.v1
Dataset updated
May 7, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
Andrew Leduc
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Information such as label, run, diameter, etc. The ID column corresponds to the columns of the single-cell data matrix.
SCANPY Python package for scRNA-seq analysis
kaggle.com
Updated Feb 5, 2022
Alexander Chervov (2022). SCANPY Python package for scRNA-seq analysis [Dataset]. https://www.kaggle.com/datasets/alexandervc/scanpy-python-package-for-scrnaseq-analysis/discussion
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 5, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Alexander Chervov
Description
Remark: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev (Scanpy is not always reliable for cell cycle analysis ).

https://scanpy.readthedocs.io/en/stable/

Scanpy – Single-Cell Analysis in Python

Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. It includes preprocessing, visualization, clustering, trajectory inference and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells.

Single-cell RNA sequencing data take the form of count matrices: rows correspond to cells, columns to genes, and each value shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

SCANPY is a scalable toolkit for analyzing single-cell gene expression data. It includes methods for preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks. Its Python-based implementation efficiently deals with data sets of more than one million cells (https://github.com/theislab/Scanpy). Along with SCANPY, we present ANNDATA, a generic class for handling annotated data matrices (https://github.com/theislab/anndata).

Paper:

Wolf, F., Angerer, P. & Theis, F. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol 19, 15 (2018). https://doi.org/10.1186/s13059-017-1382-0 https://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1382-0

Inspiration

Single-cell RNA sequencing is an important technology in modern biology; see, e.g., "Eleven grand challenges in single-cell data science": https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6 Also see the review by P. Kharchenko in Nature Methods, "The triumphs and limitations of computational methods for scRNA-seq": https://www.nature.com/articles/s41592-021-01171-x
Repository for the single cell RNA sequencing data analysis for the human...
explore.openaire.eu
Updated Aug 26, 2023
Jonathan; Andrew; Pierre; Allart; Adrian (2023). Repository for the single cell RNA sequencing data analysis for the human manuscript. [Dataset]. http://doi.org/10.5281/zenodo.8286134
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.8286134
Dataset updated
Aug 26, 2023
Authors
Jonathan; Andrew; Pierre; Allart; Adrian
Description
This is the GitHub repository for the single-cell RNA sequencing data analysis for the human manuscript. The following essential libraries are required for script execution:

- Seurat
- scRepertoire
- ggplot2
- dplyr
- ggridges
- ggrepel
- ComplexHeatmap

Linked Files
--------------------------------------
This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset contains raw FASTQ files as well as the aligned files that were deposited in GEO. Descriptions of the linked datasets are provided below:

1. Gene Expression Omnibus (GEO) ID: GSE229626
   - Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR)
   - Description: This submission contains the matrix.mtx, barcodes.tsv, and genes.tsv files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
   - Submission type: Private. In order to gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).
2. Sequence Read Archive (SRA) repository
   - Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR)
   - Description: This submission contains the "raw sequencing" or .fastq.gz files, which are tab delimited text files.
   - Submission type: Private. In order to gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Please note that since the GSE submission is private, the raw data deposited at SRA may not be accessible until the embargo on GSE229626 has been lifted.

Installation and Instructions
--------------------------------------
The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

> Ensure you have R version 4.1.2 or higher for compatibility.
> Although it is not essential, you can use RStudio (Version 2022.12.0+353) for accessing and executing the code.

The following code can be used to set the working directory in R:

> setwd(directory)

Steps:

1. Download the "Human_code_April2023.R" and "Install_Packages.R" R scripts, and the processed data from GSE229626.
2. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.
3. Set your working directory to where the following files are located:
   - Human_code_April2023.R
   - Install_Packages.R
4. Open the file titled Install_Packages.R and execute it in the R IDE. This script will attempt to install all the necessary packages and their dependencies.
5. Open the Human_code_April2023.R R script and execute commands as necessary.

Single-cell RNA-Seq and TCR-Seq analysis of PD-1+ CD8+ T-cells responding to...

zenodo.org

bin, csv, zip

Updated Oct 24, 2024

Bertram Bengsch; Sagar; Zhen Zhang (2024). Single-cell RNA-Seq and TCR-Seq analysis of PD-1+ CD8+ T-cells responding to anti-PD-1 and anti-PD-1/CTLA-4 immunotherapy in melanoma [Dataset]. http://doi.org/10.5281/zenodo.13971562

Explore at:

Available download formats: bin, csv, zip

Unique identifier

https://doi.org/10.5281/zenodo.13971562

Dataset updated

Oct 24, 2024

Dataset provided by

Zenodo

Authors

Bertram Bengsch; Sagar; Zhen Zhang

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This dataset details the scRNASeq and TCR-Seq analysis of sorted PD-1+ CD8+ T cells from patients with melanoma treated with checkpoint therapy (anti-PD-1 monotherapy and anti-PD-1 & anti-CTLA-4 combination therapy) at baseline and after the first cycle of therapy. A major publication using this dataset is accessible here: (reference)

*experimental design

Single-cell RNA sequencing was performed using 10x Genomics with feature barcoding technology to multiplex cell samples from different patients undergoing mono or dual therapy, so that they could be loaded on one well to reduce costs and minimize technical variability. Hashtag oligomers (oligos) were obtained purified and already oligo-conjugated in TotalSeq-C format from BioLegend. Cells were thawed and counted, and 20 million cells per patient and time point were used for staining. Cells were stained with barcoded antibodies together with a staining solution containing antibodies against CD3, CD4, CD8, PD-1/IgG4 and fixable viability dye (eBioscience) prior to FACS sorting. Barcoded antibodies were used at 0.5 µg per million cells, as recommended by the manufacturer (BioLegend) for flow cytometry applications. After staining, cells were washed twice in PBS containing 2% BSA and 0.01% Tween 20, followed by centrifugation (300 × g for 5 min at 4 °C) and supernatant exchange. After the final wash, cells were resuspended in PBS, filtered through 40 µm cell strainers, and then sorted. Sorted cells were counted and approximately 75,000 cells were processed through the 10x Genomics single-cell V(D)J workflow according to the manufacturer’s instructions. Gene expression, hashing and TCR libraries were pooled to desired quantities to obtain sequencing depths of 15,000 reads per cell for gene expression libraries and 5,000 reads per cell for hashing and TCR libraries. Libraries were sequenced on a NovaSeq 6000 flow cell in a 2X100 paired-end format.

*extract protocol

PBMCs were thawed and counted, and 20 million cells per patient and time point were used for staining. Cells were stained with barcoded antibodies together with a staining solution containing antibodies against CD3, CD4, CD8, PD-1/IgG4 and fixable viability dye (eBioscience) prior to FACS sorting. Barcoded antibodies were used at 0.5 µg per million cells, as recommended by the manufacturer (BioLegend) for flow cytometry applications. After staining, cells were washed twice in PBS containing 2% BSA and 0.01% Tween 20, followed by centrifugation (300 × g for 5 min at 4 °C) and supernatant exchange. After the final wash, cells were resuspended in PBS, filtered through 40 µm cell strainers, and then sorted. Sorted cells were counted and approximately 75,000 cells were processed through the 10x Genomics single-cell V(D)J workflow according to the manufacturer’s instructions.

*library construction protocol

Sorted cells were counted and approximately 75,000 cells were processed through 10x Genomics single-cell V(D)J workflow according to the manufacturer’s instructions. Gene expression, hashing and TCR libraries were pooled to desired quantities to obtain the sequencing depths of 15,000 reads per cell for gene expression libraries and 5,000 reads per cell for hashing and TCR libraries. Libraries were sequenced on a NovaSeq 6000 flow cell in a 2X100 paired-end format.

*library strategy

scRNA-seq and scTCR-seq

*data processing step

Pre-processing of sequencing results to generate count matrices (gene expression and HTO barcode counts) was performed using the 10x Genomics Cell Ranger pipeline.

Further processing was done with Seurat (cell and gene filtering, hashtag identification, clustering, differential gene expression analysis based on gene expression).

*genome build/assembly

Alignment was performed using prebuilt Cell Ranger human reference GRCh38.

*processed data files format and content

RNA counts and HTO counts are in sparse matrix format and TCR clonotypes are in csv format.

Datasets were merged and analyzed by Seurat and the analyzed objects are in rds format.

file name	file checksum
PD1CD8_160421_filtered_feature_bc_matrix.zip	da2e006d2b39485fd8cf8701742c6d77
PD1CD8_190421_filtered_feature_bc_matrix.zip	e125fc5031899bba71e1171888d78205
PD1CD8_160421_filtered_contig_annotations.csv	927241805d507204fbe9ef7045d0ccf4
PD1CD8_190421_filtered_contig_annotations.csv	8ca544d27f06e66592b567d3ab86551e
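
The checksums in the table above are 32 hexadecimal characters long, which is consistent with MD5, although the description does not name the algorithm. Under that assumption, a downloaded file could be checked as sketched below; the helper names `md5_of_file` and `verify` are illustrative and not part of the dataset.

```python
import hashlib

# Checksums copied from the table above (assumed to be MD5 digests).
EXPECTED = {
    "PD1CD8_160421_filtered_feature_bc_matrix.zip": "da2e006d2b39485fd8cf8701742c6d77",
    "PD1CD8_190421_filtered_feature_bc_matrix.zip": "e125fc5031899bba71e1171888d78205",
    "PD1CD8_160421_filtered_contig_annotations.csv": "927241805d507204fbe9ef7045d0ccf4",
    "PD1CD8_190421_filtered_contig_annotations.csv": "8ca544d27f06e66592b567d3ab86551e",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, name):
    """Compare the file at `path` with the checksum listed for `name`."""
    return md5_of_file(path) == EXPECTED[name]
```

A call such as `verify("PD1CD8_160421_filtered_feature_bc_matrix.zip", "PD1CD8_160421_filtered_feature_bc_matrix.zip")` would return `True` only if the local copy matches the listed digest.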

*processed data file	antibodies/tags
PD1CD8_160421_filtered_feature_bc_matrix.zip	none
PD1CD8_160421_filtered_feature_bc_matrix.zip	TotalSeq™-C0251 anti-human Hashtag 1 Antibody - (HASH_1) - M1_base_monotherapy; TotalSeq™-C0252 anti-human Hashtag 2 Antibody - (HASH_2) - M1_post_monotherapy; TotalSeq™-C0253 anti-human Hashtag 3 Antibody - (HASH_3) - C1_base_combined_therapy; TotalSeq™-C0254 anti-human Hashtag 4 Antibody - (HASH_4) - C1_post_combined_therapy; TotalSeq™-C0255 anti-human Hashtag 5 Antibody - (HASH_5) - C2_base_combined_therapy; TotalSeq™-C0256 anti-human Hashtag 6 Antibody - (HASH_6) - C2_post_combined_therapy
PD1CD8_160421_filtered_contig_annotations.csv	none
PD1CD8_190421_filtered_feature_bc_matrix.zip	none
PD1CD8_190421_filtered_feature_bc_matrix.zip	TotalSeq™-C0251 anti-human Hashtag 1 Antibody - (HASH_1) - M2_base_monotherapy; TotalSeq™-C0252 anti-human Hashtag 2 Antibody - (HASH_2) - M2_post_monotherapy; TotalSeq™-C0253 anti-human Hashtag 3 Antibody - (HASH_3) - M3_base_monotherapy; TotalSeq™-C0254 anti-human Hashtag 4 Antibody - (HASH_4) - M3_post_monotherapy; TotalSeq™-C0255 anti-human Hashtag 5 Antibody - (HASH_5) - C3_base_combined_therapy; TotalSeq™-C0256 anti-human Hashtag 6 Antibody - (HASH_6) - C3_post_combined_therapy
PD1CD8_190421_filtered_contig_annotations.csv	none
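
The hashtag assignments in the table above can be kept as a small lookup when demultiplexing cells by hashtag. The sketch below transcribes the table into a Python dictionary keyed by the run identifier found in each file name. Reading the labels as patient, time point (`base` for baseline, `post` for after the first cycle), and therapy is an interpretation of the description, and `describe_sample` is a hypothetical helper.

```python
# Hashtag-to-sample assignments transcribed from the table above,
# keyed by the run identifier embedded in each file name.
HASHTAG_SAMPLES = {
    "160421": {
        "HASH_1": "M1_base_monotherapy",
        "HASH_2": "M1_post_monotherapy",
        "HASH_3": "C1_base_combined_therapy",
        "HASH_4": "C1_post_combined_therapy",
        "HASH_5": "C2_base_combined_therapy",
        "HASH_6": "C2_post_combined_therapy",
    },
    "190421": {
        "HASH_1": "M2_base_monotherapy",
        "HASH_2": "M2_post_monotherapy",
        "HASH_3": "M3_base_monotherapy",
        "HASH_4": "M3_post_monotherapy",
        "HASH_5": "C3_base_combined_therapy",
        "HASH_6": "C3_post_combined_therapy",
    },
}


def describe_sample(run, hashtag):
    """Split a sample label into patient, time point, and therapy."""
    label = HASHTAG_SAMPLES[run][hashtag]
    # Split on the first two underscores only, so "combined_therapy" stays whole.
    patient, timepoint, therapy = label.split("_", 2)
    return {"patient": patient, "timepoint": timepoint, "therapy": therapy}
```

Note that the same hashtag maps to different patients in the two runs, so the run identifier has to be carried along with every cell.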

Data from: Single-cell RNA-Seq of human primary lung and bronchial...
data.mendeley.com
figshare.com
Updated Mar 13, 2020
Soeren Lukassen (2020). Single-cell RNA-Seq of human primary lung and bronchial epithelium cells [Dataset]. http://doi.org/10.17632/7r2cwbw44m.1
Explore at:
Unique identifier
https://doi.org/10.17632/7r2cwbw44m.1
Dataset updated
Mar 13, 2020
Authors
Soeren Lukassen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains count matrices and per-cell metadata tables for RNA sequencing of 39778 single nuclei from healthy primary lung samples of 12 lung adenocarcinoma patients, as well as 17451 single human bronchial epithelial cells from 4 donors. All samples were processed using the 10x Genomics Chromium platform with v2 chemistry and sequenced with one sample per lane on an Illumina HiSeq4000. Reads were aligned to the hg19 reference genome version 1.2.0 obtained from 10x Genomics. Data processing was performed using Seurat3. The metadata table includes patient ID, sex, age, smoking status, and cell type, as well as QC statistics (number of genes, number of cells, ratio of mitochondrial reads).
Processed Chromium Single Cell GEX, CSP and VDJ data from intestinal plasma...
ega-archive.org
Updated Apr 18, 2024
(2024). Processed Chromium Single Cell GEX, CSP and VDJ data from intestinal plasma cells of untreated celiac disease patients [Dataset]. https://ega-archive.org/datasets/EGAD50000000339
Explore at:
Dataset updated
Apr 18, 2024
License
https://ega-archive.org/dacs/EGAC50000000162
Description
The dataset contains processed sequencing data from Chromium Single Cell 5’ gene expression, human B cell VDJ and feature barcode (CSP) sequencing from transglutaminase 2-specific and other small intestinal plasma cells isolated from four untreated celiac disease patients. The raw sequencing data has been processed with Cell Ranger v.6.0.2 with the multi and aggr functions using the pre-built Cell Ranger references GRCh38 version 2020-A for gene expression and GRCh38-alts-ensembl-5.0.0 for V(D)J analysis. The dataset consists of a gene expression and antibody capture expression matrix (cell barcodes and feature names in tsv.gz file, expression matrix in mtx.gz file) and VDJ sequences in AIRR format (csv file). A metadata file (csv file) details cells passing our custom quality control based on number of detected genes, UMIs, mitochondrial genes, immunoglobulin genes and a productively rearranged immunoglobulin heavy chain of the IgA isotype.
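The gzip-compressed, tab-separated label files described above (cell barcodes and feature names) can be read with the Python standard library. The sketch below assumes the three-column feature layout (ID, name, feature type) that recent Cell Ranger versions write, with `Gene Expression` and `Antibody Capture` as feature types; the barcodes and feature IDs in the toy data are hypothetical, and `read_tsv_gz` is an illustrative helper.

```python
import csv
import gzip
import io


def read_tsv_gz(source):
    """Return the rows of a gzip-compressed, tab-separated file.

    `source` may be a file path or a binary file-like object.
    """
    with gzip.open(source, mode="rt", newline="") as handle:
        return [row for row in csv.reader(handle, delimiter="\t") if row]


# Hypothetical in-memory stand-ins for barcodes.tsv.gz and features.tsv.gz.
toy_barcodes = io.BytesIO(
    gzip.compress(b"AAACCTGAGAAACCAT-1\nAAACCTGAGAAACCGC-1\n")
)
toy_features = io.BytesIO(
    gzip.compress(
        b"ENSG00000000001\tGENE1\tGene Expression\n"
        b"AB0001\tTag1\tAntibody Capture\n"
    )
)

barcodes = [row[0] for row in read_tsv_gz(toy_barcodes)]
features = read_tsv_gz(toy_features)

# Group feature names by type, separating gene expression from antibody capture.
by_type = {}
for feature_id, name, feature_type in features:
    by_type.setdefault(feature_type, []).append(name)
```

Grouping features by type first makes it straightforward to split the combined matrix into its gene expression and antibody capture parts afterwards.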

Agricultural Research Service (2025). Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/data-from-reference-transcriptomics-of-porcine-peripheral-immune-cells-created-through-bul-e667c

Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing

Explore at:

Dataset updated

Jun 5, 2025

Dataset provided by

Agricultural Research Service

Description

This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464).

Resources in this dataset:

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format.
File Name: PBMC7_AllCells.zip
Resource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows:
- matrix of gene counts* (matrix.mtx.gz)
- gene names (features.tsv.gz)
- cell IDs (barcodes.tsv.gz)
*The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, we specified to calculate non-integer count estimations, so most gene counts are actually non-integer values in this matrix but should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata.
File Name: PBMC7_AllCells_meta.csv
Resource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:
- nCount_RNA = the number of transcripts detected in a cell
- nFeature_RNA = the number of genes detected in a cell
- Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells
- prcntMito = percent mitochondrial reads in a cell
- Scrublet = doublet probability score assigned to a cell
- seurat_clusters = cluster ID assigned to a cell
- PaperIDs = sample ID for a cell
- celltypes = cell type ID assigned to a cell

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates.
File Name: PBMC7_AllCells_PCAcoord.csv
Resource Description: .csv file containing first 100 PCA coordinates for cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates.
File Name: PBMC7_AllCells_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for all cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates.
File Name: PBMC7_AllCells_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for all cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates.
File Name: PBMC7_CD4only_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates.
File Name: PBMC7_CD4only_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates.
File Name: PBMC7_GDonly_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates.
File Name: PBMC7_GDonly_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information.
File Name: UnfilteredGeneInfo.txt
Resource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat.
File Name: PBMC7.tar
Resource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().
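
The description above notes that subsets such as the CD4 T cells (clusters 0, 3, 4, 28) can be re-created from the full object. The authors describe doing this in R from the .h5Seurat file; as an alternative, hedged sketch, the same selection can be made from the metadata table with the Python standard library. The column names `Loupe` and `seurat_clusters` come from the description, while the toy rows and the helper `select_cells` are hypothetical, and the real PBMC7_AllCells_meta.csv may contain additional or differently ordered columns.

```python
import csv
import io

# Cluster IDs listed in the description for CD4 T cells, kept as strings
# because csv.DictReader returns every field as text.
CD4_CLUSTERS = {"0", "3", "4", "28"}


def select_cells(handle, clusters):
    """Return barcodes (the 'Loupe' column) of cells in the given clusters."""
    reader = csv.DictReader(handle)
    return [row["Loupe"] for row in reader if row["seurat_clusters"] in clusters]


# Hypothetical miniature stand-in for PBMC7_AllCells_meta.csv.
toy_meta = io.StringIO(
    "Loupe,seurat_clusters,celltypes\n"
    "AAAC-1,0,typeA\n"
    "AAAG-1,7,typeB\n"
    "AACC-1,28,typeA\n"
)
cd4_barcodes = select_cells(toy_meta, CD4_CLUSTERS)
```

The resulting barcodes could then be matched against the rows of the CD4-only coordinate files to confirm that the subset agrees with the one used in the publication.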
