80 datasets found

Transcription start site analysis for heterogenous CD4+ T cells using 5′...
data.niaid.nih.gov
datadryad.org
zip
Updated Apr 22, 2024
Cite
Akiko Oguchi; Yasuhiro Murakawa (2024). Transcription start site analysis for heterogenous CD4+ T cells using 5′ scRNA-seq [Dataset]. http://doi.org/10.5061/dryad.gtht76hv9
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.gtht76hv9
Dataset updated
Apr 22, 2024
Dataset provided by
RIKEN Center for Integrative Medical Sciences
Authors
Akiko Oguchi; Yasuhiro Murakawa
License
https://spdx.org/licenses/CC0-1.0.html
Description
These datasets were generated by ReapTEC (read-level pre-filtering and transcribed enhancer call) using 5′ single-cell RNA-seq data on human heterogeneous CD4+ T cells. By taking advantage of a unique “cap signature” derived from the 5′-end of a transcript, ReapTEC simultaneously profiles gene expression and enhancer activity at nucleotide resolution using 5′-end single-cell RNA-sequencing (5′ scRNA-seq). Details of the ReapTEC pipeline are described at https://github.com/MurakawaLab/ReapTEC.
ProjecTILs murine reference atlas of tumor-infiltrating T cells, version 1
figshare.com
application/gzip
Updated Jun 29, 2023
Cite
Massimo Andreatta; Santiago Carmona (2023). ProjecTILs murine reference atlas of tumor-infiltrating T cells, version 1 [Dataset]. http://doi.org/10.6084/m9.figshare.12478571.v2
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.12478571.v2
Dataset updated
Jun 29, 2023
Dataset provided by
figshare
Authors
Massimo Andreatta; Santiago Carmona
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We have developed ProjecTILs, a computational approach to project new data sets into a reference map of T cells, enabling their direct comparison in a stable, annotated system of coordinates. Because new cells are embedded in the same space as the reference, ProjecTILs enables the classification of query cells into annotated, discrete states, but also over a continuous space of intermediate states. By comparing multiple samples over the same map, and across alternative embeddings, the method allows exploring the effect of cellular perturbations (e.g. as the result of therapy or genetic engineering) and identifying genetic programs significantly altered in the query compared to a control set or to the reference map. We illustrate the projection of several data sets from recent publications over two cross-study murine T cell reference atlases: the first describing tumor-infiltrating T lymphocytes (TILs), the second characterizing acute and chronic viral infection.

To construct the reference TIL atlas, we obtained single-cell gene expression matrices from the following GEO entries: GSE124691, GSE116390, GSE121478, GSE86028; and entry E-MTAB-7919 from ArrayExpress. Data from GSE124691 contained samples from tumor and from tumor-draining lymph nodes, and were therefore treated as two separate datasets. For the TIL projection examples (OVA Tet+, miR-155 KO and Regnase-KO), we obtained the gene expression counts from entries GSE122713, GSE121478 and GSE137015, respectively.

Prior to dataset integration, single-cell data from individual studies were filtered using TILPRED-1.0 (https://github.com/carmonalab/TILPRED), which removes cells not enriched in T cell markers (e.g. Cd2, Cd3d, Cd3e, Cd3g, Cd4, Cd8a, Cd8b1) and cells enriched in non-T cell genes (e.g. Spi1, Fcer1g, Csf1r, Cd19). Dataset integration was performed using STACAS (https://github.com/carmonalab/STACAS), a batch-correction algorithm based on Seurat 3.

For the TIL reference map, we specified 600 variable genes per dataset, excluding cell cycling genes, mitochondrial, ribosomal and non-coding genes, as well as genes expressed in less than 0.1% or more than 90% of the cells of a given dataset. For integration, a total of 800 variable genes were derived as the intersection of the 600 variable genes of individual datasets, prioritizing genes found in multiple datasets and, in case of ties, those derived from the largest datasets. We determined pairwise dataset anchors using STACAS with default parameters, and filtered anchors using an anchor score threshold of 0.8. Integration was performed using the IntegrateData function in Seurat 3, providing the anchor set determined by STACAS, and a custom integration tree to initiate alignment from the largest and most heterogeneous datasets.

Next, we performed unsupervised clustering of the integrated cell embeddings using the Shared Nearest Neighbor (SNN) clustering method implemented in Seurat 3 with parameters {resolution=0.6, reduction="umap", k.param=20}. We then manually annotated individual clusters (merging clusters when necessary) based on several criteria: i) average expression of key marker genes in individual clusters; ii) gradients of gene expression over the UMAP representation of the reference map; iii) gene-set enrichment analysis to determine over- and under-expressed genes per cluster using MAST. In order to have access to predictive methods for UMAP, we recomputed PCA and UMAP embeddings independently of Seurat 3 using, respectively, the prcomp function from the base R package “stats” and the “umap” R package (https://github.com/tkonopka/umap).
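The prevalence filter mentioned in this description (dropping genes expressed in less than 0.1% or more than 90% of a dataset's cells) can be illustrated with a small plain-Python sketch. This is not the authors' R code: the function name `filter_genes_by_prevalence`, the dict-of-lists input layout, and the rule that a count above zero means "expressed" are all assumptions made for illustration.

```python
def filter_genes_by_prevalence(counts, min_frac=0.001, max_frac=0.90):
    """Return genes whose fraction of expressing cells lies in [min_frac, max_frac].

    counts: dict mapping gene name -> list of per-cell counts (hypothetical layout).
    A cell is treated as expressing a gene when its count is > 0 (an assumption).
    """
    kept = []
    for gene, values in counts.items():
        if not values:
            continue  # no cells recorded, nothing to evaluate
        frac = sum(1 for v in values if v > 0) / len(values)
        if min_frac <= frac <= max_frac:
            kept.append(gene)
    return kept


# Toy data: three genes measured in ten cells (values are made up).
toy = {
    "Cd3e": [1, 2, 0, 3, 0, 0, 0, 1, 0, 0],  # expressed in 40% of cells
    "Actb": [5, 4, 6, 7, 3, 2, 8, 9, 4, 5],  # expressed in 100% of cells
    "Rare": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # expressed in 0% of cells
}
kept_genes = filter_genes_by_prevalence(toy)
print(kept_genes)
```

With these toy values only `Cd3e` falls inside the allowed prevalence window; the ubiquitous and the never-detected gene are both dropped.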
Data from: Large-scale integration of single-cell transcriptomic data...
data.niaid.nih.gov
dataone.org
+1more
zip
Updated Dec 14, 2021
Cite
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2021). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.t4b8gtj34
Dataset updated
Dec 14, 2021
Dataset provided by
Cornell University
Authors
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove
License
https://spdx.org/licenses/CC0-1.0.html
Description
Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. 
We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

Methods

Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (Mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute on Aging (NIA) Rodent Aging Colony and were used at 20 months of age. For new scRNAseq experiments, female mice were used in each experiment.

Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).

Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 10⁶ cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced on a NextSeq 500 (Illumina; San Diego, CA) using 75-cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.

Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL/6J mice were injected with 10 µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using a Zeiss PALM MicroBeam laser capture microdissection system, and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150-cycle high output kits (Read 1=28bp, Read 2=120bp, Index 1=10bp, and Index 2=10bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loop Browser v4.0.0 software (10x Genomics). Sequencing data was then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline to generate a feature-by-spot-barcode expression matrix (10x Genomics).

Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).

Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCounts and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset individually. BCmvn optimization was used for pK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default pN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch correction with three tools independently: Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.
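The per-cell quality thresholds stated in this paragraph (at least 750 features, at least 1000 transcripts, and no more than 30% mitochondrial transcripts) can be expressed as a short plain-Python check. This is a sketch only, not the Seurat code used by the authors; the function name `passes_qc`, the tuple layout of the toy cells, and the toy numbers are hypothetical.

```python
def passes_qc(n_features, n_transcripts, n_mito_transcripts,
              min_features=750, min_transcripts=1000, max_mito_frac=0.30):
    """Return True if a cell survives the thresholds quoted in the text.

    A cell is removed when it has fewer than `min_features` detected features,
    fewer than `min_transcripts` transcripts, or a mitochondrial fraction
    above `max_mito_frac`.
    """
    if n_transcripts <= 0:
        return False  # avoid dividing by zero for empty barcodes
    if n_features < min_features or n_transcripts < min_transcripts:
        return False
    return n_mito_transcripts / n_transcripts <= max_mito_frac


# Toy cells: (barcode, features, transcripts, mitochondrial transcripts).
cells = [
    ("cell_a", 1200, 3000, 150),   # passes: 5% mitochondrial
    ("cell_b", 600, 2500, 100),    # fails: too few features
    ("cell_c", 900, 800, 40),      # fails: too few transcripts
    ("cell_d", 1500, 4000, 1600),  # fails: 40% mitochondrial
]
kept_cells = [name for name, nf, nt, nm in cells if passes_qc(nf, nt, nm)]
print(kept_cells)
```

Exposing the thresholds as keyword arguments makes it easy to see how sensitive the retained cell count is to each cutoff.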

Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.

Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).
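Both the clustering and the pseudotime steps rely on "dimensions accounting for 95% of the total variance". One way to read that rule is sketched below in plain Python. The helper name `n_dims_for_variance` is hypothetical, and the sketch assumes the per-component variances are already sorted in decreasing order and that "total variance" means the sum over the components that were computed.

```python
def n_dims_for_variance(variances, target=0.95):
    """Smallest number of leading components whose variance reaches `target`.

    variances: per-component variances, assumed sorted from largest to smallest.
    """
    total = sum(variances)
    if total <= 0:
        raise ValueError("variances must sum to a positive number")
    cumulative = 0.0
    for n_dims, variance in enumerate(variances, start=1):
        cumulative += variance
        if cumulative / total >= target:
            return n_dims
    return len(variances)


# Toy spectrum: cumulative shares are 50%, 80%, 90%, 96%, 99%, 100%.
toy_variances = [50, 30, 10, 6, 3, 1]
n_dims = n_dims_for_variance(toy_variances)
print(n_dims)
```

For the toy spectrum the first four components are kept, since three components reach only 90% of the variance.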

Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myocytes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using
Processed naive T cell single-cell RNA-seq, Seurat object
figshare.com
application/gzip
Updated Jan 5, 2021
Cite
Daniel Bunis (2021). Processed naive T cell single-cell RNA-seq, Seurat object [Dataset]. http://doi.org/10.6084/m9.figshare.11886891.v2
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.11886891.v2
Dataset updated
Jan 5, 2021
Dataset provided by
figshare
Authors
Daniel Bunis
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Processed naive CD4 and CD8 T cell single-cell RNAseq data from human samples. The file contains a Seurat object stored as an .rds file which can be read into R with the readRDS() function. It was generated using the raw data of similar name in this project, as well as the code stored here: https://github.com/dtm2451/ProgressiveHematopoiesis
Processed Seurat objects for GeneTrajectory inference (Gene Trajectory...
figshare.com
application/gzip
Updated Feb 19, 2024
Cite
Rihao Qu; Peggy Myung (2024). Processed Seurat objects for GeneTrajectory inference (Gene Trajectory Inference for Single-cell Data by Optimal Transport Metrics) [Dataset]. http://doi.org/10.6084/m9.figshare.25243225.v1
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.25243225.v1
Dataset updated
Feb 19, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
Rihao Qu; Peggy Myung
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These are processed Seurat objects for the two biological datasets in GeneTrajectory inference (https://github.com/KlugerLab/GeneTrajectory/):

Human myeloid dataset analysis

Myeloid cells were extracted from a publicly available 10x scRNA-seq dataset (https://support.10xgenomics.com/single-cell-gene-expression/datasets/3.0.0/pbmc_10k_v3). QC was performed using the same workflow as in (https://github.com/satijalab/Integration2019/blob/master/preprocessing_scripts/pbmc_10k_v3.R). After standard normalization, highly-variable gene selection and scaling using the Seurat R package, we applied PCA and retained the top 30 principal components. Four sub-clusters of myeloid cells were identified based on Louvain clustering with a resolution of 0.3. The Wilcoxon rank-sum test was employed to find cluster-specific gene markers for cell type annotation.

For gene trajectory inference, we first applied Diffusion Map on the cell PC embedding (using a local-adaptive kernel, where each bandwidth is determined by the distance to its k-nearest neighbor, k = 10) to generate a spectral embedding of cells. We constructed a cell-cell kNN (k = 10) graph based on their coordinates of the top 5 non-trivial Diffusion Map eigenvectors. Among the top 2,000 variable genes, genes expressed by 0.5% − 75% of cells were retained for pairwise gene-gene Wasserstein distance computation. The original cell graph was coarse-grained into a graph of size 1,000. We then built a gene-gene graph where the affinity between genes is transformed from the Wasserstein distance using a Gaussian kernel (local-adaptive, k = 5). Diffusion Map was employed to visualize the embedding of the gene graph. For trajectory identification, we used a series of time steps (11,21,8) to extract three gene trajectories.

Mouse embryo skin data analysis

We separated out dermal cell populations from the newly collected mouse embryo skin samples. Cells from the wild-type and the Wls mutant were pooled for analyses. After standard normalization, highly-variable gene selection and scaling using Seurat, we applied PCA and retained the top 30 principal components. Three dermal cell types were stratified based on the expression of canonical dermal markers, including Sox2, Dkk1, and Dkk2.

For gene trajectory inference, we first applied Diffusion Map on the cell PC embedding (using a local-adaptive kernel bandwidth, k = 10) to generate a spectral embedding of cells. We constructed a cell-cell kNN (k = 10) graph based on their coordinates of the top 10 non-trivial Diffusion Map eigenvectors. Among the top 2,000 variable genes, genes expressed by 1% − 50% of cells were retained for pairwise gene-gene Wasserstein distance computation. The original cell graph was coarse-grained into a graph of size 1,000. We then built a gene-gene graph where the affinity between genes is transformed from the Wasserstein distance using a Gaussian kernel (local-adaptive, k = 5). Diffusion Map was employed to visualize the embedding of the gene graph. For trajectory identification, we used a series of time steps (9,16,5) to sequentially extract three gene trajectories. To compare the differences between the wild-type and the Wls mutant, we stratified Wnt-active UD cells into seven stages according to their expression profiles of the genes binned along the DC gene trajectory.
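The description mentions turning pairwise distances into affinities with a local-adaptive Gaussian kernel whose bandwidth comes from the distance to the k-th nearest neighbor. The plain-Python sketch below shows one common form of such a kernel, exp(-d_ij² / (σ_i·σ_j)). That exact formula is an assumption and may differ from what GeneTrajectory implements; the function name `adaptive_gaussian_affinity` and the toy distance matrix are hypothetical as well.

```python
import math


def adaptive_gaussian_affinity(dist, k=5):
    """Convert a symmetric distance matrix into affinities.

    Each point i gets a bandwidth sigma_i equal to the distance to its k-th
    nearest neighbour (self excluded); the affinity is
    exp(-d_ij**2 / (sigma_i * sigma_j)), an assumed kernel form.
    """
    n = len(dist)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    sigma = []
    for i in range(n):
        neighbours = sorted(dist[i][j] for j in range(n) if j != i)
        sigma.append(neighbours[k - 1])
    if any(s == 0 for s in sigma):
        raise ValueError("zero bandwidth: duplicate points need special handling")
    return [
        [math.exp(-dist[i][j] ** 2 / (sigma[i] * sigma[j])) for j in range(n)]
        for i in range(n)
    ]


# Toy example: four points on a line at positions 0, 1, 2 and 4.
positions = [0.0, 1.0, 2.0, 4.0]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
affinity = adaptive_gaussian_affinity(toy_dist, k=1)
print(affinity[0][1], affinity[2][3])
```

Because the bandwidth adapts to local density, the isolated point at position 4 gets a wider kernel than the tightly spaced points, which keeps it connected to the rest of the graph.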
Data from: A Single-Cell Tumor Immune Atlas for Precision Oncology
zenodo.org
bin, csv
Updated Mar 31, 2022
+ more versions
Cite
Paula Nieto (2022). A Single-Cell Tumor Immune Atlas for Precision Oncology [Dataset]. http://doi.org/10.5281/zenodo.4263972
Available download formats: bin, csv
Unique identifier
https://doi.org/10.5281/zenodo.4263972
Dataset updated
Mar 31, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Paula Nieto
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Preprint version of the Single-Cell Tumor Immune Atlas

This upload contains:

TICAtlas.rds: an rds file containing a Seurat object with the whole Atlas (317111 cells, RNA and integrated assays, PCA and UMAP reductions)

TICAtlas.h5ad: an h5ad file with the whole Atlas (317111 cells, RNA assay, PCA and UMAP)

TICAtlas_RNA.rds: an rds file containing a Seurat object of the whole Atlas but only the RNA assay (317111 cells, UMAP embedding)

TICAtlas_downsampled_1000.rds: an rds file containing a downsampled version of the Seurat object of the whole Atlas (24834 cells, RNA and integrated assay, PCA and UMAP reductions)

TICAtlas_downsampled_1000.h5ad: an h5ad file containing a downsampled version of the whole Atlas (24834 cells, RNA assay, PCA and UMAP reductions)

TICAtlas_metadata.csv: a comma-separated text file with the metadata for each of the cells

For the h5ad files, the .X slot contains the normalized data, while the .X.raw slot contains the raw counts as they were in the original datasets.

All the files contain the following patient/sample metadata variables:

patient: assigned patient identifiers

gender: the patient's gender (male/female/unknown)

source: dataset of origin

subtype: cancer type (abbreviations as indicated in the preprint)

cluster_kmeans_k6: patient clusters, NA if filtered out

cell_type: annotated cell type for each of the cells

If you have any issues with the metadata you can use the TICAtlas_metadata.csv file.

For more information, read our preprint and check our GitHub.

h5ad files can be read with Python using Scanpy, rds files can be read in R using Seurat. For format conversion between AnnData and Seurat we recommend SeuratDisk. For other single-cell data formats you can use sceasy.
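For a quick look at the annotations without loading the full objects, the metadata CSV can be tallied with the Python standard library alone. The sketch below is a minimal illustration: the helper name `count_cells_by` is hypothetical, the column names are taken from the variable list above on the assumption that they appear as the CSV header, and the sample rows are invented. For the real file one would pass `open("TICAtlas_metadata.csv", newline="")` instead of the in-memory sample.

```python
import csv
import io
from collections import Counter


def count_cells_by(metadata_file, column="cell_type"):
    """Count rows (cells) per value of `column` in a metadata CSV.

    metadata_file: any open text file or file-like object with a header row.
    """
    reader = csv.DictReader(metadata_file)
    return Counter(row[column] for row in reader)


# Invented rows that mimic the documented metadata variables.
sample = io.StringIO(
    "patient,gender,source,subtype,cluster_kmeans_k6,cell_type\n"
    "P1,female,study_A,TYPE_A,1,T cell\n"
    "P1,female,study_A,TYPE_A,1,B cell\n"
    "P2,male,study_B,TYPE_B,NA,T cell\n"
)
counts = count_cells_by(sample)
print(counts.most_common())
```

Passing `column="subtype"` or `column="source"` gives the same kind of tally per cancer type or per dataset of origin.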
Azimuth Reference - Human Adipose
zenodo.org
explore.openaire.eu
bin
Updated Sep 1, 2022
Cite
Satija Lab (2022). Azimuth Reference - Human Adipose [Dataset]. http://doi.org/10.5281/zenodo.7032920
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.7032920
Dataset updated
Sep 1, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Satija Lab
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Here we provide the reference data files used to run the Azimuth Human Adipose reference web application. For a full description of the reference data structure, please see the wiki at https://github.com/satijalab/azimuth/wiki/Azimuth-Reference-Format.
Processed CODEX Data (Seurat Objects)
plus.figshare.com
application/gzip
Updated Apr 12, 2024
Cite
Shovik Bandyopadhyay; Jonathan Sussman; Kyung Jin Ahn; Kai Tan (2024). Processed CODEX Data (Seurat Objects) [Dataset]. http://doi.org/10.25452/figshare.plus.25127657.v1
Available download formats: application/gzip
Unique identifier
https://doi.org/10.25452/figshare.plus.25127657.v1
Dataset updated
Apr 12, 2024
Dataset provided by
Figshare+
Authors
Shovik Bandyopadhyay; Jonathan Sussman; Kyung Jin Ahn; Kai Tan
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Seurat objects containing the raw and normalized data for:

Normal bone marrow (NBM) atlas: contains all cells obtained through segmentation after filtering and QC. Includes coarse and fine levels of annotation that were obtained through an iterative process of subclustering. Neighborhood analysis results are included as a metadata column. Also includes additional Osteo-MSC and Fibro-MSC cells that were manually annotated.

AML/NSM CODEX data: contains all cells after filtering for 3 diagnostic and 2 post-therapy AML samples as well as 3 negative staging marrow samples. Cell labels were derived through reciprocal principal component analysis (RPCA) reference mapping onto the normal bone marrow atlas. Neighborhood analysis was conducted separately for AML Diagnostic, AML Post-Therapy, and NSM samples. Neighborhoods were manually annotated for each set. The results of the neighborhood analysis were merged and included in the metadata of the Seurat object.

All normalized data is stored in the Seurat assay object. Markers that were not included in normalization and downstream analysis are included with raw values as a metadata column. Full source code used to generate these objects can be found on GitHub: https://github.com/shovikb94/spatial-bonemarrow-atlas/tree/main

See related materials in Collection at: https://doi.org/10.25452/figshare.plus.c.7174914
Azimuth Reference - Mouse Motor Cortex
explore.openaire.eu
Updated Feb 17, 2021
Cite
Satija Lab (2021). Azimuth Reference - Mouse Motor Cortex [Dataset]. http://doi.org/10.5281/zenodo.4546935
Unique identifier
https://doi.org/10.5281/zenodo.4546935
Dataset updated
Feb 17, 2021
Authors
Satija Lab
Description
Here we provide the reference data files used to run the Azimuth Mouse Motor Cortex reference web application. For a full description of the reference data structure, please see the wiki at https://github.com/satijalab/azimuth/wiki/Azimuth-Reference-Format.
cellCounts
opal.latrobe.edu.au
researchdata.edu.au
bin
Updated Dec 19, 2022
Cite
Yang Liao; Dinesh Raghu; Bhupinder Pal; Lisa Mielke; Wei Shi (2022). cellCounts [Dataset]. http://doi.org/10.26181/21588276.v3
Available download formats: bin
Unique identifier
https://doi.org/10.26181/21588276.v3
Dataset updated
Dec 19, 2022
Dataset provided by
La Trobe
Authors
Yang Liao; Dinesh Raghu; Bhupinder Pal; Lisa Mielke; Wei Shi
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This page includes the data and code necessary to reproduce the results of the following paper: Yang Liao, Dinesh Raghu, Bhupinder Pal, Lisa Mielke and Wei Shi. cellCounts: fast and accurate quantification of 10x Chromium single-cell RNA sequencing data. Under review. A Linux computer running an operating system of CentOS 7 (or later) or Ubuntu 20.04 (or later) is recommended for running this analysis. The computer should have >2 TB of disk space and >64 GB of RAM. The following software packages need to be installed before running the analysis. Software executables generated after installation should be included in the $PATH environment variable.

R (v4.0.0 or newer) https://www.r-project.org/
Rsubread (v2.12.2 or newer) http://bioconductor.org/packages/3.16/bioc/html/Rsubread.html
CellRanger (v6.0.1) https://support.10xgenomics.com/single-cell-gene-expression/software/overview/welcome
STARsolo (v2.7.10a) https://github.com/alexdobin/STAR
sra-tools (v2.10.0 or newer) https://github.com/ncbi/sra-tools
Seurat (v3.0.0 or newer) https://satijalab.org/seurat/
edgeR (v3.30.0 or newer) https://bioconductor.org/packages/edgeR/
limma (v3.44.0 or newer) https://bioconductor.org/packages/limma/
mltools (v0.3.5 or newer) https://cran.r-project.org/web/packages/mltools/index.html

Reference packages generated by 10x Genomics are also required for this analysis. They can be downloaded from the following link (the 2020-A version of the individual human and mouse reference packages should be selected): https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest

Once all of these are in place, you can simply run the shell script ‘test-all-new.bash’ to perform all the analyses carried out in the paper. This script will automatically download the mixture scRNA-seq data from the SRA database, and it will output a text file called ‘test-all.log’ that contains all the screen outputs and speed/accuracy results of CellRanger, STARsolo and cellCounts.
Seurat object subset of mouse liver scRNAseq data (Guilliams et al., Cell...
zenodo.org
bin
Updated Jan 12, 2022
Cite
Browaeys Robin (2022). Seurat object subset of mouse liver scRNAseq data (Guilliams et al., Cell 2022) [Dataset]. http://doi.org/10.5281/zenodo.5840787
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.5840787
Dataset updated
Jan 12, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Browaeys Robin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Seurat object containing a subset of the mouse liver scRNAseq data (Guilliams et al., Cell 2022)

These data are used only for demonstration purposes, namely to demonstrate the Differential NicheNet pipeline: https://github.com/saeyslab/nichenetr/blob/master/vignettes/differential_nichenet.md
Data from: Pre-ciliated tubal epithelial cells are prone to initiation of...
data.niaid.nih.gov
datadryad.org
zip
Updated Oct 17, 2024
Cite
Coulter Ralston; Alexander Nikitin; Benjamin Cosgrove (2024). Pre-ciliated tubal epithelial cells are prone to initiation of high-grade serous ovarian carcinoma [Dataset]. http://doi.org/10.5061/dryad.4mw6m90hm
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.4mw6m90hm
Dataset updated
Oct 17, 2024
Dataset provided by
Cornell University
Authors
Coulter Ralston; Alexander Nikitin; Benjamin Cosgrove
License
https://spdx.org/licenses/CC0-1.0.html
Description
The distal region of the uterine (Fallopian) tube is commonly associated with high-grade serous carcinoma (HGSC), the predominant and most aggressive form of ovarian or extra-uterine cancer. Specific cell states and lineage dynamics of the adult tubal epithelium (TE) remain insufficiently understood, hindering efforts to determine the cell of origin for HGSC. Here, we report a comprehensive census of cell types and states of the mouse uterine tube. We show that distal TE cells expressing the stem/progenitor cell marker Slc1a3 can differentiate into both secretory (Ovgp1+) and ciliated (Fam183b+) cells. Inactivation of Trp53 and Rb1, whose pathways are commonly altered in HGSC, leads to elimination of targeted Slc1a3+ cells by apoptosis, thereby preventing their malignant transformation. In contrast, pre-ciliated cells (Krt5+, Prom1+, Trp73+) remain cancer-prone and give rise to serous tubal intraepithelial carcinomas and overt HGSC. These findings identify transitional pre-ciliated cells as a previously unrecognized cancer-prone cell state and point to pre-ciliation mechanisms as novel diagnostic and therapeutic targets.

Methods

Single-cell RNA-sequencing library preparation

For TE single-cell expression and transcriptome analysis, we isolated TE from C57BL6 adult estrous female mice. In 3 independent experiments, a total of 62 uterine tubes were collected. Each uterine tube was placed in sterile PBS containing 100 IU ml⁻¹ of penicillin and 100 µg ml⁻¹ streptomycin (Corning, 30-002-Cl), and separated into distal and proximal regions. Tissues from the same region were combined in a 40 µl drop of the same PBS solution, cut open lengthwise, and minced into 1.5-2.5 mm pieces with 25G needles. Minced tissues were transferred with the help of a sterile wide-bore 200 µl pipette tip into a 1.8 ml cryo vial containing 1.2 ml A-mTE-D1 (300 IU ml⁻¹ collagenase IV mixed with 100 IU ml⁻¹ hyaluronidase; Stem Cell Technologies, 07912, in DMEM Ham’s F12, Hyclone, SH30023.FS). Tissues were incubated with a loose cap for 1 h at 37°C in a 5% CO2 incubator. During the incubation, tubes were taken out 4 times and tissues were suspended with a wide-bore 200 µl pipette tip. At the end of incubation, the tissue-cell suspension from each tube was transferred into 1 ml TrypLE (Invitrogen, 12604013) pre-warmed to 37°C and suspended 70 times with a 1000 µl pipette tip; 5 ml A-SM [DMEM Ham’s F12 containing 2% fetal bovine serum (FBS)] was added to the mix, and TE cells were pelleted by centrifugation at 300 × g for 10 minutes at 25°C. Pellets were then suspended in 1 ml A-mTE-D2 pre-warmed to 37°C (7 mg ml⁻¹ Dispase II, Worthington NPRO2, and 10 µg ml⁻¹ Deoxyribonuclease I, Stem Cell Technologies, 07900), and mixed 70 times with a 1000 µl pipette tip. 5 ml A-mTE-D2 was added, and samples were passed through a 40 µm cell strainer and pelleted by centrifugation at 300 × g for 7 minutes at 4°C. Pellets were suspended in 100 µl microbeads per 10⁷ total cells or fewer, and dead cells were removed with the Dead Cell Removal Kit (Miltenyi Biotec, 130-090-101) according to the manufacturer’s protocol.
Pelleted live cell fractions were collected in 1.5 ml low-binding centrifuge tubes, kept on ice, and suspended in 50 µl ice-cold A-Ri-Buffer (5% FBS, 1% GlutaMAX-I, Invitrogen, 35050-079, 9 µM Y-27632, Millipore, 688000, and 100 IU ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin in DMEM Ham’s F12). Cell aliquots were stained with trypan blue for live and dead cell calculation. Live cell preparations with a target cell recovery of 5,000-6,000 were loaded on a Chromium Controller (10X Genomics, Single Cell 3’ v2 chemistry) to perform single-cell partitioning and barcoding using the microfluidic platform device. After preparation of barcoded next-generation sequencing cDNA libraries, samples were sequenced on an Illumina NextSeq500 System.

Download and alignment of single-cell RNA sequencing data

For sequence alignment, a custom reference for mm39 was built using the cellranger (v6.1.2, 10x Genomics) mkref function. The mm39.fa soft-masked assembly sequence and the mm39.ncbiRefSeq.gtf (release 109) genome annotation last updated 2020-10-27 were used to form the custom reference. The raw sequencing reads were aligned to the custom reference and quantified using the cellranger count function.

Preprocessing and batch correction

All preprocessing and data analysis was conducted in R (v.4.1.1 (2021-08-10)). The cellranger count outs were first modified with the autoEstCont and adjustCounts functions from SoupX (v.1.6.1) to output a corrected matrix with the ambient RNA signal (soup) removed (https://github.com/constantAmateur/SoupX). To preprocess the corrected matrices, the Seurat (v.4.1.1) NormalizeData, FindVariableFeatures, ScaleData, RunPCA, FindNeighbors, and RunUMAP functions were used to create a Seurat object for each sample (https://github.com/satijalab/seurat). The number of principal components used to construct a shared nearest-neighbor graph was chosen to account for 95% of the total variance. To detect possible doublets, we used the package DoubletFinder (v.2.0.3) with inputs specific to each Seurat object. DoubletFinder creates artificial doublets and calculates the proportion of artificial k nearest neighbors (pANN) for each cell from a merged dataset of the artificial and actual data. To maximize DoubletFinder’s predictive power, the mean-variance normalized bimodality coefficient (BCMVN) was used to determine the optimal pK value for each dataset. To establish a threshold for pANN values to distinguish between singlets and doublets, the estimated multiplet rates for each sample were calculated by interpolating between the target cell recovery values according to the 10x Chromium user manual. Homotypic doublets were identified using unannotated Seurat clusters in each dataset with the modelHomotypic function. After doublets were identified, all distal and proximal samples were merged separately. Cells with greater than 30% mitochondrial genes, cells with fewer than 750 nCount_RNA, and cells with fewer than 200 nFeature_RNA were removed from the merged datasets. To correct for any batch effects between sample runs, we used the harmony (v.0.1.0) integration method (https://github.com/immunogenomics/harmony).
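The cell-level filter described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' Seurat code: the thresholds come from the description, while the per-cell records, the `percent_mt` field name, and the cell identifiers are hypothetical.

```python
# Thresholds taken from the description above.
MAX_PERCENT_MT = 30.0     # remove cells with more than 30% mitochondrial content
MIN_NCOUNT_RNA = 750      # remove cells with fewer than 750 counts
MIN_NFEATURE_RNA = 200    # remove cells with fewer than 200 detected features


def passes_qc(cell):
    """Return True if a cell survives all three filters."""
    return (
        cell["percent_mt"] <= MAX_PERCENT_MT
        and cell["nCount_RNA"] >= MIN_NCOUNT_RNA
        and cell["nFeature_RNA"] >= MIN_NFEATURE_RNA
    )


# Hypothetical per-cell QC metrics, for illustration only.
cells = [
    {"id": "c1", "percent_mt": 5.0, "nCount_RNA": 2000, "nFeature_RNA": 900},
    {"id": "c2", "percent_mt": 42.0, "nCount_RNA": 3000, "nFeature_RNA": 1200},
    {"id": "c3", "percent_mt": 10.0, "nCount_RNA": 600, "nFeature_RNA": 350},
    {"id": "c4", "percent_mt": 30.0, "nCount_RNA": 750, "nFeature_RNA": 200},
    {"id": "c5", "percent_mt": 8.0, "nCount_RNA": 900, "nFeature_RNA": 150},
]

kept_ids = [cell["id"] for cell in cells if passes_qc(cell)]
print(kept_ids)
```

Cells sitting exactly on a threshold (such as `c4`) are kept here, because the description removes only cells "greater than" or "fewer than" the stated values.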

Clustering parameters and annotations

After merging the datasets and batch correction, the dimensions reflecting 95% of the total variance were input into Seurat’s FindNeighbors function with a k.param of 70. Louvain clustering was then conducted using Seurat’s FindClusters with a resolution of 0.7. The resulting 19 clusters were annotated based on the expression of canonical genes and the results of differential gene expression (Wilcoxon Rank Sum test) analysis. One cluster expressing lymphatic and epithelial markers was omitted from later analysis, as it only contained 2 cells suspected to be doublets. To better understand the epithelial populations, we reclustered 6 epithelial populations and reapplied harmony batch correction. For this subset, FindNeighbors was run with a k.param of 50 and FindClusters with a resolution of 0.7. The resulting 9 clusters within the epithelial subset were further annotated using differential expression analysis and canonical markers.
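One way to pick "the dimensions reflecting 95% of the total variance" is to take the smallest number of leading principal components whose cumulative variance reaches that fraction. The sketch below is a plain-Python illustration under stated assumptions: the helper name `n_pcs_for_variance` and the standard deviations are hypothetical, and "total variance" is taken relative to the components that were computed, which may differ from the authors' definition.

```python
from itertools import accumulate


def n_pcs_for_variance(stdevs, target=0.95):
    """Smallest number of leading PCs whose variance fraction reaches `target`."""
    variances = [s * s for s in stdevs]   # variance of a PC is its stdev squared
    total = sum(variances)
    for n, cumulative in enumerate(accumulate(variances), start=1):
        if cumulative / total >= target:
            return n
    return len(variances)


# Hypothetical per-component standard deviations, sorted in decreasing order.
example_stdevs = [4.0, 3.0, 2.0, 1.0, 1.0]
n_pcs = n_pcs_for_variance(example_stdevs, target=0.95)
print(n_pcs)
```

With these illustrative values the cumulative fractions are 16/31, 25/31, 29/31 and 30/31, so the fourth component is the first to cross 95%.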

Pseudotime analysis

Potential of heat diffusion for affinity-based transition embedding (PHATE) is a dimensionality reduction method designed to more accurately visualize continual progressions found in biological data [35]. A modified version of Seurat (v4.1.1) was developed to include the ‘RunPHATE’ function for converting a Seurat object to a PHATE embedding. This was built on the phateR package (v.1.0.7) (https://github.com/scottgigante/seurat/tree/patch/add-PHATE-again). In addition to PHATE, pseudotime values were calculated with Monocle3 (v.1.2.7), which computes trajectories with an origin set by the user [36, 55–57]. The origin was set to be a progenitor cell state confirmed with lineage tracing experiments.

35. Moon, K. R. et al. Visualizing structure and transitions in high-dimensional biological data. Nature Biotechnology 37, 1482–1492 (2019). doi:10.1038/s41587-019-0336-3
36. Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019). doi:10.1038/s41586-019-0969-x
55. Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nature Biotechnology 32, 381–386 (2014). doi:10.1038/nbt.2859
56. Qiu, X. et al. Single-cell mRNA quantification and differential analysis with Census. Nature Methods 14, 309–315 (2017). doi:10.1038/nmeth.4150
57. Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nature Methods 14, 979–982 (2017). doi:10.1038/nmeth.4402
Processed HSPCs single-cell RNA-seq, Seurat object
figshare.com
application/gzip
Updated Jan 5, 2021
Cite
Daniel Bunis (2021). Processed HSPCs single-cell RNA-seq, Seurat object [Dataset]. http://doi.org/10.6084/m9.figshare.11894691.v2
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.11894691.v2
Dataset updated
Jan 5, 2021
Dataset provided by
figshare
Authors
Daniel Bunis
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Processed hematopoietic stem and progenitor cell (HSPC) single-cell RNAseq data from human samples. The file contains a Seurat object stored as an .rds file which can be read into R with the readRDS() function. It was generated using the raw data of similar name in this project, as well as the code stored here: https://github.com/dtm2451/ProgressiveHematopoiesis
Azimuth Reference - Human Pancreas
explore.openaire.eu
Updated Feb 17, 2021
Cite
Satija Lab (2021). Azimuth Reference - Human Pancreas [Dataset]. http://doi.org/10.5281/zenodo.4546925
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.4546925
Dataset updated
Feb 17, 2021
Authors
Satija Lab
Description
Here we provide the reference data files used to run the Azimuth Human Pancreas reference web application. For a full description of the reference data structure, please see the wiki at https://github.com/satijalab/azimuth/wiki/Azimuth-Reference-Format.
Processed Seurat Object of scRNAseq data from wildtype and CaMKK2 KO immune...
zenodo.org
bin
Updated Jun 17, 2022
Cite
William Tomaszewski (2022). Processed Seurat Object of scRNAseq data from wildtype and CaMKK2 KO immune infiltrate of CT2a preclinical murine glioma [Dataset]. http://doi.org/10.5281/zenodo.6654420
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.6654420
Dataset updated
Jun 17, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
William Tomaszewski
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains the processed Seurat objects generated from the raw data deposited at the Gene Expression Omnibus (GEO) under GSE197879.

Details about the experiment and sequencing are available under GSE197879.

Information on how the Seurat objects were created can be found in this GitHub repository: https://github.com/wht10/CT2A_scRNAseq_CaMKK2KOvWT

Notable metadata within each Seurat object:

1. Processed_CD45_Live_Fig2b.rds

Genotype - whether the cell is from a WT or CaMKK2 KO mouse

HTO_maxID - The biological replicate that the cell came from (4 biological replicates per genotype)

MouseID - A concatenation between the genotype and HTO_maxID, providing a unique identifier for each biological replicate

Cell.Type - The cell type annotations for each cell. Can be assigned to "Idents()" to change the name of the cell identities.

Geno.Ident - A concatenation between Genotype and Cell.Type. By re-assigning this to "Idents()", "FindMarkers()" can be used to investigate differentially expressed genes within a cell type between genotypes.

2. Reclustered_TILs_Fig3a.rds

Genotype - whether the cell is from a WT or CaMKK2 KO mouse

HTO_maxID - The biological replicate that the cell came from (4 biological replicates per genotype)

MouseID - A concatenation between the genotype and HTO_maxID, providing a unique identifier for each biological replicate

Celltype - The cell type annotations for each cell. Can be assigned to "Idents()" to change the name of the cell identities.

Geno_Ident - A concatenation between Genotype and Celltype. By re-assigning this to "Idents()", "FindMarkers()" can be used to investigate differentially expressed genes within a cell type between genotypes.
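The relationship between the metadata columns listed above can be sketched as follows. This is a plain-Python illustration of how the concatenated labels relate to their source columns, not code from the repository: the barcodes, HTO values, cell type names, and the "_" separator are all hypothetical.

```python
from collections import defaultdict

# Hypothetical per-cell metadata; column names follow the description above.
cells = [
    {"barcode": "cell1", "Genotype": "WT", "HTO_maxID": "HTO1", "Celltype": "CD8 T"},
    {"barcode": "cell2", "Genotype": "KO", "HTO_maxID": "HTO1", "Celltype": "CD8 T"},
    {"barcode": "cell3", "Genotype": "KO", "HTO_maxID": "HTO2", "Celltype": "Treg"},
]


def annotate(cell, sep="_"):
    """Add the concatenated MouseID and Geno_Ident labels to one cell record."""
    labelled = dict(cell)
    labelled["MouseID"] = sep.join([cell["Genotype"], cell["HTO_maxID"]])
    labelled["Geno_Ident"] = sep.join([cell["Genotype"], cell["Celltype"]])
    return labelled


annotated = [annotate(cell) for cell in cells]

# Group barcodes by Geno_Ident: these are the identity classes one would
# compare (for example WT versus KO within the same cell type).
groups = defaultdict(list)
for cell in annotated:
    groups[cell["Geno_Ident"]].append(cell["barcode"])

print(dict(groups))
```

Grouping by the combined label is what makes a within-cell-type, between-genotype comparison possible: each genotype and cell type pair becomes its own identity class.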
seurat.wnn.peak.rds
figshare.com
application/gzip
Updated Oct 21, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Liran Mao (2024). seurat.wnn.peak.rds [Dataset]. http://doi.org/10.6084/m9.figshare.27265410.v1
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.27265410.v1
Dataset updated
Oct 21, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
Liran Mao
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains the data necessary to reproduce the results from the SpatialMuxSeq vignette (https://rpubs.com/LiranM/SpatialMuxSeq), featured in our paper "Multiplexed Spatial Mapping of Chromatin Features, Transcriptome, and Proteins in Tissues." To ensure full reproducibility of the results, we have provided a Seurat object that includes all omics layers. For further details and access to all relevant code, please visit our GitHub repository: https://github.com/liranmao/Spatial_multi_omics.
WORKSHOP: Single cell RNAseq analysis in R
explore.openaire.eu
Updated Sep 26, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sarah Williams; Adele Barugahare; Paul Harrison; Laura Perlaza Jimenez; Nicholas Matigan; Valentine Murigneux; Magdalena Antczak; Uwe Winter (2023). WORKSHOP: Single cell RNAseq analysis in R [Dataset]. http://doi.org/10.5281/zenodo.10042918
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.10042918
Dataset updated
Sep 26, 2023
Authors
Sarah Williams; Adele Barugahare; Paul Harrison; Laura Perlaza Jimenez; Nicholas Matigan; Valentine Murigneux; Magdalena Antczak; Uwe Winter
Description
This record includes training materials associated with the Australian BioCommons workshop 'Single cell RNAseq analysis in R'. This workshop took place over two 3.5-hour sessions on 26 and 27 October 2023.

Event description

Analysis and interpretation of single cell RNAseq (scRNAseq) data requires dedicated workflows. In this hands-on workshop we will show you how to perform single cell analysis using Seurat, an R package for QC, analysis, and exploration of single-cell RNAseq data. We will discuss the 'why' behind each step and cover reading in the count data, quality control, filtering, normalisation, clustering, UMAP layout and identification of cluster markers. We will also explore various ways of visualising single cell expression data.

This workshop is presented by the Australian BioCommons, Queensland Cyber Infrastructure Foundation (QCIF) and the Monash Genomics and Bioinformatics Platform with the assistance of a network of facilitators from the national Bioinformatics Training Cooperative.

Lead trainers: Sarah Williams, Adele Barugahare, Paul Harrison, Laura Perlaza Jimenez
Facilitators: Nick Matigan, Valentine Murigneux, Magdalena (Magda) Antczak
Infrastructure provision: Uwe Winter
Coordinator: Melissa Burke

Training materials

Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.

Files and materials included in this record:

Event metadata (PDF): Information about the event including description, event URL, learning objectives, prerequisites, technical requirements etc.
Index of training materials (PDF): List and description of all materials associated with this event including the name, format, location and a brief description of each file.
scRNAseq_Schedule (PDF): A breakdown of the topics and timings for the workshop

Materials shared elsewhere:

This workshop follows the tutorial 'scRNAseq Analysis in R with Seurat': https://swbioinf.github.io/scRNAseqInR_Doco/index.html
Slides used to introduce key topics are available via GitHub: https://github.com/swbioinf/scRNAseqInR_Doco/tree/main/slides

This material is based on the introductory Guided Clustering Tutorial from Seurat. It also draws on a similar workshop held by Monash Bioinformatics Platform, Single-Cell-Workshop, with material here.
Data for Cell-type-specific alternative splicing in the cerebral cortex of a...
zenodo.org
explore.openaire.eu
application/gzip
Updated Aug 12, 2024
Cite
Emma F. Jones; Timothy C. Howton; Tabea M. Soelter; Anthony B. Crumley; Brittany N. Lasseigne (2024). Data for Cell-type-specific alternative splicing in the cerebral cortex of a Schinzel-Giedion Syndrome patient variant mouse model [Dataset]. http://doi.org/10.5281/zenodo.12535061
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.12535061
Dataset updated
Aug 12, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Emma F. Jones; Timothy C. Howton; Tabea M. Soelter; Anthony B. Crumley; Brittany N. Lasseigne
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
data.tar.gz contains all files from the data directory (except for sam outputs from STAR) associated with the 230926_EJ_Setbp1_AlternativeSplicing GitHub project and includes the following files:

./marvel: - This directory contains rds and Rdata objects that were created using the MARVEL R package

cell_type_goresults.rds - These are the GO results split by cell type

marvel_04_split_counts.Rdata - This R data includes all environment objects from MARVEL script 04, and is used for downstream plotting

normalized_sj_expression.Rds - This object is the normalized splice junction expression

Setbp1_marvel_aligned.rds - Final prepared MARVEL object before any SJU analyses have been run

significant_tables.RData - For those who do not want to load multiple massive files, this includes all significant SJU results for each cell type

sj_usage_cell_type.rds - This data object has splice junction usage calculated for each cell type

sj_usage_condition.rds - This data object has splice junction usage calculated for each cell type and also split by condition

./seurat: - This directory contains all intermediate and final Seurat single-cell gene expression objects

annotated_brain_samples.rds - This is the final iteration of the processing in Seurat for a final annotated object. Please use this object for any Seurat or single-cell gene expression analyses.

clustered_brain_samples.rds - This is the clustered Seurat object, before cell type annotation based on canonical markers.

filtered_brain_samples_pca.rds - This is the filtered Seurat object, before clustering but after PCA.

filtered_brain_samples.rds - This is the filtered Seurat object, before PCA.

integrated_brain_samples.rds - This is the integrated Seurat object, before other steps.

./star: - All files in the STAR directory are outputs from STARsolo, as described in our methods. Each output directory contains the same files, so only one example is included here for brevity. Intermediate SAM files were removed to optimize space.

J1/ - This directory contains outputs for brain sample J1

J13/ - This directory contains outputs for brain sample J13

J15/ - This directory contains outputs for brain sample J15

J2/ - This directory contains outputs for brain sample J2

J3/ - This directory contains outputs for brain sample J3

J4/ - This directory contains outputs for brain sample J4

K1/ - This directory contains outputs for kidney sample K1

K2/ - This directory contains outputs for kidney sample K2

K3/ - This directory contains outputs for kidney sample K3

K4/ - This directory contains outputs for kidney sample K4

K5/ - This directory contains outputs for kidney sample K5

K6/ - This directory contains outputs for kidney sample K6

./star/genome: - This directory contains outputs from running STAR genomeGenerate. Detailed file descriptions available from https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf

chrLength.txt

chrNameLength.txt

chrName.txt

chrStart.txt

exonGeTrInfo.tab

exonInfo.tab

geneInfo.tab

Genome

genomeParameters.txt

Log.out

SA

SAindex

sjdbInfo.txt

sjdbList.fromGTF.out.tab

sjdbList.out.tab

transcriptInfo.tab

./star/J1: - This is the head STAR directory for sample J1. It contains logs, basic QC, and gene and splice junction counts. For more information about the STAR pipeline and its outputs, please refer to the STAR documentation https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf

Log.final.out

Log.out

Log.progress.out

SJ.out.tab

Solo.out/

STARgenome/

./star/J1/Solo.out: - This directory contains the outputs used for downstream analysis

Barcodes.stats

GeneFull_Ex50pAS/

SJ/

./star/J1/Solo.out/GeneFull_Ex50pAS: - This directory contains the filtered and raw barcodes, features, and matrix files for gene expression (including introns)

Features.stats

filtered/

raw/

Summary.csv

UMIperCellSorted.txt

./star/J1/Solo.out/GeneFull_Ex50pAS/filtered: - This directory contains the filtered tsv and mtx gene expression files required for creating a Seurat object (or other single cell packages)

barcodes.tsv.gz - This file contains filtered cell barcodes

features.tsv.gz - This file contains filtered features (genes)

matrix.mtx.gz - This file contains the filtered cell by gene expression count matrix
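The three filtered files listed above form a sparse count matrix in Matrix Market coordinate format, and reading them can be sketched as below. This is a minimal stdlib-only Python illustration, not part of the authors' pipeline: it assumes integer counts, a three-column features file with the gene name in the second column, and it writes a tiny made-up example to a temporary directory so the sketch runs on its own. The helper names `open_text`, `read_lines`, and `read_mtx` are hypothetical.

```python
import gzip
import os
import tempfile


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def read_lines(path):
    """Return the lines of a text file without trailing newlines."""
    with open_text(path) as handle:
        return [line.rstrip("\n") for line in handle]


def read_mtx(path):
    """Parse a Matrix Market coordinate file into (shape, {(row, col): count})."""
    shape = None
    entries = {}
    with open_text(path) as handle:
        for line in handle:
            if line.startswith("%") or not line.strip():
                continue  # skip the banner, comments, and blank lines
            fields = line.split()
            if shape is None:
                # First data line: number of rows, columns, and non-zero entries.
                shape = (int(fields[0]), int(fields[1]))
                continue
            row, col, value = int(fields[0]), int(fields[1]), int(fields[2])
            entries[(row - 1, col - 1)] = value  # file indices are 1-based
    return shape, entries


# Write a tiny made-up example (3 genes x 2 cells) so the sketch is runnable.
demo_dir = tempfile.mkdtemp()
with gzip.open(os.path.join(demo_dir, "barcodes.tsv.gz"), "wt") as f:
    f.write("AAAC\nTTTG\n")
with gzip.open(os.path.join(demo_dir, "features.tsv.gz"), "wt") as f:
    f.write("id1\tGeneA\tGene Expression\n"
            "id2\tGeneB\tGene Expression\n"
            "id3\tGeneC\tGene Expression\n")
with gzip.open(os.path.join(demo_dir, "matrix.mtx.gz"), "wt") as f:
    f.write("%%MatrixMarket matrix coordinate integer general\n"
            "%\n3 2 3\n1 1 5\n3 1 2\n2 2 7\n")

barcodes = read_lines(os.path.join(demo_dir, "barcodes.tsv.gz"))
gene_names = [line.split("\t")[1]
              for line in read_lines(os.path.join(demo_dir, "features.tsv.gz"))]
shape, counts = read_mtx(os.path.join(demo_dir, "matrix.mtx.gz"))

# Rows are features and columns are barcodes, so sum over rows for per-cell totals.
per_cell_total = {barcode: 0 for barcode in barcodes}
for (row, col), value in counts.items():
    per_cell_total[barcodes[col]] += value

print(shape, per_cell_total)
```

In practice a dedicated reader from a single-cell package would normally be used; the point here is only how the barcode, feature, and matrix files fit together.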

./star/J1/Solo.out/GeneFull_Ex50pAS/raw: - This directory contains the unfiltered tsv and mtx gene expression files required for creating a Seurat object (or other single cell packages). Files are the same as previously described for filtered.

barcodes.tsv

features.tsv

matrix.mtx

./star/J1/Solo.out/SJ: - This directory contains the QC and raw barcodes, features, and matrix files for splice junction expression

Features.stats

raw/

Summary.csv

./star/J1/Solo.out/SJ/raw: - This directory contains the raw barcodes, features, and matrix files for splice junction expression

barcodes.tsv - This file contains filtered cell barcodes

features.tsv - This file contains filtered features (splice junctions)

matrix.mtx - This file contains the filtered cell by gene expression count matrix

./star/J1/_STARgenome: - This directory contains the STARgenome created and used by STAR for this sample. Detailed file descriptions available from https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf

exonGeTrInfo.tab

exonInfo.tab

geneInfo.tab

sjdbInfo.txt

sjdbList.fromGTF.out.tab

sjdbList.out.tab

transcriptInfo.tab
Azimuth Reference - Human Fetus
zenodo.org
data.niaid.nih.gov
bin
Updated May 5, 2021
Cite
Satija Lab (2021). Azimuth Reference - Human Fetus [Dataset]. http://doi.org/10.5281/zenodo.4738021
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.4738021
Dataset updated
May 5, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Satija Lab
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Here we provide the reference data files used to run the Azimuth Human Fetus reference web application. For a full description of the reference data structure, please see the wiki at https://github.com/satijalab/azimuth/wiki/Azimuth-Reference-Format.
Data for Altered Glia-Neuron Communication in Alzheimer's Disease Affects...
data.niaid.nih.gov
zenodo.org
Updated Nov 28, 2023
Cite
Soelter, Tabea (2023). Data for Altered Glia-Neuron Communication in Alzheimer's Disease Affects WNT, p53, and NFkB Signaling Determined by snRNA-seq [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10214496
Explore at:
Dataset updated
Nov 28, 2023
Dataset provided by
Soelter, Tabea
Lasseigne, Brittany
Oza, Vishal H.
Clark, Amanda D.
Howton, Timothy C.
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
data.tar.gz contains all files from the data directory associated with the 230313_TS_CCCinHumanAD GitHub project and includes the following:

CellRangerCounts/

GSE157827/

post_soupX/ : contains 21 directories for 21 samples, which each contain 3 files obtained from ambient RNA removal with soupX. Below is a representative example, but this repo contains 1 directory per sample:

SAMN16100290_S01_AD/

barcodes.tsv

genes.tsv

matrix.mtx

pre_soupX/ : contains 21 directories for 21 samples, which each contain 2 files obtained from Cell Ranger after aligning fastq files to the reference genome. Below is a representative example, but this repo contains 1 directory per sample:

SAMN16100290_S01_AD/

filtered_feature_bc_matrix.h5

Raw_feature_bc_matrix.h5

GSE174367/ : contains 19 directories for 19 samples, which contain 3 files each from Cell Ranger alignment of fastq files to the reference genome. Below is a representative example, but this repo contains 1 directory per sample:

SAMN19128610_S1_CTRL/

barcodes.tsv

genes.tsv

Matrix.mtx

ccc/

nichenet_grn/

gr_network_human_21122021.rds : accessed in October 2023, gene regulation network – gene regulatory information from MultiNicheNet

ligand_tf_matrix_nsga2r_final.rds : accessed in October 2023, ligand tf matrix for signaling path determination from MultiNicheNet

signaling_network_human_21122021.rds : accessed in October 2023, signaling network – protein-protein interaction information from MultiNicheNet

weighted_networks_nsga2r_final.rds : accessed in October 2023, networks weighted by literature evidence from MultiNicheNet

nichenet_prior/

ligand_target_matrix.rds : accessed in April 2023, ligand to target matrix from NicheNet

lr_network.rds : accessed in April 2023, ligand-receptor matrix from NicheNet

nichenet_v2_prior/

ligand_target_matrix_nsga2r_final.rds : accessed in June 2023, ligand to target matrix from MultiNicheNet used to predict target genes.

lr_network_human_21122021.rds : accessed in June 2023, ligand-receptor matrix from MultiNicheNet used to predict ligand-receptor pairs.

geo_multinichenet_output.rds : MultiNicheNet output for Morabito et al., 2021 data

geo_signaling_igraph_objects.rds : list of igraph objects for 17 overlapping LRTs and their signaling mediators in the Morabito et al., 2021 dataset.

gse_multinichenet_output.rds : MultiNicheNet output for Lau et al., 2020 data

gse_signaling_igraph_objects.rds : list of igraph objects for 17 overlapping LRTs and their signaling mediators in the Lau et al., 2020 dataset

seurat_preprocessing/

geo_filtered_seurat.rds : merged and filtered seurat object of Morabito et al., 2021 data

geo_integrated_seurat.rds : seurat object integrated using harmony of Morabito et al., 2021 data

geo_clustered_seurat.rds : clustered seurat object of Morabito et al., 2021 data

geo_processed_seurat.rds : processed seurat object with final cell type assignments at specified resolution of Morabito et al., 2021 data

gse_filtered_seurat.rds : merged and filtered seurat object of Lau et al., 2020 data

gse_integrated_seurat.rds : seurat object integrated using harmony of Lau et al., 2020 data

gse_clustered_seurat.rds : clustered seurat object of Lau et al., 2020 data

gse_processed_seurat.rds : processed seurat object with final cell type assignments at specified resolution of Lau et al., 2020 data