13 datasets found

Data_Sheet_1_CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq.PDF...
frontiersin.figshare.com
pdf
Updated Jun 1, 2023
Cite
Wenbo Yu; Ahmed Mahfouz; Marcel J. T. Reinders (2023). Data_Sheet_1_CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq.PDF [Dataset]. http://doi.org/10.3389/fgene.2021.644211.s001
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/fgene.2021.644211.s001
Dataset updated
Jun 1, 2023
Dataset provided by
Frontiers
Authors
Wenbo Yu; Ahmed Mahfouz; Marcel J. T. Reinders
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The power of single-cell RNA sequencing (scRNA-seq) in detecting cell heterogeneity or developmental process is becoming more and more evident every day. The granularity of this knowledge is further propelled when combining two batches of scRNA-seq into a single large dataset. This strategy is however hampered by technical differences between these batches. Typically, these batch effects are resolved by matching similar cells across the different batches. Current approaches, however, do not take into account that we can constrain this matching further as cells can also be matched on their cell type identity. We use an auto-encoder to embed two batches in the same space such that cells are matched. To accomplish this, we use a loss function that preserves: (1) cell-cell distances within each of the two batches, as well as (2) cell-cell distances between two batches when the cells are of the same cell-type. The cell-type guidance is unsupervised, i.e., a cell-type is defined as a cluster in the original batch. We evaluated the performance of our cluster-guided batch alignment (CBA) using pancreas and mouse cell atlas datasets, against six state-of-the-art single cell alignment methods: Seurat v3, BBKNN, Scanorama, Harmony, LIGER, and BERMUDA. Compared to other approaches, CBA preserves the cluster separation in the original datasets while still being able to align the two datasets. We confirm that this separation is biologically meaningful by identifying relevant differential expression of genes for these preserved clusters.
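The two distance-preservation terms described above can be sketched in NumPy. This is only an illustration, not the authors' implementation: it assumes Euclidean distances, a squared-error penalty, and cluster labels that have already been matched across the two batches. The function names `pairwise_dist` and `distance_preservation_loss` are hypothetical.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between rows of a (n x d) and rows of b (m x d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def distance_preservation_loss(x1, x2, z1, z2, labels1, labels2):
    """Penalize changes in cell-cell distances between input and embedding.

    x1, x2: original expression matrices of the two batches (cells x genes)
    z1, z2: embeddings of the same cells (cells x latent dims)
    labels1, labels2: cluster ids, assumed comparable across batches
    """
    # Term (1): preserve distances within each batch.
    within = 0.0
    for x, z in ((x1, z1), (x2, z2)):
        within += np.mean((pairwise_dist(x, x) - pairwise_dist(z, z)) ** 2)

    # Term (2): preserve distances between batches, but only for
    # pairs of cells that carry the same cluster label.
    same = labels1[:, None] == labels2[None, :]
    between = 0.0
    if same.any():
        d_x = pairwise_dist(x1, x2)
        d_z = pairwise_dist(z1, z2)
        between = np.mean((d_x[same] - d_z[same]) ** 2)
    return within + between
```

An embedding that reproduces the input distances exactly gives a loss of zero; in an auto-encoder this term would be minimized together with whatever reconstruction objective is used.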
Data from: Large-scale integration of single-cell transcriptomic data...
data.niaid.nih.gov
dataone.org
zip
Updated Dec 14, 2021
Cite
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2021). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.t4b8gtj34
Dataset updated
Dec 14, 2021
Dataset provided by
Cornell University
Authors
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove
License
CC0 1.0 https://spdx.org/licenses/CC0-1.0.html
Description
Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. 
We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

Methods

Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (Mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute on Aging (NIA) Rodent Aging Colony and were used at 20 months of age. Female mice were used in all new scRNAseq experiments.

Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).

Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 10⁶ cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced on a NextSeq 500 (Illumina; San Diego, CA) using 75 cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.

Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL/6J mice were injected with 10 µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using a Zeiss PALM MicroBeam laser capture microdissection system, and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150 cycle high output kits (Read 1 = 28 bp, Read 2 = 120 bp, Index 1 = 10 bp, and Index 2 = 10 bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loop Browser v4.0.0 software (10x Genomics). Sequencing data was then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline to generate a feature-by-spot-barcode expression matrix (10x Genomics).

Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).

Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCont and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset individually. BCmvn optimization was used for pK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default pN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch correction with three tools independently: Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.
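The quality filter stated above (at least 750 features, at least 1000 transcripts, at most 30% mitochondrial transcripts) can be written as a boolean mask over per-cell metrics. The sketch below is a NumPy illustration of those thresholds only; the original analysis was done with Seurat in R, and the function name `qc_keep_mask` is hypothetical.

```python
import numpy as np

def qc_keep_mask(n_features, n_transcripts, pct_mito,
                 min_features=750, min_transcripts=1000, max_pct_mito=30.0):
    """Return True for cells that pass all three quality thresholds.

    A cell is removed if it has fewer than `min_features` detected genes,
    fewer than `min_transcripts` transcripts, or more than `max_pct_mito`
    percent of transcripts from mitochondrial genes.
    """
    n_features = np.asarray(n_features)
    n_transcripts = np.asarray(n_transcripts)
    pct_mito = np.asarray(pct_mito, dtype=float)
    return (
        (n_features >= min_features)
        & (n_transcripts >= min_transcripts)
        & (pct_mito <= max_pct_mito)
    )
```

The mask can then be used to subset a count matrix by column, for example `counts[:, qc_keep_mask(...)]` for a genes-by-cells matrix.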

Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.
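Selecting the "dimensions accounting for 95% of the total variance" amounts to a cumulative sum over the per-component variances. The following sketch assumes the input is the vector of standard deviations of the computed components (as a PCA reduction typically stores them) and that "total variance" means the sum over those computed components; `n_dims_for_variance` is a hypothetical helper, not a Seurat function.

```python
import numpy as np

def n_dims_for_variance(stdev, target=0.95):
    """Smallest number of leading components whose variance reaches `target`.

    stdev: per-component standard deviations, ordered from largest to smallest.
    target: fraction of the summed variance to account for (0 < target <= 1).
    """
    variance = np.asarray(stdev, dtype=float) ** 2
    cumulative = np.cumsum(variance) / variance.sum()
    # First index where the cumulative fraction reaches the target.
    idx = int(np.searchsorted(cumulative, target))
    # Guard against floating-point shortfall when target is 1.0.
    return min(idx + 1, len(variance))
```

The returned count would then be used as the upper bound of the dimension range passed to the neighbor-graph step.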

Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).

Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myocytes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using
78 shared genes in DEGs related to age and AD.
plos.figshare.com
xlsx
Updated Nov 26, 2024
Cite
Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao (2024). 78 shared genes in DEGs related to age and AD. [Dataset]. http://doi.org/10.1371/journal.pone.0311374.s003
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0311374.s003
Dataset updated
Nov 26, 2024
Dataset provided by
PLOS ONE
Authors
Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Objective: To guide animal experiments, we investigated the similarities and differences between humans and mice in aging and Alzheimer’s disease (AD) at the single-nucleus RNA sequencing (snRNA-seq) or single-cell RNA sequencing (scRNA-seq) level.

Methods: Microglia cells were extracted from dataset GSE198323 of human post-mortem hippocampus. The distributions and proportions of microglia subpopulation cell numbers related to AD or age were compared between GSE198323 for humans and GSE127892 for mice. The Seurat R package and harmony R package were used for data analysis and batch effect correction. Differentially expressed genes (DEGs) were identified by the FindMarkers function with the MAST test. Shared genes in DEGs associated with age and AD were compared between human and mouse using various bioinformatics techniques, and DEGs related to age and DEGs related to AD were each analyzed. Cross-species analyses were conducted using orthologous genes. Comparative analyses of pseudotime between humans and mice were performed using Monocle2.

Results: (1) Similarities: The proportion of microglial subpopulation Cell_APOE/Apoe shows consistent trends, whether in AD or normal control (NC) groups, in both humans and mice. The proportion of Cell_CX3CR1/Cx3cr1, representing homeostatic microglia, remains stable with age in NC groups across species. Tuberculosis and Fc gamma R-mediated phagocytosis pathways are shared in microglia responses to age and AD across species, respectively. (2) Differences: IL1RAPL1 and SPP1 as marker genes are more identifiable in human microglia compared to their mouse counterparts. Most genes of DEGs associated with age or AD exhibit different trends between humans and mice. Pseudotime analyses demonstrate varying cell density trends in microglial subpopulations, depending on age or AD across species.

Conclusions: Mouse Apoe and Cell_Apoe may serve as proxies for studying human AD, while Cx3cr1 and Cell_Cx3cr1 are suitable for human aging studies. However, AD mouse models (App_NL_G_F) have limitations in studying human genes like IL1RAPL1 and SPP1 related to AD. Thus, mouse models cannot fully replace human samples for AD and aging research.
Single-cell CITE-seq of murine bone marrow across aging (Young, Mid, Old)
zenodo.org
bin, csv
Updated Jun 3, 2025
Cite
Zenodo (2025). Single-cell CITE-seq of murine bone marrow across aging (Young, Mid, Old) [Dataset]. http://doi.org/10.5281/zenodo.15587768
Available download formats: bin, csv
Unique identifier
https://doi.org/10.5281/zenodo.15587768
Dataset updated
Jun 3, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Single-cell CITE-seq dataset of FACS-sorted HSPC and Mature bone marrow mononuclear cells from mice across three age groups:

- Young: 10 weeks (n=2917 cells)
- Mid: 63 weeks (n=2057 cells)
- Old: 103 weeks (n=3116 cells)

Total cells: 8,090
Total genes: 16,285
Analysis Products: Transcription factor stoichiometry, motif affinity and...
zenodo.org
tsv, zip
Updated Nov 11, 2023
Cite
Surag Nair; Mohamed Ameen; Kevin Wang; Anshul Kundaje (2023). Analysis Products: Transcription factor stoichiometry, motif affinity and syntax regulate single cell chromatin dynamics during fibroblast reprogramming to pluripotency [Dataset]. http://doi.org/10.5281/zenodo.8313962
Available download formats: zip, tsv
Unique identifier
https://doi.org/10.5281/zenodo.8313962
Dataset updated
Nov 11, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Surag Nair; Mohamed Ameen; Kevin Wang; Anshul Kundaje
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This record contains analysis products for the paper "Transcription factor stoichiometry, motif affinity and syntax regulate single cell chromatin dynamics during fibroblast reprogramming to pluripotency" by Nair, Ameen et al. Please refer to the READMEs in the directories, which are summarized below.

The record contains the following files:

`clusters.tsv`: contains the cluster id, name and colour of clusters in the paper

scATAC.zip

Analysis products for the single-cell ATAC-seq data. Contains:

- `cells.tsv`: list of barcodes that pass QC. Columns include:
  - `barcode`
  - `sample`: time point
  - `umap1`
  - `umap2`
  - `cluster`
  - `dpt_pseudotime_fibr_root`: pseudotime values treating a fibroblast cell as root
  - `dpt_pseudotime_xOSK_root`: pseudotime values treating xOSK cell as root
- `peaks.bed`: list of peaks of 500bp across all cell states. 4th column contains the peak set label. Note that ~5000 peaks are not assigned to any peak set and are marked as NA.
- `features.tsv`: 50 dimensional representation of each cell
- `cell_x_peak.mtx.gz`: sparse matrix of fragment counts within peaks. Load using scipy.io.mmread in python or readMM in R. Columns correspond to cells from `cells.tsv` (combine sample + barcode). Rows correspond to peaks in `peaks.bed`
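Loading the count matrix together with its cell annotation could look like the sketch below. It uses `scipy.io.mmread` as the record suggests; everything else is an assumption made for illustration: that `cells.tsv` is tab-separated with a header containing `sample` and `barcode`, that its row order matches the matrix columns, and that sample and barcode are joined with an underscore. The helper name `load_cell_x_peak` is hypothetical.

```python
import pandas as pd
from scipy.io import mmread

def load_cell_x_peak(mtx_path, cells_path, sep="_"):
    """Load a peaks-by-cells Matrix Market file and build matching cell ids.

    mtx_path: path to the (optionally gzipped) Matrix Market file
    cells_path: path to the tab-separated cell table
    sep: separator used when combining sample and barcode (assumed)
    """
    # mmread opens .gz files directly when given a file name.
    counts = mmread(mtx_path).tocsc()
    cells = pd.read_csv(cells_path, sep="\t")
    if counts.shape[1] != len(cells):
        raise ValueError("matrix columns and cells.tsv rows do not match")
    cell_ids = cells["sample"].astype(str) + sep + cells["barcode"].astype(str)
    return counts, cell_ids.tolist()
```

Converting to CSC keeps per-cell (column) slicing cheap; the README in the archive should be checked for the exact identifier format before relying on the combined ids.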

scATAC_clusters.zip

Analysis products corresponding to cluster pseudo-bulks of the single-cell ATAC-seq data.

- `clusters.tsv`: contains the cluster id, name and colour used in the paper
- `peaks`: contains `overlap_reproducibilty/overlap.optimal_peak` peaks called using ENCODE bulk ATAC-seq pipeline in the narrowPeak format.
- `fragments`: contains per cluster fragment files

scATAC_scRNA_integration.zip

Analysis products from the integration of scATAC with scRNA. Contains:

- `peak_gene_links_fdr1e-4.tsv`: file with peak gene links passing FDR 1e-4. For analyses in the paper, we filter to peaks with absolute correlation >0.45.
- `harmony.cca.30.feat.tsv`: 30 dimensional co-embedding for scATAC and scRNA cells obtained by CCA followed by applying Harmony over assay type.
- `harmony.cca.metadata.tsv`: UMAP coordinates for scATAC and scRNA cells derived from the Harmony CCA embedding. First column contains barcode.
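The additional filter mentioned for the peak-gene links (absolute correlation above 0.45) is a one-line pandas operation. The column name `correlation` below is a placeholder, since the record does not list the columns of `peak_gene_links_fdr1e-4.tsv`; `filter_links` is likewise a hypothetical helper.

```python
import pandas as pd

def filter_links(links, corr_col="correlation", min_abs_corr=0.45):
    """Keep peak-gene links whose absolute correlation exceeds the cutoff.

    links: DataFrame of peak-gene links (already filtered by FDR)
    corr_col: name of the correlation column (assumed, check the file header)
    """
    keep = links[corr_col].abs() > min_abs_corr
    return links[keep].reset_index(drop=True)
```

Taking the absolute value retains both positively and negatively correlated links.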

scRNA.zip

Analysis products for the single-cell RNA-seq data. Contains:

- `seurat.rds`: Seurat object that contains expression data (raw counts, normalized, and scaled), reductions (umap, pca), knn graphs, and all associated metadata. Note that the barcode suffix (1-9) corresponds to samples D0, D2, ..., D14, iPSC.
- `genes.txt`: list of all genes
- `cells.tsv`: list of barcodes that pass QC across samples. Contains:
  - `barcode_sample`: barcode with index of sample (1-9 corresponding to D0, D2, ..., D14, iPSC)
  - `sample`: sample name (D0, D2, ..., D14, iPSC)
  - `umap1`
  - `umap2`
  - `nCount_RNA`
  - `nFeature_RNA`
  - `cluster`
  - `percent.mt`: percent of mitochondrial transcripts in cell
  - `percent.oskm`: percent of OSKM transcripts in cell
- `gene_x_cell.mtx.gz`: sparse matrix of gene counts. Load using scipy.io.mmread in python or readMM in R. Columns correspond to cells from `cells.tsv` (barcode suffix contains sample information). Rows correspond to genes in `genes.txt`
- `pca.tsv`: first 50 PCs of each cell
- `oskm_endo_sendai.tsv`: estimated raw counts (cts, may not be integers) and log(1 + tp10k) normalized expression (norm) for endogenous and exogenous (Sendai derived) counts of POU5F1 (OCT4), SOX2, KLF4 and MYC genes. Rows are consistent with `seurat.rds` and `cells.tsv`
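The log(1 + tp10k) normalization named for `oskm_endo_sendai.tsv` can be illustrated as follows. The sketch assumes that tp10k means counts scaled to 10,000 transcripts per cell and that the logarithm is natural; neither detail is stated in the record, and `log1p_tp10k` is a hypothetical name.

```python
import numpy as np

def log1p_tp10k(gene_counts, total_counts):
    """Normalize counts to transcripts per 10,000 and apply log(1 + x).

    gene_counts: counts of one gene per cell (may be non-integer estimates)
    total_counts: total transcript counts of the same cells
    """
    gene_counts = np.asarray(gene_counts, dtype=float)
    total_counts = np.asarray(total_counts, dtype=float)
    tp10k = gene_counts / total_counts * 1e4
    return np.log1p(tp10k)
```

A cell with zero counts for the gene maps to exactly zero, which keeps the normalized values sparse.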

multiome.zip

multiome/snATAC:

These files are derived from the integration of nuclei from multiome (D1M and D2M), with cells from day 2 of scATAC-seq (labeled D2).

- `cells.tsv`: list of nuclei barcodes that pass QC from multiome and cell barcodes from D2 of scATAC-seq. Includes:
  - `barcode`
  - `umap1`: coordinates used for the figures involving multiome in the paper
  - `umap2`: see `umap1`
  - `sample`: D1M and D2M correspond to multiome, D2 corresponds to day 2 of scATAC-seq
  - `cluster`: for multiome barcodes, these are labels transferred from scATAC-seq; for D2 scATAC-seq, these are the original cluster labels
- `peaks.bed`: same file as scATAC/peaks.bed. List of peaks of 500bp. 4th column contains the peak set label. Note that ~5000 peaks are not assigned to any peak set and are marked as NA.
- `cell_x_peak.mtx.gz`: sparse matrix of fragment counts within peaks. Load using scipy.io.mmread in python or readMM in R. Columns correspond to cells from `cells.tsv` (combine sample + barcode). Rows correspond to peaks in `peaks.bed`.
- `features.no.harmony.50d.tsv`: 50 dimensional representation of each cell prior to running Harmony (to correct for batch effect between D2 scATAC and D1M, D2M snMultiome). Rows correspond to cells from `cells.tsv`.
- `features.harmony.10d.tsv`: 10 dimensional representation of each cell after running Harmony. Rows correspond to cells from `cells.tsv`.

multiome/snRNA:

- `seurat.rds`: Seurat object that contains expression data (raw counts, normalized, and scaled), reductions (umap, pca), and associated metadata. Note that the barcode suffix (1, 2) corresponds to samples D1M, D2M. Please use the UMAP/features from snATAC/ for consistency.
- `genes.txt`: list of all genes (this is different from the list in scRNA analysis)
- `cells.tsv`: list of barcodes that pass QC across samples. Contains:
  - `barcode_sample`: barcode with index of sample (1, 2 corresponding to D1M, D2M respectively)
  - `sample`: sample name (D1M, D2M)
  - `nCount_RNA`
  - `nFeature_RNA`
  - `percent.oskm`: percent of OSKM genes in cell
- `gene_x_cell.mtx.gz`: sparse matrix of gene counts. Load using scipy.io.mmread in python or readMM in R. Columns correspond to cells from `cells.tsv` (barcode suffix contains sample information). Rows correspond to genes in `genes.txt`
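Because the sample is encoded in the barcode suffix, the sample name can be recovered from `barcode_sample` with a small lookup. The mapping 1 to D1M and 2 to D2M comes from the file description; the barcode layout (sequence, a hyphen, then the index) is an assumption, and `sample_from_barcode` is a hypothetical helper.

```python
# Suffix-to-sample mapping as described for the multiome snRNA files.
SUFFIX_TO_SAMPLE = {"1": "D1M", "2": "D2M"}

def sample_from_barcode(barcode_sample, sep="-"):
    """Return the sample name encoded in the suffix of a barcode.

    barcode_sample: barcode ending in a sample index, e.g. 'ACGT-1' (assumed layout)
    sep: character separating the barcode from the index (assumed)
    """
    suffix = barcode_sample.rsplit(sep, 1)[-1]
    if suffix not in SUFFIX_TO_SAMPLE:
        raise KeyError(f"unknown sample suffix: {suffix!r}")
    return SUFFIX_TO_SAMPLE[suffix]
```

Comparing the result against the `sample` column of `cells.tsv` is a quick consistency check after loading.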
Data from: Single-cell RNA sequencing reveals the expansion of circulating...
data.mendeley.com
Updated Mar 1, 2024
Cite
Jantarika Kumar Arora (2024). Single-cell RNA sequencing reveals the expansion of circulating tissue-homing B cell subsets in secondary acute dengue viral infection [Dataset]. http://doi.org/10.17632/xmnp8c5c65.1
Unique identifier
https://doi.org/10.17632/xmnp8c5c65.1
Dataset updated
Mar 1, 2024
Authors
Jantarika Kumar Arora
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We subsetted B cells from the integrated Seurat object deposited in Mendeley at https://data.mendeley.com/datasets/6ry3x7r8hf/3. We re-clustered B cells, plasma cells, and plasmablasts using the harmony package with default parameters. The B cell subsets were then annotated using established canonical marker genes described in Table S1.
Data from: Pre-ciliated tubal epithelial cells are prone to initiation of...
data.niaid.nih.gov
datadryad.org
zip
Updated Oct 17, 2024
Cite
Coulter Ralston; Alexander Nikitin; Benjamin Cosgrove (2024). Pre-ciliated tubal epithelial cells are prone to initiation of high-grade serous ovarian carcinoma [Dataset]. http://doi.org/10.5061/dryad.4mw6m90hm
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.4mw6m90hm
Dataset updated
Oct 17, 2024
Dataset provided by
Cornell University
Authors
Coulter Ralston; Alexander Nikitin; Benjamin Cosgrove
License
CC0 1.0 https://spdx.org/licenses/CC0-1.0.html
Description
The distal region of the uterine (Fallopian) tube is commonly associated with high-grade serous carcinoma (HGSC), the predominant and most aggressive form of ovarian or extra-uterine cancer. Specific cell states and lineage dynamics of the adult tubal epithelium (TE) remain insufficiently understood, hindering efforts to determine the cell of origin for HGSC. Here, we report a comprehensive census of cell types and states of the mouse uterine tube. We show that distal TE cells expressing the stem/progenitor cell marker Slc1a3 can differentiate into both secretory (Ovgp1+) and ciliated (Fam183b+) cells. Inactivation of Trp53 and Rb1, whose pathways are commonly altered in HGSC, leads to elimination of targeted Slc1a3+ cells by apoptosis, thereby preventing their malignant transformation. In contrast, pre-ciliated cells (Krt5+, Prom1+, Trp73+) remain cancer-prone and give rise to serous tubal intraepithelial carcinomas and overt HGSC. These findings identify transitional pre-ciliated cells as a previously unrecognized cancer-prone cell state and point to pre-ciliation mechanisms as novel diagnostic and therapeutic targets.

Methods

Single-cell RNA-sequencing library preparation. For TE single cell expression and transcriptome analysis we isolated TE from C57BL6 adult estrous female mice. In 3 independent experiments a total of 62 uterine tubes were collected. Each uterine tube was placed in sterile PBS containing 100 IU ml⁻¹ of penicillin and 100 µg ml⁻¹ streptomycin (Corning, 30-002-Cl), and separated into distal and proximal regions. Tissues from the same region were combined in a 40 µl drop of the same PBS solution, cut open lengthwise, and minced into 1.5-2.5 mm pieces with 25G needles. Minced tissues were transferred with the help of a sterile wide bore 200 µl pipette tip into a 1.8 ml cryo vial containing 1.2 ml A-mTE-D1 (300 IU ml⁻¹ collagenase IV mixed with 100 IU ml⁻¹ hyaluronidase; Stem Cell Technologies, 07912, in DMEM Ham’s F12, Hyclone, SH30023.FS). Tissues were incubated with a loose cap for 1 h at 37°C in a 5% CO2 incubator. During the incubation, tubes were taken out 4 times and tissues suspended with a wide bore 200 µl pipette tip. At the end of incubation, the tissue-cell suspension from each tube was transferred into 1 ml TrypLE (Invitrogen, 12604013) pre-warmed to 37°C and suspended 70 times with a 1000 µl pipette tip; 5 ml A-SM [DMEM Ham’s F12 containing 2% fetal bovine serum (FBS)] were added to the mix, and TE cells were pelleted by centrifugation at 300 × g for 10 minutes at 25°C. Pellets were then suspended in 1 ml A-mTE-D2 pre-warmed to 37°C (7 mg ml⁻¹ Dispase II, Worthington NPRO2, and 10 µg ml⁻¹ Deoxyribonuclease I, Stem Cell Technologies, 07900), and mixed 70 times with a 1000 µl pipette tip. 5 ml A-mTE-D2 was added, and samples were passed through a 40 µm cell strainer and pelleted by centrifugation at 300 × g for 7 minutes at 4°C. Pellets were suspended in 100 µl microbeads per 10⁷ total cells or fewer, and dead cells were removed with the Dead Cell Removal Kit (Miltenyi Biotec, 130-090-101) according to the manufacturer’s protocol. Pelleted live cell fractions were collected in 1.5 ml low binding centrifuge tubes, kept on ice, and suspended in 50 µl ice cold A-Ri-Buffer (5% FBS, 1% GlutaMAX-I, Invitrogen, 35050-079, 9 µM Y-27632, Millipore, 688000, and 100 IU ml⁻¹ penicillin, 100 μg ml⁻¹ streptomycin in DMEM Ham’s F12). Cell aliquots were stained with trypan blue for live and dead cell calculation. Live cell preparations with a target cell recovery of 5,000-6,000 were loaded on a Chromium controller (10X Genomics, Single Cell 3’ v2 chemistry) to perform single cell partitioning and barcoding using the microfluidic platform device. After preparation of barcoded next-generation sequencing cDNA libraries, samples were sequenced on an Illumina NextSeq500 System.

Download and alignment of single-cell RNA sequencing data. For sequence alignment, a custom reference for mm39 was built using the cellranger (v6.1.2, 10x Genomics) mkref function. The mm39.fa soft-masked assembly sequence and the mm39.ncbiRefSeq.gtf (release 109) genome annotation last updated 2020-10-27 were used to form the custom reference. The raw sequencing reads were aligned to the custom reference and quantified using the cellranger count function.

Preprocessing and batch correction. All preprocessing and data analysis was conducted in R (v.4.1.1 (2021-08-10)). The cellranger count outputs were first modified with the autoEstCont and adjustCounts functions from SoupX (v.1.6.1) to output a corrected matrix with the ambient RNA signal (soup) removed (https://github.com/constantAmateur/SoupX). To preprocess the corrected matrices, the Seurat (v.4.1.1) NormalizeData, FindVariableFeatures, ScaleData, RunPCA, FindNeighbors, and RunUMAP functions were used to create a Seurat object for each sample (https://github.com/satijalab/seurat). The number of principal components used to construct a shared nearest-neighbor graph was chosen to account for 95% of the total variance. To detect possible doublets, we used the package DoubletFinder (v.2.0.3) with inputs specific to each Seurat object. DoubletFinder creates artificial doublets and calculates the proportion of artificial k nearest neighbors (pANN) for each cell from a merged dataset of the artificial and actual data. To maximize DoubletFinder’s predictive power, the mean-variance normalized bimodality coefficient (BCMVN) was used to determine the optimal pK value for each dataset. To establish a threshold for pANN values to distinguish between singlets and doublets, the estimated multiplet rates for each sample were calculated by interpolating between the target cell recovery values according to the 10x Chromium user manual. Homotypic doublets were identified using unannotated Seurat clusters in each dataset with the modelHomotypic function. After doublets were identified, all distal and proximal samples were merged separately. Cells with greater than 30% mitochondrial genes, cells with fewer than 750 nCount RNA, and cells with fewer than 200 nFeature RNA were removed from the merged datasets. To correct for any batch effects between sample runs, we used the harmony (v.0.1.0) integration method (github.com/immunogenomics/harmony).
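The interpolation step for the expected multiplet rate can be sketched with `numpy.interp`. The reference pairs below are placeholder values chosen only to make the example runnable; they are not copied from the 10x Chromium user manual and should be replaced with the table for the chemistry in use. The function names are hypothetical.

```python
import numpy as np

# Placeholder (cells recovered, multiplet rate) pairs, NOT the manual's table.
REFERENCE_CELLS = np.array([1000.0, 5000.0, 10000.0])
REFERENCE_RATE = np.array([0.008, 0.040, 0.080])

def estimated_multiplet_rate(n_cells, ref_cells=REFERENCE_CELLS, ref_rate=REFERENCE_RATE):
    """Linearly interpolate the multiplet rate for a given number of cells."""
    return float(np.interp(n_cells, ref_cells, ref_rate))

def expected_doublets(n_cells, ref_cells=REFERENCE_CELLS, ref_rate=REFERENCE_RATE):
    """Expected number of doublets, rounded to the nearest whole cell."""
    rate = estimated_multiplet_rate(n_cells, ref_cells, ref_rate)
    return int(round(rate * n_cells))
```

Note that `numpy.interp` clamps to the first and last reference rates outside the tabulated range, so samples far outside the table need a separate decision. The expected doublet count would then typically be adjusted for the homotypic proportion before thresholding pANN.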

Clustering parameters and annotations. After merging the datasets and batch correction, the dimensions reflecting 95% of the total variance were input into Seurat’s FindNeighbors function with a k.param of 70. Louvain clustering was then conducted using Seurat’s FindClusters with a resolution of 0.7. The resulting 19 clusters were annotated based on the expression of canonical genes and the results of differential gene expression (Wilcoxon Rank Sum test) analysis. One cluster expressing lymphatic and epithelial markers was omitted from later analysis as it only contained 2 cells suspected to be doublets. To better understand the epithelial populations, we reclustered 6 epithelial populations and reapplied harmony batch correction. For this subset, FindNeighbors was run with a k.param of 50 and FindClusters with a resolution of 0.7. The resulting 9 clusters within the epithelial subset were further annotated using differential expression analysis and canonical markers.

Pseudotime analysis. Potential of heat diffusion for affinity-based transition embedding (PHATE) is a dimensionality reduction method designed to visualize continual progressions found in biological data more accurately [35]. A modified version of Seurat (v4.1.1) was developed to include the ‘RunPHATE’ function for converting a Seurat object to a PHATE embedding. This was built on the phateR package (v.1.0.7) (https://github.com/scottgigante/seurat/tree/patch/add-PHATE-again). In addition to PHATE, pseudotime values were calculated with Monocle3 (v.1.2.7), which computes trajectories from an origin set by the user [36, 55–57]. The origin was set to a progenitor cell state confirmed with lineage tracing experiments.

35. Moon, K. R. et al. Visualizing structure and transitions in high-dimensional biological data. Nat Biotechnol 37, 1482–1492 (2019). doi:10.1038/s41587-019-0336-3
36. Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019). doi:10.1038/s41586-019-0969-x
55. Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat Biotechnol 32, 381–386 (2014). doi:10.1038/nbt.2859
56. Qiu, X. et al. Single-cell mRNA quantification and differential analysis with Census. Nat Methods 14, 309–315 (2017). doi:10.1038/nmeth.4150
57. Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat Methods 14, 979–982 (2017). doi:10.1038/nmeth.4402
Data from: A spatial transcriptomics atlas of live donors reveals unique...
zenodo.org
bin
Updated Jul 27, 2025
Oran Yakubovsky; Shalev Itzkovitz (2025). A spatial transcriptomics atlas of live donors reveals unique zonation patterns in the healthy human liver [Dataset]. http://doi.org/10.5281/zenodo.16414453
Available download formats
bin
Unique identifier
https://doi.org/10.5281/zenodo.16414453
Dataset updated
Jul 27, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Oran Yakubovsky; Shalev Itzkovitz
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Sample Group Description

The dataset includes the following human & non-human samples:

Human samples:

-Live healthy donors (LHD; n = 8), labeled as M

-Patients with liver pathology (adjacent normal tissue sampled; n = 8), labeled as P

Non-human samples:

-Wild boar (n = 2), labeled as non_human_P

-Cow (n = 2), labeled as non_human_C

-Domesticated pig (n = 3), labeled as non_human_PD

Visium:

LHDs and adjacent normal samples:

Human loupe files:

This folder includes the Loupe Browser-compatible files (16 total) corresponding to the human samples (M1–M8, P2, P3, P6, P7, P14, P17, P18, P21) for downstream visualization and exploration.

Human_h5_files

This folder contains .h5 formatted output files from Space Ranger for the 16 human samples.

Human_Spatial_transcriptomics_data:

This folder contains spatial transcriptomics data from 16 human samples.
For each sample the following files are included:

counts_ALL.csv – full gene expression matrix

counts_UTT.csv – filtered matrix (UTT: under the tissue)

tissue_positions_list.csv – spatial barcode coordinates

scalefactors_json.json – image scaling information

tissue_hires_image.png – high-resolution histology image

Non_Human_Loupe_files:

This folder contains Loupe Browser-compatible files for the 7 non-human samples (C1, C2, P1, P2, PD1, PD2, PD3).

Non_human_h5_files

This folder contains .h5 formatted output files from Space Ranger for the 7 non-human samples.

Non_Human_Spatial_transcriptomics_data:

This folder includes spatial transcriptomics data from 7 non-human samples (C1, C2, P1, P2, PD1, PD2, PD3).

counts_ALL.csv – full gene expression matrix

counts_UTT.csv – filtered matrix (UTT: under the tissue)

tissue_positions_list.csv – spatial barcode coordinates

scalefactors_json.json – image scaling information

tissue_hires_image.png – high-resolution histology image

VisiumHD:

This folder contains spatial transcriptomics data from 10x Genomics Visium HD for human liver samples:

M1_VisiumHD.cloupe – Loupe Browser visualization file for patient M1, showing spatial transcriptomics data at 8 μm bin resolution.

M2_VisiumHD.cloupe – Loupe Browser visualization file for patient M2, showing spatial transcriptomics data at 8 μm bin resolution.

M6_VisiumHD.cloupe – Loupe Browser file for a Visium HD slide that includes two tissue sections. The tissue at the bottom of the slide corresponds to patient M6, which is the one analyzed in the downstream dataset (marked under ‘patients’ as ‘M6-high quality’). Data is shown at 8 μm bin resolution.

visiumHD_data_M2_M6.h5ad – A filtered and integrated .h5ad file containing single-cell–resolved spatial gene expression data from both M2 and M6 samples.

-Resolution: Cells. Based on single-cell segmentation (see “Liver Cell Atlas using Visium HD” method).

-Cell filtering: Only cells detected via segmentation and that have passed quality filters were included.

-UMI threshold: Cells with fewer than 200 UMIs were excluded.

-Batch correction: Harmony was applied to correct sample-specific effects prior to UMAP visualization.

-Format: .h5ad (AnnData format, compatible with Scanpy).

-Includes: Single-cell expression matrix, spatial coordinates, Harmony-corrected UMAP, cluster identity, and metadata.

M6:

This folder contains spatial transcriptomics data (8 × 8 μm) for sample M6, generated using the 10x Genomics Visium HD platform.

NOTES:

-The gene expression matrices (*.h5) come from the full slide output of Space Ranger, including both tissue sections (like the M6 Loupe file).

-The spatial metadata files (*.json, *.tif, *.csv) refer to the cropped region, corresponding to the bottom tissue, which is the actual M6 sample used in downstream analysis.

-This is the raw Space Ranger output, prior to cell segmentation or high-level filtering (apart from the default filtered feature matrix).

-This data reflects raw 8 μm resolution bins, not single-cell segmentations.

-For downstream analysis based on cell segmentation, refer to the visiumHD_data_M2_M6.h5ad file in the top-level VisiumHD folder.

Gene Expression Matrices (uncropped – both tissues included):

filtered_feature_bc_matrix_8um.h5
raw_feature_bc_matrix_8um.h5

Spatial Metadata (cropped – M6 tissue only):

scalefactors_json.json
Images: tissue_hires_image.tif / tissue_lowres_image.tif / tissue_fullres_image.tif
tissue_positions.csv – barcode-to-position table covering only the cropped region, i.e., the M6 tissue. Only the barcodes listed in this file belong to M6; use them to extract or analyze this tissue’s expression data from the full matrix.
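Restricting the uncropped matrix to the barcodes listed in tissue_positions.csv could look like the following minimal sketch. It uses only the Python standard library and toy in-memory data. The assumptions are labeled in the comments: the positions table is taken to have a header row with the barcode in its first column, the matrix is represented as plain lists with one column per barcode, and the helper names are hypothetical.

```python
import csv
import io


def read_barcodes(positions_csv_text):
    """Collect barcodes from the first column of a positions table.

    Assumption: the file starts with a header row and the barcode is the
    first field of every data row.
    """
    reader = csv.reader(io.StringIO(positions_csv_text))
    next(reader, None)  # skip the header row
    return {row[0] for row in reader if row}


def subset_columns(barcodes, matrix_rows, keep):
    """Keep only the matrix columns whose barcode is in `keep`.

    `barcodes` labels the columns of `matrix_rows` (one row per gene).
    """
    idx = [i for i, bc in enumerate(barcodes) if bc in keep]
    kept_barcodes = [barcodes[i] for i in idx]
    kept_rows = [[row[i] for i in idx] for row in matrix_rows]
    return kept_barcodes, kept_rows


# Toy stand-ins for the real files; barcode strings are invented.
positions_text = "barcode,in_tissue,array_row,array_col\nbin_A,1,0,0\nbin_C,1,0,2\n"
all_barcodes = ["bin_A", "bin_B", "bin_C", "bin_D"]
full_matrix = [
    [5, 0, 2, 7],  # gene 1 counts across all bins on the slide
    [1, 3, 0, 4],  # gene 2 counts across all bins on the slide
]

m6_barcodes, m6_matrix = subset_columns(
    all_barcodes, full_matrix, read_barcodes(positions_text)
)
```

In practice the .h5 matrices would be loaded with a dedicated reader; the point here is only the barcode-based column selection that the note describes.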

M2:

This folder contains the full, unmodified 10x Genomics Visium HD Space Ranger output for the M2 liver tissue sample (8 × 8 μm resolution).

filtered_feature_bc_matrix_8um.h5
raw_feature_bc_matrix_8um.h5
scalefactors_json.json
Images: tissue_hires_image.tif / tissue_lowres_image.tif
tissue_positions_orig.csv

M1:

This folder contains the full, unmodified 10x Genomics Visium HD Space Ranger output for the M1 liver tissue sample (8 × 8 μm resolution).

filtered_feature_bc_matrix_8um.h5
raw_feature_bc_matrix_8um.h5
scalefactors_json.json
Images: tissue_hires_image.tif / tissue_lowres_image.tif
tissue_positions_orig.csv

snRNAseq:

This folder contains single-nucleus RNA-seq (snRNA-seq) data from four human liver samples (M5, M6, M7, M8). Data was generated using Cell Ranger multi.

single_nuc_RNAseq.cloupe - Output from Cell Ranger multi, containing data from all four samples.

snRNAseq.h5ad - Processed and filtered .h5ad file containing single-nucleus expression data from M5–M8, integrated into one dataset.

-Filtering includes standard QC (e.g., low-gene/UMI exclusion, mitochondrial content, etc.)

-Batch correction: Harmony was applied to correct sample-specific effects prior to UMAP visualization.

-Format: .h5ad (AnnData format, compatible with Scanpy).

-Includes: expression matrix, spatial coordinates, Harmony-corrected UMAP, cluster identity and metadata.

M5, M6, M7, M8

Each sample folder contains raw and filtered matrices generated by Cell Ranger:

-sample_filtered_feature_bc_matrix

-sample_raw_feature_bc_matrix

MERFISH:

For both samples (M5 and M8), each sample folder contains:

-cell_by_gene.csv

-cell_metadata.csv

-detected_transcripts.csv
EPI-Clone supplementary dataset: Single cell RNA-seq of clonally barcoded...
figshare.com
application/gzip
Updated Nov 26, 2024
Lars Velten; Michael Scherer; Alejo Rodriguez-Fraticelli; Indranil Singh (2024). EPI-Clone supplementary dataset: Single cell RNA-seq of clonally barcoded hematopoietic progenitors [Dataset]. http://doi.org/10.6084/m9.figshare.24260743.v1
Available download formats
application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.24260743.v1
Dataset updated
Nov 26, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
Lars Velten; Michael Scherer; Alejo Rodriguez-Fraticelli; Indranil Singh
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the dataset supporting the EPI-Clone manuscript: scRNA-seq profiling of hematopoietic stem and progenitor cells (HSPCs) was performed with 10x Genomics 3' profiling. Three experiments are included: two in which HSCs were clonally labeled with the LARRY system, transplanted into recipient mice and profiled 4-5 months later (post-transplant hematopoiesis), and one in which HSPCs were profiled straight from an unperturbed mouse.

The dataset is a Seurat (v4) object with the following assays, reductions and metadata:

ASSAYS:
AB: Antibody expression data
RNA: RNA expression profiles
integrated: Integration of DNA methylation data performed across experimental batches with two batch correction methods: CCA (https://satijalab.org/seurat/reference/runcca) and harmony (https://portals.broadinstitute.org/harmony/articles/quickstart.html).

DIMENSIONALITY REDUCTION:
pca_cca: PCA performed on the integrated data (CCA integration)
umap_cca: UMAP computed on the integrated data (CCA integration)
umap_harmony: UMAP computed on the integrated data (Harmony integration)

METADATA:
Experiment: The experiment that the cell is from; values are "LARRY main experiment", "LARRY replicate" and "Native hematopoiesis"
ProcessingBatch: Experiments were processed in several batches.
CellType: Cell type annotation
LARRY: Error-corrected LARRY barcode
percent.mt: Percentage of mitochondrial DNA
nCount_RNA: Read count for the RNA modality
nFeature_RNA: Number of RNAs with at least one read
nCount_AB: Read count for the surface protein modality
nFeature_AB: Number of ABs with at least one read
Visium Spatial and snRNA data of Brain section from Parkinson Mouse Model...
zenodo.org
bin, csv, zip
Updated Jun 5, 2025
Jaehyun Lee (2025). Visium Spatial and snRNA data of Brain section from Parkinson Mouse Model based on inducible expression of human a-syn constructs: 20-months + snRNA 23 months dataset [Dataset]. http://doi.org/10.5281/zenodo.14988055
Available download formats
csv, bin, zip
Unique identifier
https://doi.org/10.5281/zenodo.14988055
Dataset updated
Jun 5, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Jaehyun Lee
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Using 23-month-old mice from a Parkinson mouse model based on inducible expression of human a-syn constructs, we produced a single-nucleus RNA dataset from tissue cut from Bregma 0 mm to Bregma -5 mm. The Chromium 3’ Single Cell Library Kit (10x Genomics) was used, and sequencing was performed on a NovaSeq 6000. From the same model, we also profiled 20-month-old mice with the Visium Spatial V1 platform (10x Genomics). Sequencing was again performed on a NovaSeq 6000. Both datasets were sequenced as PE150.

snRNA pipeline: For the alignment of reads, a custom reference was created by adding the sequences of the V1S/SV2 transgene and the Camk2a promoter to the mm10 mouse reference genome. Count matrices generated by cellranger count 7.1 were loaded into an AnnData object and processed using the Python-based framework Scanpy 1.10.2. Integration with R, where needed, was facilitated through the rpy2 package. Raw count matrices were corrected for ambient RNA contamination using SoupX 1.6.2. To remove potential doublets, scDblFinder 1.18.0 was employed with a fixed seed (123). Nuclei with nUMI and nGenes values exceeding three median absolute deviations (MADs) from the median were excluded. Genes detected in fewer than five nuclei across the dataset were excluded. The resulting dataset was normalized via scanpy.pp.normalize_total and scanpy.pp.log1p. Highly variable genes were identified using the function scanpy.pp.highly_variable_genes with the Seurat v3 flavor, selecting the top 4,000 genes. Dimensionality reduction was performed using principal component analysis (PCA), and batch effects were corrected using the Python implementation of Harmony via the function scanpy.external.pp.harmony_integrate. Harmony embeddings were then used to construct a k-nearest neighbor (kNN) graph with scanpy.pp.neighbors. Clustering was performed using Leiden clustering with standard parameters via the function scanpy.tl.leiden. Clusters were annotated using the literature, mousebrain.org, and markers identified via the FindConservedMarkers function in Seurat. First, neurons and non-neuronal cells were distinguished mainly by canonical markers, including but not limited to Rbfox3 (neurons), Mbp (oligodendrocytes), Acsbg1 (astrocytes), Pdgfra (oligodendrocyte precursor cells), Inpp5d (microglia), Colec12 (vascular cells), and Ttr (choroid plexus cells).
Neurons were further classified into Vglut1 (Slc17a7), Vglut2 (Slc17a6), GABA (Gad2), cholinergic (Scube1), and dopaminergic (Th) neurons. Vglut1 and GABA neurons were further annotated into subtypes based on subclustering and FindConservedMarkers markers.

Visium spatial pipeline: Sequences were fiducially aligned to spots using Loupe Browser ver. 8. All aligned sequences were mapped to the mouse genome mm39 using spaceranger count 3.0.1 with a custom reference that included sequences for the promoter and transgene (Camk2aTTA, V1S/SV2). We filtered each sample of the Visium Spatial dataset using MAD-based filtering of the number of reads (nUMI), number of genes (nGene), and percentage of mitochondrial genes (percent.mt). A spot was filtered out if it fell outside 3x the MAD value in at least two metrics. Filtered samples were merged into one Seurat 5.1.0 object, and normalized counts were obtained with Seurat’s SCTransform function. Integration was performed using Harmony 1.2.0 on 50 PCA embeddings, and clustering was done using Leiden clustering based on 30 harmony embeddings. Integrated clusters were visualized using the UMAP method. Samples that were not successfully integrated (based on similarity measures of the harmony embeddings) and showed high percent.mt or low nUMI levels compared to other samples were removed from subsequent analysis. A final integration and clustering were performed after filtering. Regions were first annotated based on a 0.1 resolution clustering to get high-level region annotations (Cortex, Hippocampus, Subcortex). Each high-level region was further annotated based on either more granular resolutions or subclustering. Marker genes from mousebrain.org and the literature were used in combination with the Allen mouse brain atlas to obtain anatomically relevant annotations.
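The spot-filtering rule described above (remove a spot when it lies more than 3 MADs from the median in at least two of the three metrics) can be illustrated with a short standard-library sketch. This is not the authors' code. It assumes the raw, unscaled MAD and a two-sided test, neither of which is stated in the description, and the function names and toy values are hypothetical.

```python
from statistics import median


def mad_outlier_flags(values, n_mads=3.0):
    """Flag values lying more than `n_mads` MADs from the median.

    Assumption: raw (unscaled) median absolute deviation, two-sided.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return [abs(v - med) > n_mads * mad for v in values]


def spots_to_remove(metrics, n_mads=3.0, min_failed=2):
    """Mark a spot for removal when it is an outlier in >= `min_failed` metrics.

    `metrics` maps a metric name to a list of per-spot values; all lists
    must follow the same spot order.
    """
    per_metric = [mad_outlier_flags(vals, n_mads) for vals in metrics.values()]
    return [sum(spot_flags) >= min_failed for spot_flags in zip(*per_metric)]


# Toy per-spot metrics, invented for illustration.
toy_metrics = {
    "nUMI": [100, 110, 90, 105, 95, 1000],
    "nGene": [50, 55, 45, 52, 48, 500],
    "percent_mt": [40, 6, 4, 5.5, 4.5, 5],
}
remove = spots_to_remove(toy_metrics)
```

In the toy data the first spot is an outlier only for percent_mt and is therefore kept, while the last spot is an outlier for both nUMI and nGene and is marked for removal.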
DataSheet_1_Molecular mechanisms regulating natural menopause in the female...
frontiersin.figshare.com
xlsx
Updated Jul 24, 2023
Quan Liu; Fangqin Wei; Jiannan Wang; Haiyan Liu; Hua Zhang; Min Liu; Kaili Liu; Zheng Ye (2023). DataSheet_1_Molecular mechanisms regulating natural menopause in the female ovary: a study based on transcriptomic data.xlsx [Dataset]. http://doi.org/10.3389/fendo.2023.1004245.s001
Available download formats
xlsx
Unique identifier
https://doi.org/10.3389/fendo.2023.1004245.s001
Dataset updated
Jul 24, 2023
Dataset provided by
Frontiers
Authors
Quan Liu; Fangqin Wei; Jiannan Wang; Haiyan Liu; Hua Zhang; Min Liu; Kaili Liu; Zheng Ye
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction
Natural menopause is an inevitable biological process with significant implications for women's health. However, the molecular mechanisms underlying menopause are not well understood. This study aimed to investigate the molecular and cellular changes occurring in the ovary before and after perimenopause.

Methods
Single-cell sequencing data from the GTEx V8 cohort (30-39: 14 individuals; 40-49: 37 individuals; 50-59: 61 individuals) and transcriptome sequencing data from ovarian tissue were analyzed. Seurat was used for single-cell sequencing data analysis, while harmony was employed for data integration. Cell differentiation trajectories were inferred using CytoTrace. CIBERSORTX assessed cell infiltration scores in ovarian tissue. WGCNA evaluated co-expression network characteristics in pre- and post-perimenopausal ovarian tissue. Functional enrichment analysis of co-expression modules was conducted using clusterProfiler and Metascape. DESeq2 performed differential expression analysis. Master regulator analysis and signaling pathway activity analysis were carried out using MsViper and Progeny, respectively. Machine learning models were constructed using Orange3.

Results
We identified the differentiation trajectory of follicular cells in the ovary as ARID5B+ Granulosa -> JUN+ Granulosa -> KRT18+ Granulosa -> MT-CO2+ Granulosa -> GSTA1+ Granulosa -> HMGB1+ Granulosa. Genes driving Granulosa differentiation, including RBP1, TMSB10, SERPINE2, and TMSB4X, were enriched in ATP-dependent activity regulation pathways. Genes involved in maintaining the Granulosa state, such as DCN, ARID5B, EIF1, and HSP90AB1, were enriched in the response to unfolded protein and chaperone-mediated protein complex assembly pathways. Increased contents of terminally differentiated HMGB1+ Granulosa and GSTA1+ Granulosa were observed in the ovaries of individuals aged 50-69. Signaling pathway activity analysis indicated a gradual decrease in TGFb and MAPK pathway activity with menopause progression, while p53 pathway activity increased. Master regulator analysis revealed significant activation of transcription factors FOXR1, OTX2, MYBL2, HNF1A, and FOXN4 in the 30-39 age group, and GLI1, SMAD1, SMAD7, APP, and EGR1 in the 40-49 age group. Additionally, a diagnostic model based on 16 transcription factors (Logistic Regression L2) achieved reliable performance in determining ovarian status before and after perimenopause.

Conclusion
This study provides insights into the molecular and cellular mechanisms underlying natural menopause in the ovary. The findings contribute to our understanding of perimenopausal changes and offer a foundation for health management strategies for women during this transition.
Fig. 2b-c | Quantitative immunofluorescence microscopy for intracellular FUS...
su.figshare.com
bin
Updated Jun 13, 2025
Christoph Schweingruber; Erin Hedges; Marc-David Ruepp (2025). Fig. 2b-c | Quantitative immunofluorescence microscopy for intracellular FUS localization in ALS motor neurons [Dataset]. http://doi.org/10.17045/sthlmuni.27952080.v1
Available download formats
bin
Unique identifier
https://doi.org/10.17045/sthlmuni.27952080.v1
Dataset updated
Jun 13, 2025
Dataset provided by
Stockholm University
Authors
Christoph Schweingruber; Erin Hedges; Marc-David Ruepp
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This item is part of the Figshare Project: Early mitochondrial dysfunction revealed across FUS- and TARDBP-ALS at single cell resolution

From the Data Availability Statement for the paper in Nature Communications entitled: Single-cell RNA sequencing reveals early mitochondrial dysfunction unique to motor neurons shared across FUS- and TARDBP-ALS

"We have deposited all raw and processed RNA sequencing data generated in this study on the NCBI Gene Expression Omnibus (GEO) under the accession number GSE226482. The C9orf72-ALS bulk RNA sequencing data was retrieved directly from the authors of the study. [Items under this Figshare Project contain:] "Scans of fluorescent western blots, raw imaging files from confocal microscopy, the analysis files from Opera Phenix, qPCR data sets, and Seahorse assay result files."

-------------------------

[Item specific description:]

Quantitative fluorescence microscopy for FUS protein in ALS motor neurons

The datasets are Harmony archives for the high-content imager Opera Phenix (PerkinElmer). The proprietary Harmony software creates a database with the file organization stored in xml and oar files, plus folders with raw tif files. More information on our imaging platform: https://www.kclwcic.co.uk/operaphenix

These archives contain thousands of tif files. They are therefore compressed into tar.gz files with matching md5 checksums for verification.

The prism file is a GraphPad Prism file with the summary data. More info here: https://www.graphpad.com/features
Table_1_Integrated single-cell RNA-seq analysis identifies immune...
frontiersin.figshare.com
xlsx
Updated Jun 13, 2023
Xiaoyu Liu; Xu Xu; Zhuozhuo Wu; Qungang Shan; Ziyin Wang; Zhiyuan Wu; Xiaoyi Ding; Wei Huang; Zhongmin Wang (2023). Table_1_Integrated single-cell RNA-seq analysis identifies immune heterogeneity associated with KRAS/TP53 mutation status and tumor-sideness in colorectal cancers.xlsx [Dataset]. http://doi.org/10.3389/fimmu.2022.961350.s007
Available download formats
xlsx
Unique identifier
https://doi.org/10.3389/fimmu.2022.961350.s007
Dataset updated
Jun 13, 2023
Dataset provided by
Frontiers
Authors
Xiaoyu Liu; Xu Xu; Zhuozhuo Wu; Qungang Shan; Ziyin Wang; Zhiyuan Wu; Xiaoyi Ding; Wei Huang; Zhongmin Wang
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background
The main objective of this study was to analyze the effects of KRAS/TP53 mutation status and tumor sidedness on the immune microenvironment of colorectal cancer using integrated scRNA-seq data.

Methods
A total of 78 scRNA-seq datasets, comprising 42 treatment-naive colorectal tumors, 13 tumor-adjacent tissues and 23 normal mucosa tissues, were included. Standardized Seurat procedures were applied to identify cellular components with canonical cell markers. The batch effect was assessed and corrected using the harmony algorithm. The scMetabolism algorithm was used for single-cell metabolic analysis. The results and clinical significance were further validated using immunofluorescent staining and TCGA-COAD datasets. Immune-infiltration scores of bulk RNA-seq data were estimated using ssGSEA. The presto-wilcoxauc algorithm was used to identify differentially enriched genes or pathways across different subgroups. A two-sided p-value less than 0.05 was considered statistically significant.

Results
We refined the landscape of functional immune cell subtypes, especially T cells and myeloid cells, across normal mucosa, tumor-adjacent and tumor tissue. The existence and function of two states of exhausted CD8+ T (Tex) subtypes in colorectal cancer, and FOLR2+ LYVE1+ macrophages indicating unfavorable prognosis in colorectal cancer, were identified and validated. The diverse tumor mutation status reshaped immune cell function and the expression pattern of immune checkpoint ligands/receptors (ICLs/ICRs). Importantly, the KRAS/TP53 dual mutations significantly reduced the major energy metabolic functions in immune cells and shifted cell-to-cell communication towards immunosuppression in colorectal cancers. The results revealed the LAG3, CD24-SIGLEC10 and HBEGF-CD9 pathways as potential therapeutic targets for dual-mutant colorectal cancers.

Conclusions
We revealed that the immune microenvironment underwent a gradual remodeling with an enrichment of immunosuppressive myeloid cells from normal mucosa to tumor regions in colorectal cancers. Moreover, we revealed the metabolic heterogeneity of tumor-infiltrating immune cells and suggested that the KRAS/TP53 dual mutation may impair antitumor immunity by reducing T and myeloid cell energy metabolism and reshaping cellular interactions toward immunosuppression.