Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains all the Seurat objects that were used for generating all the figures in Pal et al. 2021 (https://doi.org/10.15252/embj.2020107333). All the Seurat objects were created under R v3.6.1 using the Seurat package v3.1.1. The detailed information of each object is listed in a table in Chen et al. 2021.

MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Scripts used for analysis of V1 and V2 Datasets.
seurat_v1.R - initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, PCA analysis, clustering, tSNE visualization. Used for v1 datasets.
merge_seurat.R - merge two or more seurat objects into one seurat object. Perform linear regression to remove batch effects from separate objects. Used for v1 datasets.
subcluster_seurat_v1.R - subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA. Used for v1 datasets.
seurat_v2.R - initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, and PCA analysis. Used for v2 datasets.
clustering_markers_v2.R - clustering and tSNE visualization for v2 datasets.
subcluster_seurat_v2.R - subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA analysis. Used for v2 datasets.
seurat_object_analysis_v1_and_v2.R - downstream analysis and plotting functions for seurat object created by seurat_v1.R or seurat_v2.R.
merge_clusters.R - merge clusters that do not meet gene threshold. Used for both v1 and v2 datasets.
prepare_for_monocle_v1.R - subcluster cells of interest and perform linear regression, but not scaling, in order to input normalized, regressed values into monocle with monocle_seurat_input_v1.R.
monocle_seurat_input_v1.R - monocle script using seurat batch corrected values as input for v1 merged timecourse datasets.
monocle_lineage_trace.R - monocle script using nUMI as input for v2 lineage traced dataset.
monocle_object_analysis.R - downstream analysis for monocle object - BEAM and plotting.
CCA_merging_v2.R - script for merging v2 endocrine datasets with canonical correlation analysis and determining the number of CCs to include in downstream analysis.
CCA_alignment_v2.R - script for downstream alignment, clustering, tSNE visualization, and differential gene expression analysis.

The dataset contains an integrated, annotated Seurat v4 object. One can load the dataset into the R environment using the code below:
seurat_obj <- readRDS('PATH/TO/DOWNLOAD/seurat.rds')
The object has three assays: (I) RNA, (II) SCT and (III) integrated.
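Once the object is loaded, the assays can be listed and one of them chosen as the default for downstream calls. The following is a minimal sketch, assuming the SeuratObject package is installed; the helper name `load_and_inspect` and its arguments are hypothetical and not part of the dataset.

```r
# Hypothetical helper (not shipped with the dataset): load the object,
# list its assays, and set which assay downstream functions use by default.
load_and_inspect <- function(path, default_assay = "RNA") {
  stopifnot(file.exists(path))
  seurat_obj <- readRDS(path)
  # Per the description above, this should list "RNA", "SCT" and "integrated".
  print(SeuratObject::Assays(seurat_obj))
  SeuratObject::DefaultAssay(seurat_obj) <- default_assay
  seurat_obj
}

# Usage (the path is a placeholder, as in the line above):
# seurat_obj <- load_and_inspect('PATH/TO/DOWNLOAD/seurat.rds', "integrated")
```

Which assay to make the default depends on the task; the choice of "RNA" here is only an illustrative default.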

This dataset contains R Seurat objects used to reproduce the single-cell RNA-seq analyses for the manuscript Single-cell consequences of X-linked meiotic drive in stalk-eyed flies. Testis tissue from eight male Teleopsis dalmanni (drive and standard genotypes) was dissociated and sequenced using the 10X Genomics Chromium platform. Sequencing reads were processed with Cell Ranger v7.2.0, and downstream filtering, doublet removal, integration, and clustering were performed in Seurat v5.1.0. The final dataset (seurat_final.RData) comprises 12,217 high-quality cells expressing 12,454 genes, with cell types identified using orthologous markers from Drosophila melanogaster. Provided files include the filtered integrated Seurat object and a final processed object with reclustered and annotated cell types. These resources enable full reproducibility of the analyses and support future exploration of testis cell populations in stalk-eyed flies.
# Seurat objects for the manuscript Single-cell consequences of X-linked meiotic drive in stalk-eyed flies
Dataset DOI: 10.5061/dryad.q573n5twb
Sequencing data from Price et al. (2025; 10.5061/dryad.zkh1893kb) were processed using Cell Ranger v7.2.0. First, a custom reference was built from the T. dalmanni reference genome using mkref. Using Cell Ranger's count function, fastq reads were then aligned against the custom index and counted, creating gene-by-cell count matrices. Data filtering and downstream analyses were performed using Seurat v5.1.0 in R v4.3.2. Cells in each sample were removed from the analysis if they expressed fewer than 200 features or had more than 20% mitochondrial expression. Count data for each sample were also filtered by keeping only genes with expression (counts > 1) in at least three cells. We used DoubletFinder v2.0.4 in R with default parameters to identify and remove doublets. O...
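The filtering thresholds described above can be restated as a small base-R sketch on a toy matrix. This is not the authors' script: the function name `filter_counts`, the `mito_genes` argument, the order of the two filters (cells first, then genes), and the reading that a cell is dropped when it fails either threshold are all assumptions made for illustration.

```r
# Toy restatement of the QC rules in base R (assumptions noted above).
# `counts` is a genes x cells matrix; `mito_genes` is an assumed vector of
# mitochondrial gene names, since the text does not say how they were defined.
filter_counts <- function(counts, mito_genes,
                          min_features = 200, max_pct_mito = 20,
                          min_cells = 3) {
  total <- colSums(counts)
  n_features <- colSums(counts > 0)
  is_mito <- rownames(counts) %in% mito_genes
  mito <- colSums(counts[is_mito, , drop = FALSE])
  # Guard against division by zero for empty cells.
  pct_mito <- ifelse(total > 0, 100 * mito / total, 0)
  keep_cells <- n_features >= min_features & pct_mito <= max_pct_mito
  counts <- counts[, keep_cells, drop = FALSE]
  # Keep genes with counts > 1 in at least `min_cells` of the retained cells.
  keep_genes <- rowSums(counts > 1) >= min_cells
  counts[keep_genes, , drop = FALSE]
}

# Small example with a lowered feature threshold so the toy data can pass.
toy <- matrix(
  c(2, 2, 2, 0, 0,
    2, 2, 2, 1, 0,
    2, 2, 0, 0, 0,
    1, 0, 0, 0, 9,
    3, 3, 3, 0, 1),
  nrow = 5,
  dimnames = list(c("g1", "g2", "g3", "g4", "mt1"), paste0("c", 1:5))
)
filtered <- filter_counts(toy, mito_genes = "mt1", min_features = 3)
print(dim(filtered))
```

In a real analysis the same thresholds would normally be applied through Seurat's own subsetting on the object metadata rather than on a bare matrix.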

CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This project is a collection of files to allow users to reproduce the model development and benchmarking in "Dawnn: single-cell differential abundance with neural networks" (Hall and Castellano, under review). Dawnn is a tool for detecting differential abundance in single-cell RNAseq datasets. It is available as an R package here. Please contact us if you are unable to reproduce any of the analysis in our paper. The files in this collection correspond to the benchmarking dataset based on simulated linear trajectories.
FILES: Data processing code
adapted_traj_sim_milo_paper.R: Lightly adapted code from Dann et al. to simulate single-cell RNAseq datasets that form linear trajectories.
generate_test_data_linear_traj_sim_milo_paper.R: R code to assign simulated labels to datasets generated from adapted_traj_sim_milo_paper.R. Seurat objects saved as cells_sim_linear_traj_gex_seed_*.rds. Simulated labels saved as benchmark_dataset_sim_linear_traj.csv.
Resulting datasets
cells_sim_linear_traj_gex_seed_*.rds: Seurat objects generated by generate_test_data_linear_traj_sim_milo_paper.R.
benchmark_dataset_sim_linear_traj.csv: Cell labels generated by generate_test_data_linear_traj_sim_milo_paper.R.

Code for RSTUDIO with Seurat package integration and analysis of scRNA-Seq data for 20 GBM from Neftel et al., 2019

These are the raw data for HuPSA and MoPSA scRNAseq datasets. Both RDS files can be loaded into R and processed through the Seurat package. https://doi.org/10.1038/s41698-024-00667-x

CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Code for RSTUDIO Seurat package analysis of 2 recurrent GBM from Yuan, Sims et al., 2018

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the context of the Human Cell Atlas, we have created a single-cell taxonomy of cell types and states in human tonsils. This repository contains the Seurat objects derived from this effort. In particular, we have datasets for each modality (scRNA-seq, scATAC-seq, CITE-seq, spatial transcriptomics), as well as cell type-specific datasets. Most importantly, this is the input that we used to create the HCATonsilData package, which allows programmatic access to all these datasets within R.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the R code used to systematically benchmark ten Seurat-implemented differential gene-expression (DGE) methods for scRNA-seq data and to provide guidance on selecting appropriate DGE methods.

This archive contains scRNAseq and CyTOF data in the form of Seurat objects, txt and csv files, as well as R scripts for data analysis and figure generation.
A summary of the content is provided in the following.
R scripts
Script to run machine learning models predicting group specific marker genes: CML_Find_Markers_Zenodo.R
Script to reproduce the majority of Main and Supplementary Figures shown in the manuscript: CML_Paper_Figures_Zenodo.R
Script to run inferCNV analysis: inferCNV_Zenodo.R
Script to plot NATMI analysis results: NATMI_CvsA_FC0.32_Updown_Column_plot_Zenodo.R
Script to conduct sub-clustering and filtering of NK cells: NK_Marker_Detection_Zenodo.R
Helper scripts for plotting and DEG calculation: ComputePairWiseDE_v2.R, Seurat_DE_Heatmap_RCA_Style.R
RDS files
General scRNA-seq Seurat objects:
scRNA-seq seurat object after QC, and cell type annotation used for most analysis in the manuscript: DUKE_DataSet_Doublets_Removed_Relabeled.RDS
scRNA-seq including findings e.g. from NK analysis used in the shiny app: DUKE_final_for_Shiny_App.rds
Neighborhood enrichment score computed for group A across all HSPCs: Enrichment_score_global_groupA.RDS
UMAP coordinates used in the article: Layout_2D_nNeighbours_25_Metric_cosine_TCU_removed.RDS
SCENIC files:
Regulon set used in SCENIC: 2.6_regulons_asGeneSet.Rds
AUC values computed for regulons: 3.4_regulonAUC.Rds
MetaData used in SCENIC: cellInfo.Rds
Group specific regulons for LSC: groupSpecificRegulonsBCRAblP.RDS
Patient specific regulons for LSC: patientSpecificRegulonsBCRAblP.RDS
Patient specificity score for LSC: PatientSpecificRegulonSpecificityScoreBCRAblP.RDS
Regulon specificity score for LSC: RegulonSpecificityScoreBCRAblP.RDS
BCR-ABL1 inference:
HSC with inferred BCR-ABL1 label: HSCs_CML_with_BCR-Abl_label.RDS
UMAP for HSC with inferred BCR-ABL1 label: HSCs_CML_with_BCR-Abl_label_UMAP.RDS
HSPCs with BCR-ABL1 module scores: HSPC_metacluster_74K_with_modscore_27thmay.RDS
NK sub-clustering and filtering:
NK object with module scores: NK_8617cells_with_modscore_1stjune.RDS
Feature genes for NK cells computed with DubStepR: NK_Cells_DubStepR
NK cells Seurat object excluding contaminating T and B cells: NK_cells_T_B_17_removed.RDS
NK Seurat object including neighbourhood enrichment score calculations: NK_seurat_object_with_enrichment_labels_V2.RDS
txt and csv files:
Proportions per cluster calculated from CyTOF: CyTOF_Proportions.txt
Correlation between scRNAseq and CyTOF cell type abundance: scRNAseq_Cor_Cytof.txt
Correlation between manual gating and FlowSOM clustering: Manual_vs_FlowSOM.txt
GSEA results:
HSPC, HSC and LSC results: FINAL_GSEA_DATA_For_GGPLOT.txt
NK: NK_For_Plotting.txt
TFRC and HLA expression: TFRC_and_HLA_Values.txt
NATMI result files:
UP-regulated_mean.csv
DOWN-regulated_mean.csv
Gene position file used in inferCNV: inferCNV_gene_positions_hg38.txt
Module scores for NK subclusters per cell: NK_Supplementary_Module_Scores.csv
Compressed folders:
All CyTOF raw data files: CyTOF_Data_raw.zip
Results of the patient-based classifier: PatientwiseClassifier.zip
Results of the single-cell based classifier: SingleCellClassifierResults.zip
For new analyses, we recommend that readers use the Seurat object stored in DUKE_final_for_Shiny_App.rds, or use the shiny app (http://scdbm.ddnetbio.com/) and perform further analysis from there.
RAW data is available at EGA upon request using Study ID: EGAS00001005509
Revision
The for_CML_manuscript_revision.tar.gz folder contains scripts and data for the paper revision including 1) Detection of the BCR-ABL fusion with long read sequencing; 2) Identification of BCR-ABL junction reads with scRNAseq; 3) Detection of expressed mutations using scRNAseq.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the Seurat object in .rds format with the raw matrix information (after filtering), cell type annotation information and the UMAP coordinates. Users can load this .rds file with the R readRDS function. If you are using this dataset, please cite our paper: Qian, Peipei, Jiahui Kang, Dong Liu, and Gangcai Xie. "Single cell transcriptome sequencing of Zebrafish testis revealed novel spermatogenesis marker genes and stronger Leydig-germ cell paracrine interactions." Frontiers in genetics 13 (2022): 851719.
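As a rough sketch of how the three components named in this description might be pulled out after loading, assuming the SeuratObject package is installed: the helper name `extract_parts` is hypothetical, and the reduction name "umap" is an assumption that should be checked against the object itself.

```r
# Hypothetical helper: return the raw counts, per-cell annotations and UMAP
# coordinates from a saved Seurat object.
# Assumption: the UMAP reduction is stored under the name "umap"; inspect
# names(obj@reductions) to confirm.
extract_parts <- function(path, reduction = "umap") {
  obj <- readRDS(path)
  list(
    # Newer SeuratObject releases prefer `layer = "counts"` over `slot`.
    counts   = SeuratObject::GetAssayData(obj, slot = "counts"),
    metadata = obj@meta.data,
    umap     = SeuratObject::Embeddings(obj, reduction = reduction)
  )
}

# Usage (the file name is a placeholder):
# parts <- extract_parts("path/to/object.rds")
# head(parts$metadata)
```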

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data accompanying the seuFLViz R package for interactive exploratory data analysis of single cell datasets as seurat objects.
Data collected by Dominic Shayler and described in:

CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
RSTUDIO and Seurat package analysis of 4 primary GBM

CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Dan R Laks Code of Seurat analysis integration of 8PDX scRNA-Seq datasets_Xie-Laks-Parada et al., 2021

Dataset created in the study "A Spatial Transcriptomics Atlas of the Malaria-infected Liver Indicates a Crucial Role for Lipid Metabolism and Hotspots of Inflammatory Cell Infiltration"
Structure
ST_berghei_liver
contains data generated during stpipeline analysis and imaging on the 2k arrays Spatial Transcriptomics platform, as well as data necessary for and from hepaquery analysis. These samples include 38 sections in total, of which 8 are from mice (n=4) infected with sporozoites for 12h, 5 sections from control mice (n=3) at 12h, 7 sections from mice (n=4) infected with sporozoites for 24h and 4 sections from control mice (n=3) at 24h, as well as 8 samples of mice (n=2) infected with sporozoites for 38h and control mice (n=2) at 38h.
count contains gene expression matrix output from stpipeline in .tsv format
spotfiles contains coordinate files for count matrices
images contains scaled H&E, Fluorescence (FL) and annotated H&E images (from FL annotations) scaled to 10% of the original image size.
masks contains image masks for hepaquery analysis
distances contains distance measurements from original section sorted by timepoint as well as combined across timepoints
cluster contains clustering information across spatial positions used in spatial enrichment analysis
STUtiility_mus_pb_ST.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in ST_berghei_liver
visium_berghei_liver
contains data generated with the spaceranger pipeline and imaging using the Visium spatial transcriptomics platform. These samples include 8 sections in total, of which 1 was infected with sporozoites for 12h, 1 control section at 12h, 1 section infected with sporozoites for 24h and 1 control section at 24h, as well as 2 sporozoite-infected sections and 2 control sections at 38h.
V10S29-135_A1 contains spaceranger output for section 1 for infected and control sections at 38h post-infection
V10S29-135_B1 contains spaceranger output for section 1 for infected and control sections at 12h post-infection
V10S29-135_C1 contains spaceranger output for section 1 for infected and control sections at 24h post-infection
V10S29-135_D1 contains spaceranger output for section 2 for infected and control sections at 38h post-infection
se_visium.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in visium_berghei_liver
snSeq_berghei_liver
contains data generated with the cellranger pipeline and imaging using the Visium spatial transcriptomics platform. These samples include single nuclei of 2 infected and control mice after 12h, 2 infected and control mice after 24h, 2 infected and control mice after 38h, and 2 uninfected mice prior to a challenge.
cellranger_cnt_out contains feature count matrix information from cell ranger output
final_merged_curated_annotations_270623.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in snSeq_berghei_liver.tar.gz
raw images.zip contains raw images for supplementary figures 20-22
adjusted images.zip contains brightness and contrast adjusted images for supplementary figures 20-22

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We profile the transcriptomes of ~30,000 mouse single cells to deconvolve the hepatic mesenchyme in healthy and fibrotic liver at high resolution. We reveal spatial zonation of hepatic stellate cells across the liver lobule, designated portal vein-associated HSC and central vein-associated HSC, and uncover an equivalent functional zonation in a mouse model of centrilobular fibrosis. Our work illustrates the power of single-cell transcriptomics to resolve key collagen-producing cells driving liver fibrosis with high precision. We provide the contents of these data as Seurat R objects.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This page includes the data and code necessary to reproduce the results of the following paper: Yang Liao, Dinesh Raghu, Bhupinder Pal, Lisa Mielke and Wei Shi. cellCounts: fast and accurate quantification of 10x Chromium single-cell RNA sequencing data. Under review. A Linux computer running an operating system of CentOS 7 (or later) or Ubuntu 20.04 (or later) is recommended for running this analysis. The computer should have >2 TB of disk space and >64 GB of RAM. The following software packages need to be installed before running the analysis. Software executables generated after installation should be included in the $PATH environment variable.
R (v4.0.0 or newer) https://www.r-project.org/
Rsubread (v2.12.2 or newer) http://bioconductor.org/packages/3.16/bioc/html/Rsubread.html
CellRanger (v6.0.1) https://support.10xgenomics.com/single-cell-gene-expression/software/overview/welcome
STARsolo (v2.7.10a) https://github.com/alexdobin/STAR
sra-tools (v2.10.0 or newer) https://github.com/ncbi/sra-tools
Seurat (v3.0.0 or newer) https://satijalab.org/seurat/
edgeR (v3.30.0 or newer) https://bioconductor.org/packages/edgeR/
limma (v3.44.0 or newer) https://bioconductor.org/packages/limma/
mltools (v0.3.5 or newer) https://cran.r-project.org/web/packages/mltools/index.html
Reference packages generated by 10x Genomics are also required for this analysis and they can be downloaded from the following link (2020-A version for individual human and mouse reference packages should be selected): https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest After all these are done, you can simply run the shell script ‘test-all-new.bash’ to perform all the analyses carried out in the paper. This script will automatically download the mixture scRNA-seq data from the SRA database, and it will output a text file called ‘test-all.log’ that contains all the screen outputs and speed/accuracy results of CellRanger, STARsolo and cellCounts.
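Before launching the shell script, it may help to confirm that the required executables are visible on $PATH. The sketch below uses base R only; the executable names (`cellranger`, `STAR`, `fastq-dump`) are assumptions about a typical installation of the tools listed above, and the helper name `check_path` is hypothetical.

```r
# Hypothetical pre-flight check: report which executables R can find on PATH.
# Executable names are assumed; adjust them to match the local installation.
required_tools <- c("R", "cellranger", "STAR", "fastq-dump")

check_path <- function(tools) {
  found <- Sys.which(tools)  # returns "" for tools that are not found
  data.frame(
    tool = tools,
    path = unname(found),
    on_path = unname(nzchar(found)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

print(check_path(required_tools))
```

Any row with `on_path` equal to FALSE points to a tool whose install directory still needs to be added to $PATH.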

https://spdx.org/licenses/CC0-1.0.html
Single cell RNA sequencing (drop-seq) data of forebrain organoids carrying pathogenic MAPT R406W and V337M mutations. Organoids were generated from 5 heterozygous donor lines (two R406W lines and three V337M lines) and respective CRISPR-corrected isogenic controls. Organoids were also generated from one homozygous R406W donor line. Single-cell sequencing was performed at 1, 2, 3, 4, 6 and 8 months of organoid maturation.
Methods
Single-cell transcriptomes were obtained using drop-seq (Macosko et al., 2015, https://doi.org/10.1016/j.cell.2015.05.002). Counts matrices were generated using the Drop-seq tools package (Macosko et al. 2015), with full details available online (https://github.com/broadinstitute/Drop-seq/files/2425535/Drop-seqAlignmentCookbookv1.2Jan2016.pdf). Briefly, raw reads were converted to BAM files, cell barcodes and UMIs were extracted, and low-quality reads were removed. Adapter sequences and polyA tails were trimmed, and reads were converted to Fastq for STAR alignment (STAR version 2.6). Mapping to human genome (hg19 build) was performed with default settings. Reads mapped to exons were kept and tagged with gene names, bead synthesis errors were corrected, and a digital gene expression matrix was extracted from the aligned library. We extracted data from twice as many cell barcodes as the number of cells targeted (NUM_CORE_BARCODES = 2x # targeted cells).
Downstream analysis was performed using Seurat 3.0 in R version 3.6.3. An individual Seurat object was generated for each sample, and filtered and clustered individually. Cells with < 300 genes detected were filtered out, as were cells with > 10% mitochondrial gene content. Counts data were log-normalized using the default NormalizeData function and the default scale of 1e4. Then, the top 2000 variable genes were identified using the Seurat FindVariableFeatures function (selection.method = “vst”, nfeatures = 2000), followed by scaling and centering using the default ScaleData function. Principal Components Analysis was carried out on the scaled expression values of the 2000 top variable genes, and the cells were clustered using the first 50 principal components (PCs) as input in the FindNeighbors function, and a resolution of 0.4 in the FindClusters function. Non-linear dimensionality reduction was performed by running UMAP on the first 50 PCs.
Following clustering and dimensionality reduction, putative cell doublets were identified using DoubletFinder (McGinnis et al. 2019; https://doi.org/10.1016/j.cels.2019.03.003), assuming a doublet formation rate of 5%. For each sample, the optimal pK value was identified based on the results of paramSweep_vs, summarizeSweep and find.pK functions of the DoubletFinder package. Instead of using the default paramSweep_vs function, we extended the upper range of computed pK values to 1.2. We visually verified that cells identified as doublets had high nFeatures (number of genes expressed) by plotting the pANN metric against nFeatures. For samples not showing this correlation, we adjusted the pK value to the next highest peak in the pK/BCmetric plot. Finally, the individual Seurat objects were merged.
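The per-sample Seurat steps described above could be strung together roughly as follows. This is a sketch under stated assumptions, not the authors' code: the function names `process_sample` and `log_normalize` are hypothetical, the mitochondrial pattern "^MT-" is an assumption based on human gene symbols, and the DoubletFinder and merging steps are left out. The small `log_normalize` helper restates in base R what log-normalization with a scale factor of 1e4 computes.

```r
# Hypothetical per-sample workflow mirroring the described parameters.
# Requires the Seurat package when called; `counts` is a genes x cells matrix.
process_sample <- function(counts, project = "sample") {
  obj <- Seurat::CreateSeuratObject(counts = counts, project = project)
  # Assumption: mitochondrial genes are those with the "MT-" prefix.
  obj[["percent.mt"]] <- Seurat::PercentageFeatureSet(obj, pattern = "^MT-")
  # Drop cells with < 300 detected genes or > 10% mitochondrial content.
  obj <- subset(obj, subset = nFeature_RNA >= 300 & percent.mt <= 10)
  obj <- Seurat::NormalizeData(obj, normalization.method = "LogNormalize",
                               scale.factor = 1e4)
  obj <- Seurat::FindVariableFeatures(obj, selection.method = "vst",
                                      nfeatures = 2000)
  obj <- Seurat::ScaleData(obj)
  obj <- Seurat::RunPCA(obj, npcs = 50)
  obj <- Seurat::FindNeighbors(obj, dims = 1:50)
  obj <- Seurat::FindClusters(obj, resolution = 0.4)
  obj <- Seurat::RunUMAP(obj, dims = 1:50)
  obj
}

# Base-R restatement of log-normalization: each count is divided by the
# cell's total, multiplied by the scale factor, then log1p-transformed.
log_normalize <- function(counts, scale_factor = 1e4) {
  log1p(sweep(counts, 2, colSums(counts), "/") * scale_factor)
}

toy_counts <- matrix(c(1, 3, 0, 5), nrow = 2)
toy_norm <- log_normalize(toy_counts)
print(toy_norm)
```

Using 50 PCs requires samples with well over 50 cells, so `process_sample` is only meaningful on real-sized inputs.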

https://spdx.org/licenses/CC0-1.0.html
The human adult intestinal system is a complex organ that is approximately 9 meters long and performs a variety of complex functions including digestion, nutrient absorption, and immune surveillance. We performed snRNA-seq on 8 regions of the human intestine (duodenum, proximal-jejunum, mid-jejunum, ileum, ascending colon, transverse colon, descending colon, and sigmoid colon) from 9 donors (B001, B004, B005, B006, B008, B009, B010, B011, and B012). In the corresponding paper, we find cell compositions differ dramatically across regions of the intestine and demonstrate the complexity of epithelial subtypes. We map gene regulatory differences in these cells suggestive of a regulatory differentiation cascade, and associate intestinal disease heritability with specific cell types. These results describe the complexity of the cell composition, regulation, and organization in the human intestine, and serve as an important reference map for understanding human biology and disease.
Methods
For a detailed description of each of the steps to obtain this data see the detailed materials and methods in the associated manuscript. Briefly, intestine pieces from 8 different sites across the small intestine and colon were flash frozen. Nuclei were isolated from each sample and the resulting nuclei were processed with either 10x scRNA-seq using Chromium Next GEM Single Cell 3’ Reagent Kits v3.1 (10x Genomics, 1000121) or Chromium Next GEM Chip G Single Cell Kits (10x Genomics, 1000120) or 10x multiome sequencing using Chromium Next GEM Single Cell Multiome ATAC + Gene Expression Kits (10x Genomics, 1000283). Initial processing of snRNA-seq data was done with the Cell Ranger Pipeline (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger) by first running cellranger mkfastq to demultiplex the bcl files and then running cellranger count. Since nuclear RNA was sequenced, data were aligned to a pre-mRNA reference. Initial processing of the multiome data, including alignment and generation of fragments files and expression matrices, was performed with the Cell Ranger ARC Pipeline. The raw expression matrices from these pipelines are included here. Downstream processing was performed in R, using the Seurat package.